Please use this identifier to cite or link to this item:
http://bura.brunel.ac.uk/handle/2438/31337
Title: | Advancing medical image segmentation & generalization by capturing global context & mitigating negative knowledge transfer across multi-source data |
Authors: | Nchongmaje, Ndipenoch |
Advisors: | Miron, A; Li, Y |
Keywords: | Deep Learning;Convolutional Neural Network (CNN);Vision transformer (ViT);Multimodal data;Domain Adaptation |
Issue Date: | 2024 |
Publisher: | Brunel University London |
Abstract: | Deep learning methods have shown significant success in detecting and segmenting diseases or pathogens in medical images. However, most of these models are trained and tested on data from the same source, resulting in poor generalizability when applied to unseen data, as often encountered in real-world scenarios. This challenge is primarily due to the domain shift problem, which occurs when the data distributions of the source (training) domain and the target (testing) domain differ. The shift arises because medical images are collected from diverse sources, modalities, and vendor machines, with varying scanning protocols and expertise levels among radiologists and annotators. Furthermore, deep learning models typically require large, annotated datasets for training. Because annotating medical images is labor-intensive and time-consuming, the size of available datasets is often limited. While numerous small, annotated datasets exist across various medical domains, directly combining them can introduce another issue known as Negative Knowledge Transfer (NKT), where knowledge from one domain negatively impacts performance in another, particularly in multi-domain training. This research addresses these challenges in two ways: it integrates Atrous Spatial Pyramid Pooling (ASPP) and Squeeze-and-Excitation (SE) blocks into specifically designed architectures to capture global contextual information, and it uses knowledge transfer with domain adapters to mitigate negative knowledge transfer when training on diverse, multi-source data. These enhancements improve the model’s segmentation and generalization performance.
Three key contributions are presented:
1) Enhancing Retinal Disease Detection, Segmentation, and Generalization with an ASPP Block and Residual Connections Across Diverse Data Sources: We propose a novel algorithm, nnUNet RASPP, an enhanced variant of nnU-Net that incorporates an Atrous Spatial Pyramid Pooling (ASPP) block immediately after the input layer to capture global contextual information, as well as residual connections to mitigate the vanishing gradient problem, thereby improving the model’s generalizability across data from diverse sources (collected using devices from three different manufacturers). We also evaluated the performance of the top teams in the RETOUCH challenge, highlighting the different architectures they employed. In experiments on the RETOUCH Grand Challenge dataset, evaluation results on the hidden test set show that nnUNet RASPP outperformed the baseline nnU-Net and state-of-the-art models by a clear margin. nnUNet RASPP is also the current winner of both the online and offline phases of the competition, and it demonstrated strong generalization on unseen datasets.
2) Dynamic Network for Global Context-Aware Disease Segmentation in Retinal Images Using Multiple ASPP and SE Blocks: We further explore the potential of using multiple ASPP blocks at various locations, along with Squeeze-and-Excitation (SE) blocks, within a dynamic convolutional neural network (CNN) architecture that automatically adjusts the kernel size and depth of the network based on input size. We propose a novel algorithm, Deep ResUNet++, a dynamic CNN model that incorporates multiple ASPP and SE blocks to capture global contextual information for disease segmentation in 2D B-scans. The use of multiple ASPP and SE blocks offers a more detailed and effective method for feature extraction, context aggregation, and feature recalibration. Deep ResUNet++ was evaluated on two public datasets, the AROI and Duke DME datasets, outperforming state-of-the-art algorithms by a clear margin.
3) Enhancing Medical Image Segmentation Through Knowledge Transfer with Domain-Specific Adapters Across Diverse Data Sources: To further enhance model generalizability, we leverage the synergistic potential of multiple datasets to create a single model trained on diverse data from various sources, covering multiple modalities, organs, and disease types, collected with different device vendors and protocols. To mitigate negative knowledge transfer, we incorporate domain knowledge adapters into the network architecture. We propose two novel algorithms: (i) MMIS-Net (MultiModal Medical Image Segmentation Network), which addresses label inconsistencies through a one-hot label space and employs a similarity fusion block for multi-source medical image segmentation; and (ii) CVD Net (Convolutional Neural Network and Vision Transformer with Domain-Specific Batch Normalization), which integrates Vision Transformers and CNNs with domain-specific batch normalization to improve generalization. Both algorithms were evaluated on two dataset groups: the first comprises 10 benchmark datasets from the Medical Segmentation Decathlon (MSD) and the RETOUCH challenge benchmark, and the second is the HECKTOR challenge benchmark dataset. Experimental results on the hidden test sets show that both algorithms outperformed state-of-the-art algorithms and large foundation models for medical image segmentation by a clear margin, demonstrating superior generalization on new, unseen data.
In summary, this research introduces techniques to enhance model segmentation performance and generalizability: Atrous Spatial Pyramid Pooling (ASPP) and Squeeze-and-Excitation (SE) blocks capture global contextual information in specifically designed models, and domain-adaptive adapters mitigate negative knowledge transfer on diverse, multi-source data. These methods not only improve model generalization on new, unseen data but also set new benchmarks in medical image segmentation, providing robust and generalizable solutions for real-world clinical applications. |
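The abstract names domain-specific batch normalization as one way to limit negative knowledge transfer. The idea can be illustrated with a minimal, dependency-free sketch. This is not the thesis's CVD Net implementation: the class name `DomainSpecificBatchNorm`, the domain keys `"oct"` and `"ct"`, and the toy intensity values are all hypothetical. The learnable per-domain scale and shift that a real layer would usually carry are omitted for brevity. The sketch only shows the core mechanism, which is that each domain keeps its own normalization statistics while the rest of a network could stay shared.

```python
import math
from collections import defaultdict


class DomainSpecificBatchNorm:
    """Illustrative sketch: one set of running statistics per domain key."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.num_features = num_features
        self.momentum = momentum
        self.eps = eps
        # Each domain lazily gets its own running mean (0) and variance (1).
        self.running_mean = defaultdict(lambda: [0.0] * num_features)
        self.running_var = defaultdict(lambda: [1.0] * num_features)

    def forward(self, batch, domain, training=True):
        """Normalize a batch (list of feature vectors) using `domain` statistics."""
        cols = range(self.num_features)
        if training:
            n = len(batch)
            # Per-feature batch mean and (biased) variance.
            mean = [sum(row[j] for row in batch) / n for j in cols]
            var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n for j in cols]
            # Update only the statistics that belong to this domain.
            m = self.momentum
            self.running_mean[domain] = [
                (1 - m) * r + m * b for r, b in zip(self.running_mean[domain], mean)
            ]
            self.running_var[domain] = [
                (1 - m) * r + m * b for r, b in zip(self.running_var[domain], var)
            ]
        else:
            # At inference time, reuse the stored statistics of this domain.
            mean = self.running_mean[domain]
            var = self.running_var[domain]
        return [
            [(row[j] - mean[j]) / math.sqrt(var[j] + self.eps) for j in cols]
            for row in batch
        ]


# Hypothetical toy batches with very different intensity ranges.
bn = DomainSpecificBatchNorm(num_features=1)
oct_out = bn.forward([[100.0], [120.0], [140.0]], domain="oct")
ct_out = bn.forward([[-500.0], [-400.0], [-300.0]], domain="ct")
print(oct_out)
print(ct_out)
```

Because the two toy domains never share statistics, a batch from one source cannot skew the normalization applied to the other, which is the intuition behind using such layers in multi-source training.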
Description: | This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London |
URI: | http://bura.brunel.ac.uk/handle/2438/31337 |
Appears in Collections: | Computer Science; Dept of Computer Science Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
FulltextThesis.pdf | | 27.14 MB | Adobe PDF | View/Open |
Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.