A multi-resolution self-supervised learning framework for semantic segmentation in histopathology
Abstract
Modern whole slide imaging techniques, together with supervised deep learning approaches, have been advancing the field of histopathology, enabling accurate analysis of tissues. These approaches use whole slide images (WSIs) at various resolutions, utilising low-resolution WSIs to identify regions of interest in the tissue and high-resolution WSIs for detailed analysis of cellular structures. Due to the labour-intensive process of annotating gigapixel WSIs, accurate analysis of WSIs remains challenging for supervised approaches. Self-supervised learning (SSL) has emerged as an approach to build efficient and robust models using unlabelled data. It has been successfully used to pre-train models to learn meaningful image features, which are then fine-tuned on downstream tasks for improved performance compared to training models from scratch. Yet, existing SSL methods optimised for WSIs are unable to leverage multiple resolutions and instead work only at an individual resolution, neglecting the hierarchical structure of multi-resolution inputs. This limitation prevents the effective utilisation of complementary information between different resolutions, hampering discriminative WSI representation learning. In this paper, we propose a Multi-resolution SSL Framework for WSI semantic segmentation (MSF-WSI) that effectively learns histopathological features. Our MSF-WSI learns complementary information from multiple WSI resolutions during the pre-training stage; this contrasts with existing works that only learn between resolutions at the fine-tuning stage. Our pre-training initialises the model with a comprehensive understanding of multi-resolution features, which can lead to improved performance in subsequent tasks. To achieve this, we introduced a novel Context-Target Fusion Module (CTFM) and a masked jigsaw pretext task to facilitate the learning of multi-resolution features. Additionally, we designed a Dense SimSiam Learning (DSL) strategy to maximise the similarity of image features from early model layers, enabling discriminative learned representations. We evaluated our method on three public datasets covering breast and liver cancer segmentation tasks. Our experimental results demonstrated that MSF-WSI surpassed the accuracy of other state-of-the-art methods in both downstream fine-tuning and semi-supervised settings.
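As background for the Dense SimSiam Learning strategy mentioned in the abstract, the sketch below illustrates the standard SimSiam objective (negative cosine similarity with a stop-gradient on the target branch) applied densely across spatial feature-map locations. This is a minimal, hypothetical PyTorch illustration; the tensor shapes, function names, and the way the two views are paired are assumptions, not the paper's actual DSL or CTFM implementation.

```python
# Minimal sketch of a SimSiam-style dense similarity loss (PyTorch).
# NOTE: generic illustration only; the paper's Dense SimSiam Learning (DSL)
# and Context-Target Fusion Module (CTFM) are not reproduced here.
import torch
import torch.nn.functional as F


def dense_negative_cosine(p, z):
    """Negative cosine similarity between dense feature maps.

    p: predictor output,  shape (B, C, H, W)
    z: target projection, shape (B, C, H, W); gradient is stopped, as in SimSiam.
    """
    z = z.detach()                              # stop-gradient on the target branch
    p = F.normalize(p, dim=1)                   # L2-normalise along the channel axis
    z = F.normalize(z, dim=1)
    return -(p * z).sum(dim=1).mean()           # average over batch and spatial positions


def dense_simsiam_loss(p1, z1, p2, z2):
    """Symmetrised loss over two augmented views (e.g. two crops of a WSI patch)."""
    return 0.5 * dense_negative_cosine(p1, z2) + 0.5 * dense_negative_cosine(p2, z1)


if __name__ == "__main__":
    # Toy shapes: batch of 2, 128-channel feature maps at 16x16 spatial resolution.
    p1, z1 = torch.randn(2, 128, 16, 16), torch.randn(2, 128, 16, 16)
    p2, z2 = torch.randn(2, 128, 16, 16), torch.randn(2, 128, 16, 16)
    print(dense_simsiam_loss(p1, z1, p2, z2))   # scalar loss value
```

The key design point inherited from SimSiam is the asymmetry between the two branches: only the predictor branch receives gradients, while the target branch is detached, which prevents the trivial collapsed solution without requiring negative pairs.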
Journal: Pattern Recognition
Publication Name: N/A
Volume: N/A
ISBN/ISSN: 1873-5142
Edition: N/A
Issue: N/A
Pages Count: 35
Location: N/A
Publisher: Elsevier
Publisher Url: N/A
Publisher Location: N/A
Publish Date: N/A
Url: N/A
Date: N/A
EISSN: N/A
DOI: 10.1016/j.patcog.2024.110621