Abstract
Transformer-based models such as the vision transformer (ViT) have recently been applied to a wide range of vision tasks in medical imaging. Although ViT performs exceptionally well, it requires a large dataset for optimal results, and its self-attention computation across image patches has quadratic complexity. Gathering adequate image samples in medical research is challenging, and computing the large number of parameters within the transformer is complex and time-intensive. To address these issues, ViT must become more data-efficient so that it can train on smaller datasets, and its computational complexity should scale linearly with the number of image patches. In response, this paper proposes a novel linear complexity compact convolutional transformer (LC-CCT), which uses a convolutional tokenizer to train effectively on limited datasets and an external attention mechanism to achieve linear computational scaling. Tokens are extracted from images via overlapping convolution to capture local continuity and detailed image features. The LC-CCT framework was validated on three retinal optical coherence tomography (OCT) image datasets, namely Kermany, OCTID, and OCTDL, reaching F1-scores of 96.32%, 99.04%, and 98.11%, with error rates of 3.46%, 0.87%, and 1.89%, respectively. These outcomes suggest that LC-CCT holds significant promise for computer vision tasks in medical imaging research, especially in scenarios where data constraints and rapid processing time are critical.
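The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, tensor sizes, and random weights are hypothetical, and the double normalization (softmax over tokens, then L1 normalization over memory slots) follows the commonly described external-attention formulation, which the abstract does not confirm for LC-CCT. The point is only that attention against a fixed-size memory of S slots costs O(N·S·d) for N tokens, rather than the O(N²·d) of self-attention.

```python
import numpy as np

def conv_tokenize(image, kernel, stride):
    """Overlapping convolutional tokenizer (illustrative).

    image:  (H, W) grayscale array
    kernel: (k, k, d) filter bank; choosing stride < k makes windows overlap
    returns (N, d) token matrix, one token per window position
    """
    k = kernel.shape[0]
    H, W = image.shape
    tokens = [
        # Contract the k x k window with each of the d filters -> (d,)
        np.tensordot(image[r:r + k, c:c + k], kernel, axes=([0, 1], [0, 1]))
        for r in range(0, H - k + 1, stride)
        for c in range(0, W - k + 1, stride)
    ]
    return np.stack(tokens)

def external_attention(tokens, mem_k, mem_v, eps=1e-9):
    """External attention against two small learnable memories (illustrative).

    tokens: (N, d); mem_k, mem_v: (S, d) with S fixed and independent of N
    returns (N, d)
    """
    attn = tokens @ mem_k.T                                  # (N, S): linear in N
    attn = np.exp(attn - attn.max(axis=0, keepdims=True))    # stable softmax ...
    attn = attn / attn.sum(axis=0, keepdims=True)            # ... over the token axis
    attn = attn / (attn.sum(axis=1, keepdims=True) + eps)    # L1-normalize over slots
    return attn @ mem_v                                      # (N, d)

# Hypothetical sizes: a 32x32 input, 3x3 filters with stride 2, 16 channels, 8 slots.
rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
kernel = rng.standard_normal((3, 3, 16)) * 0.1
mem_k = rng.standard_normal((8, 16)) * 0.1
mem_v = rng.standard_normal((8, 16)) * 0.1

tokens = conv_tokenize(image, kernel, stride=2)
out = external_attention(tokens, mem_k, mem_v)
```

In this sketch the intermediate attention map has shape (N, S), so doubling the number of tokens doubles the work, whereas a self-attention map of shape (N, N) would quadruple it.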
| Original language | English |
|---|---|
| Pages (from-to) | 204372-204384 |
| Number of pages | 13 |
| Journal | IEEE Access |
| Volume | 13 |
| Publication status | Published - 2025 |
Keywords
- Compact transformer
- convolutional tokenizer
- data efficiency
- external attention
- linear complexity
- retinal OCT image
- sequence pooling
Paper title: LC-CCT: A Linear Complexity Compact Convolutional Transformer for Retinal Disease Detection in Optical Coherence Tomography Images