SkinViT-EfficientX: a hybrid vision transformer model with token pruning and explainable AI for multiclass skin cancer diagnosis

Rahman Shakil, Mostafizur, Rahman, Mahfuzur, Jahan Meem, Erin, Imranul Hoque Bhuiyan, Md, Akter, Sanjida, Bin Mohiuddin, Arafath, Rahman, Shafiur and Kabir, Istiak (2026) SkinViT-EfficientX: a hybrid vision transformer model with token pruning and explainable AI for multiclass skin cancer diagnosis. In: 2025 IEEE International Conference on Biomedical Engineering, Computer and Information Technology for Health (BECITHCON), 29-30 November 2025, Dhaka, Bangladesh.

Abstract

Skin cancer is a common and serious health issue, making early diagnosis crucial for better outcomes. Traditional manual dermoscopy can be slow and inconsistent, demonstrating a need for automated diagnostic tools. This study introduces SkinViT-EfficientX, a hybrid deep learning model designed for classifying skin lesions. It combines an EfficientNetV2-S encoder with a lightweight Vision Transformer, connected by a residual cross-attention mechanism, to extract both local and global features. A confidence-guided token pruning strategy is employed to enhance performance, and Grad-CAM provides class-specific visual explanations. The model was evaluated on two benchmark datasets, HAM10000 and a combined ISIC 2019 + DermNet dataset, after thorough preprocessing and augmentation. SkinViT-EfficientX achieved a 97.36% F1-score, 95.64% MCC, and 97.93% specificity on HAM10000, and a 98.42% F1-score, 96.51% MCC, and 98.86% specificity on the combined dataset. It outperformed leading models such as MaxViT, Swin V2-T, DeiT III-S, and MobileViT V2-S on all metrics. Confusion matrix and learning curve analyses validated the model's robustness and stability on rare lesion classes. The model is also integrated into a web application that supports dermoscopic image uploads, class predictions, and heatmap visualizations. SkinViT-EfficientX provides an efficient, accurate, and interpretable AI-driven solution for skin cancer screening.
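For readers unfamiliar with the three reported metrics, the following minimal sketch shows how F1-score, MCC, and specificity can be derived from a multiclass confusion matrix. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not say how per-class scores were averaged, so macro averaging is assumed here, rows are assumed to be true labels and columns predictions, and the function names and the toy matrix are hypothetical.

```python
from math import sqrt


def per_class_counts(cm, k):
    """Return (tp, fp, fn, tn) for class k, treated one-vs-rest.

    Assumes cm[i][j] counts samples with true class i predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                   # true class k, predicted otherwise
    fp = sum(row[k] for row in cm) - tp    # predicted k, true class differs
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def macro_f1(cm):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for k in range(len(cm)):
        tp, fp, fn, _ = per_class_counts(cm, k)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def macro_specificity(cm):
    """Unweighted mean of per-class specificity, TN / (TN + FP)."""
    scores = []
    for k in range(len(cm)):
        _, fp, _, tn = per_class_counts(cm, k)
        scores.append(tn / (tn + fp) if (tn + fp) else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(cm):
    """Multiclass Matthews correlation coefficient from a confusion matrix."""
    n = len(cm)
    s = sum(sum(row) for row in cm)                      # total samples
    c = sum(cm[k][k] for k in range(n))                  # correct predictions
    t = [sum(cm[k]) for k in range(n)]                   # true count per class
    p = [sum(row[k] for row in cm) for k in range(n)]    # predicted count per class
    num = c * s - sum(pk * tk for pk, tk in zip(p, t))
    den = sqrt((s * s - sum(pk * pk for pk in p)) *
               (s * s - sum(tk * tk for tk in t)))
    return num / den if den else 0.0


if __name__ == "__main__":
    # Hypothetical 3-class confusion matrix, for illustration only.
    toy_cm = [[50, 2, 1],
              [3, 40, 2],
              [0, 1, 20]]
    print(f"macro F1:          {macro_f1(toy_cm):.4f}")
    print(f"MCC:               {multiclass_mcc(toy_cm):.4f}")
    print(f"macro specificity: {macro_specificity(toy_cm):.4f}")
```

MCC is computed from the whole matrix at once, which makes it less sensitive to class imbalance than accuracy. This is relevant for datasets with rare lesion classes.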
