Articles

Vol. 5 No. 1 (2026): Beyond Algorithms: The New Era of AI

Skin Cancer Detection on Smartphone Images through Knowledge Distillation of Multimodal Deep Learning Models

Submitted
December 31, 2024
Published
May 31, 2026

Abstract

Skin cancer remains a critical global health challenge, particularly in regions with limited access to specialized diagnostic tools. This study presents an approach to skin lesion classification from non-dermoscopic smartphone images that uses knowledge distillation to improve model efficiency and accuracy. We use the PAD-UFES-20 dataset, which comprises 2,298 smartphone-captured images of six skin lesion types, accompanied by patient metadata. Our method is a teacher–student framework in which a ConvNeXt-based teacher model, augmented with Convolutional Block Attention Modules (CBAM) and a metadata encoder, transfers its learned representations to a more compact EfficientNet-B0 student model. The distillation process combines logit matching, feature similarity, and attention transfer, enabling the student model to reach near parity with the teacher at significantly lower computational cost. In our experiments, the student model attains an accuracy of 80.43% and a weighted F1-score of 80.16%, closely mirroring the teacher's performance. The integration of metadata and attention mechanisms also substantially improves classification robustness, particularly for underrepresented lesion categories. Class imbalance is addressed with focal loss, which improves the model's ability to detect clinically significant but less frequent skin lesions. This approach offers a viable path to deploying accurate skin cancer diagnostic tools on resource-constrained mobile devices, thereby expanding access to essential healthcare services in underserved communities.
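The loss terms named in the abstract (focal loss, logit matching, feature similarity, and attention transfer) can be illustrated with a minimal, dependency-free sketch for a single sample. This is not the authors' implementation: the function names, the temperature, the focusing parameter `gamma`, and the weights in `student_loss` are all hypothetical choices made for illustration, and the feature and attention inputs are assumed to be flattened vectors of equal length for teacher and student.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def logit_matching_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def feature_similarity_loss(student_feat, teacher_feat):
    """1 - cosine similarity between two feature vectors."""
    dot = sum(s * t for s, t in zip(student_feat, teacher_feat))
    norm_s = math.sqrt(sum(s * s for s in student_feat))
    norm_t = math.sqrt(sum(t * t for t in teacher_feat))
    return 1.0 - dot / (norm_s * norm_t)


def attention_transfer_loss(student_map, teacher_map):
    """Squared L2 distance between L2-normalised, flattened attention maps."""
    def normalise(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    s, t = normalise(student_map), normalise(teacher_map)
    return sum((a - b) ** 2 for a, b in zip(s, t))


def student_loss(student_logits, teacher_logits, target,
                 student_feat, teacher_feat, student_map, teacher_map,
                 w_kd=0.5, w_feat=0.1, w_att=0.1):
    """Weighted sum of the four terms; the weights are illustrative assumptions."""
    return (focal_loss(student_logits, target)
            + w_kd * logit_matching_loss(student_logits, teacher_logits)
            + w_feat * feature_similarity_loss(student_feat, teacher_feat)
            + w_att * attention_transfer_loss(student_map, teacher_map))
```

In this sketch the focal term supervises the student against the ground-truth label and down-weights easy examples, while the three distillation terms pull the student's outputs, features, and attention maps toward the teacher's. In practice the teacher and student features usually differ in dimension, so a learned projection would be needed before the similarity term; that step is omitted here.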