Enhanced phishing payload detection using fine-tuned DistilBERT and XAI-based NLP models

Datto, Sourav, Tuhin, Delower Hossen, Ahmed, Mustakim, Redwan, Kazi, Al Sohan, Md. Faruk Abdullah, Shufian, Abu, Tamang, Birbal, Lama, Rasmila and Shrestha, Ruja (2026) Enhanced phishing payload detection using fine-tuned DistilBERT and XAI-based NLP models. In: TENCON 2025 - 2025 IEEE Region 10 Conference (TENCON), 27-30 October 2025, Kota Kinabalu, Malaysia.

Abstract

Phishing attacks are a major cybersecurity concern. They continue to grow in complexity and often bypass traditional detection systems by imitating legitimate communication payloads. Many existing models, especially classical machine learning techniques, lack the ability to detect hidden or adversarial phishing payloads and offer limited transparency in their predictions. This research presents a phishing payload detection approach using a fine-tuned DistilBERT model. The methodology includes dataset preprocessing, model fine-tuning, adversarial training, explainability analysis, and performance evaluation. DistilBERT, a lightweight transformer model, is fine-tuned to detect phishing payloads with improved accuracy and robustness. Adversarial training is applied to defend against input manipulation. Explainable AI (XAI) techniques such as LIME and SHAP are used to interpret the model's predictions. The fine-tuned DistilBERT model achieves a classification accuracy of 98.52% and an AUC score of 0.9993, outperforming traditional machine learning models, while maintaining low false positives and high recall. This research improves the reliability of phishing detection and provides interpretable outputs for security analysis. The results demonstrate that the proposed framework strengthens phishing detection strategies and increases resilience to adversarial attacks. However, the results are based on a single publicly available phishing email dataset, so the scope of the findings is limited to email-based phishing detection, and further validation across diverse datasets and real-world environments is required.
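
The evaluation metrics named in the abstract (accuracy, recall, false positive rate, and AUC) can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the convention that label 1 means "phishing" are all assumptions made for illustration, and any inputs passed to it here would be toy data, not the paper's results.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as phishing (assumed convention)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for a sketch but slow on large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def evaluate(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> Dict[str, float]:
    """Threshold the scores (0.5 is an assumed default) and compute the metrics."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "auc": roc_auc(y_true, scores),
    }
```

Accuracy, recall, and false positive rate depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of figures are reported separately.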
