El Mouhtadi, Walid, El Bakkali, Mohamed, Maleh, Yassine, Mounir, Soufyane and Ouazzane, Karim (2024) Refining malware analysis with enhanced machine learning algorithms using hyperparameter tuning. International Journal of Critical Computer-Based Systems, 11 (1/2). pp. 48-67. ISSN 1757-8787
Machine learning classifiers have well-known limitations that researchers must address to optimize performance. Overfitting, a prevalent issue, arises when an excessively complex model fits noise in the training data and therefore generalizes poorly to new data. Underfitting is the opposite concern: an overly simple model fails to capture the structure of the data. This study investigates the application of machine learning to malware classification, specifically targeting PE (Portable Executable) files. It addresses these limitations using ensemble methods and pre-processing techniques, including feature selection and hyperparameter tuning, with the primary objective of improving classifier performance. A comparative study classifies PE files as malicious or benign using random forests, decision trees, and gradient boosting, and finds that random forests perform best, achieving 99% accuracy. Assessing the strengths and limitations of each algorithm provides insight into handling diverse malware categories effectively. The paper underscores the importance of ensemble methods, feature engineering, and pre-processing in enhancing classifier performance.
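The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_classification` as a stand-in for features extracted from PE files, and the selector (`SelectKBest` with `f_classif`), the value `k=15`, and the parameter grids are all hypothetical choices made only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data standing in for PE-file features (assumption:
# the real study uses features extracted from PE files, not shown here).
X, y = make_classification(
    n_samples=600, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical hyperparameter grids; the "clf__" prefix targets the
# classifier step of the pipeline built below.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"clf__max_depth": [3, 6, None]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"clf__n_estimators": [50, 100], "clf__max_depth": [6, None]},
    ),
    "gradient_boosting": (
        GradientBoostingClassifier(random_state=0),
        {"clf__n_estimators": [50, 100], "clf__learning_rate": [0.05, 0.1]},
    ),
}

results = {}
for name, (model, grid) in candidates.items():
    # Feature selection sits inside the pipeline so it is refit on each
    # cross-validation fold and never sees the held-out fold.
    pipe = Pipeline([
        ("select", SelectKBest(f_classif, k=15)),
        ("clf", model),
    ])
    search = GridSearchCV(pipe, grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "best_params": search.best_params_,
        "cv_accuracy": search.best_score_,
        "test_accuracy": search.score(X_test, y_test),
    }

for name, r in results.items():
    print(f"{name}: cv={r['cv_accuracy']:.3f} test={r['test_accuracy']:.3f}")
```

A large gap between the cross-validated and held-out accuracies would suggest overfitting, while low scores on both would suggest underfitting. Scores on this synthetic data say nothing about the accuracy reported in the paper.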