A Comparative Analysis and Evaluation of Machine Learning Algorithms for Malware Detection

Date

2024-05

Publisher

Ashesi University

Abstract

The increasing complexity of malware threatens the security of individuals, businesses, and institutions that operate in cyberspace. New malware variants are regularly created with obfuscation techniques that let them steal confidential information and harm users' computers while evading detection. Malware detection and analysis are therefore critical components of cybersecurity. This study documents a comparative analysis and evaluation of current machine learning algorithms for malware detection and analysis, carried out to determine the most efficient model. Efficiency was measured in terms of accuracy, precision, recall, specificity, F1 score, index of balanced accuracy, and Matthews correlation coefficient. The findings indicate that the Random Forest classifier is the most efficient, as it outperformed the other algorithms studied. The study also identified factors that enhance the performance of machine learning models, concluding that feature selection using Recursive Feature Elimination (RFE) and handling class imbalance in the dataset using the Synthetic Minority Oversampling Technique (SMOTE) improve model performance.
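As an illustration of the evaluation metrics named in the abstract, the following is a minimal sketch of how they can be derived from the four counts of a binary confusion matrix. It is not taken from the thesis: the function name `binary_metrics`, the example counts, and the weighting `alpha = 0.1` for the index of balanced accuracy are assumptions made for illustration, and the IBA is written in its commonly used form, (1 + alpha * (recall - specificity)) * recall * specificity.

```python
import math


def binary_metrics(tp, fp, tn, fn, alpha=0.1):
    """Compute seven evaluation metrics from binary confusion-matrix counts.

    tp/fp/tn/fn are counts with malware treated as the positive class.
    alpha is the IBA weighting factor (0.1 is an assumed default).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Index of balanced accuracy: squared geometric mean of recall and
    # specificity, weighted by the dominance (recall - specificity).
    iba = (1 + alpha * (recall - specificity)) * recall * specificity
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "iba": iba,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the function.
metrics = binary_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting specificity, IBA, and MCC alongside accuracy matters when benign and malicious samples are imbalanced, since accuracy alone can look high for a model that rarely flags the minority class.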

Description

Undergraduate thesis submitted to the Department of Computer Science, Ashesi University, in partial fulfillment of the requirements for the Bachelor of Science degree in Computer Science, May 2024

Keywords

malware detection, malware, dynamic analysis, static analysis, machine learning, cybersecurity, synthetic minority oversampling technique, recursive feature elimination

Citation

Amanfu, B. E. (2024). A Comparative Analysis and Evaluation of Machine Learning Algorithms for Malware Detection. Ashesi University.
