iacs CAI

Vol. 3 No. 2 (2025)

ARTICLE

A Hybrid Malware Detection Framework Utilizing Natural Language Processing and Surface Analysis Features

Abstract

The escalating frequency of malware attacks necessitates robust detection models, most of which rely on machine learning applied to features derived from surface analysis. Prior research in surface analysis has established image-based methods that use ensemble learning, but natural language processing (NLP) methods that effectively integrate multiple features remain scarce. Existing NLP-based detection schemes typically use a single feature, because merging hybrid features into one data point disrupts the word sequence and thereby degrades detection accuracy. To address this gap, this paper introduces a hybrid model that uses three distinct features obtained through surface analysis to identify malware. The study validates that NLP techniques can be applied effectively to hybrid features, overcoming the earlier limitation on sequential data. Empirical results demonstrate the superior performance of this combined approach, which achieves an F-measure of 0.927.
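The sequence-disruption problem described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: the abstract does not name its three features, so the feature names (`imports`, `sections`, `strings`), the sample tokens, and all function names below are hypothetical. The sketch only shows that merging token streams into one sequence creates bigrams that span feature boundaries, whereas counting bigrams per feature does not.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Bigram = Tuple[str, str]


def bigrams(tokens: List[str]) -> List[Bigram]:
    """Return adjacent token pairs in order."""
    return list(zip(tokens, tokens[1:]))


def merged_bigrams(sample: Dict[str, List[str]]) -> Counter:
    """Concatenate all feature streams into one sequence, then count bigrams."""
    merged = [tok for name in sorted(sample) for tok in sample[name]]
    return Counter(bigrams(merged))


def per_feature_bigrams(sample: Dict[str, List[str]]) -> Counter:
    """Count bigrams inside each feature stream; keys carry the feature name."""
    counts: Counter = Counter()
    for name in sorted(sample):
        for bg in bigrams(sample[name]):
            counts[(name, bg)] += 1
    return counts


def cross_feature_bigrams(sample: Dict[str, List[str]]) -> Set[Bigram]:
    """Bigrams that exist only because two streams were joined end to end."""
    within = {bg for tokens in sample.values() for bg in bigrams(tokens)}
    return set(merged_bigrams(sample)) - within


# Hypothetical surface-analysis output for one file (assumed, not from the paper).
sample = {
    "imports": ["CreateFileA", "WriteFile", "CloseHandle"],
    "sections": [".text", ".rdata", ".data"],
    "strings": ["cmd.exe", "/c", "del"],
}

if __name__ == "__main__":
    print(sorted(cross_feature_bigrams(sample)))
```

In this toy sample the merged sequence yields eight bigrams, two of which pair the last token of one feature with the first token of the next. The per-feature count yields six, all of them inside a single feature. Keeping the streams separate and combining them only after vectorization is one plausible way to preserve word order within each feature. Whether the paper does this is not stated in the abstract.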