Classification of malware by using structural entropy on convolutional neural networks

Published in IAAI-AAAI, 2018

Malicious programs have grown in both number and sophistication. Analyzing the malicious intent of such vast amounts of data requires huge resources, so effective categorization of malware is required.

In this paper, the content of a malicious program is represented as an entropy stream, where each value describes the amount of entropy of a small chunk of code in a specific location of the file. Wavelet transforms are then applied to this entropy signal to describe the variation in the entropic energy.
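The representation described above can be illustrated with a minimal sketch. It is not the paper's implementation: the 256-byte chunk size, the choice of the Haar wavelet, and the function names `shannon_entropy`, `entropy_stream`, and `haar_detail_energies` are all assumptions made for illustration. The sketch computes the Shannon entropy of consecutive chunks of a file and then the energy of the wavelet detail coefficients at each resolution level.

```python
import math
from collections import Counter
from typing import List


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte chunk, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    # Sum p * log2(1/p) over the byte values that occur in the chunk.
    return sum((n / total) * math.log2(total / n) for n in Counter(chunk).values())


def entropy_stream(data: bytes, chunk_size: int = 256) -> List[float]:
    """Entropy of each consecutive, non-overlapping chunk of `data`.

    chunk_size=256 is an illustrative assumption, not a value from the paper.
    """
    return [
        shannon_entropy(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def haar_detail_energies(signal: List[float]) -> List[float]:
    """Energy of the Haar detail coefficients per level, finest level first.

    Assumes len(signal) is a power of two; pad or truncate beforehand.
    The Haar wavelet is an assumption; the abstract does not name a family.
    """
    energies = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies


if __name__ == "__main__":
    # Synthetic input: a constant region followed by a maximally varied one.
    sample = bytes(256) + bytes(range(256))
    stream = entropy_stream(sample)
    print(stream)
    print(haar_detail_energies(stream))
```

A constant region yields an entropy near 0 bits per byte, while compressed or encrypted regions approach 8, so sharp transitions in the stream show up as large detail energies at the finer levels.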

[Figure: comparison of entropy streams for the Ramnit and Gatak malware families]

Motivated by the visual similarity between streams of entropy of malicious software belonging to the same family, we propose a deep learning approach for categorization of malware. Our method exploits the fact that most variants are generated by using common obfuscation techniques and that compression and encryption algorithms retain some properties present in the original code. This allows us to find discriminative patterns that almost all variants within a family share.

Recommended citation: Daniel Gibert, Carles Mateu, Jordi Planes. (2018). "Classification of malware by using structural entropy on convolutional neural networks." IAAI-AAAI 2018.