Ever thought that you could detect and classify malware by visualizing it? Well, now you can. Researchers at Microsoft and Intel have recently announced a deep-learning technique that detects and classifies malware by analyzing images.
The project is known as STAMINA: Static Malware-as-Image Network Analysis. The new technique is image-based: it converts a file into a grayscale image and then scans that image for the structural and textural patterns typical of malware.
The process works by taking the binary form of the input file and converting it into a stream of raw pixel data, which is then turned into a picture. A trained neural network then examines the picture to decide whether the file is infected.
ZDNet reported that STAMINA's AI is based on Windows Defender installers collected by Microsoft. It added that, because large malware files can easily translate into huge images, the technique does not depend on elaborate pixel-by-pixel reactions to viruses.
A Few Limitations of STAMINA
So far, STAMINA has been able to detect malware with a success rate of 99.07 percent and a false positive rate below 2.6 percent.
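To make these two figures concrete, here is a minimal sketch of how such metrics are usually computed from a classifier's confusion counts. It assumes that "success rate" means classification accuracy; the counts are hypothetical and chosen only for illustration, and they are not STAMINA's data.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all files (malware and clean) that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of clean files that were wrongly flagged as malware."""
    return fp / (fp + tn)


# Hypothetical counts for illustration only (not taken from the research).
tp, tn, fp, fn = 480, 490, 10, 20

acc = accuracy(tp, tn, fp, fn)      # (480 + 490) / 1000
fpr = false_positive_rate(fp, tn)   # 10 / (10 + 490)
print(f"accuracy = {acc:.2%}, false positive rate = {fpr:.2%}")
```

The two numbers measure different things: a detector can be highly accurate overall and still flag a noticeable share of clean files, which is why both are reported.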
The technique works very well on smaller files, but its effectiveness decreases with larger ones. Large files translate into a much higher number of pixels, which requires a level of compression beyond the range in which STAMINA performs consistently.
Put simply, STAMINA's results become less reliable as file size grows.
The Process of Converting Malware into an Image
According to the researchers at Intel, the entire process consists of a few simple steps:
- First, take the input file and convert its binary form into raw pixel data.
- To do this, the file's bytes are turned into a one-dimensional pixel stream: each byte, with a value from 0 to 255, is assigned a pixel intensity.
- The one-dimensional pixel data is then reshaped into a 2D image. The file size determines the width and height of each image.
- The image is then analyzed by STAMINA's image algorithm and deep neural network.
- The scan determines whether the image is clean or infected by a malware strain.
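The conversion steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: the function names are hypothetical, the roughly square width rule is an assumption (the article only says that file size defines the dimensions), and padding the last row with zeros is likewise a choice made for this sketch.

```python
import math


def bytes_to_pixel_stream(data: bytes) -> list[int]:
    """Map each byte to one grayscale intensity in the range 0..255."""
    return list(data)


def choose_width(num_pixels: int) -> int:
    """Assumed rule: make the image roughly square (ceil of the square root).

    The actual mapping from file size to image dimensions is not given here.
    """
    if num_pixels <= 0:
        return 1
    return math.isqrt(num_pixels - 1) + 1


def pixels_to_image(pixels: list[int], width: int) -> list[list[int]]:
    """Reshape the 1D pixel stream into rows, zero-padding the last row."""
    rows = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        row += [0] * (width - len(row))  # assumption: pad with black pixels
        rows.append(row)
    return rows


def file_bytes_to_image(data: bytes) -> list[list[int]]:
    """Bytes -> pixel stream -> 2D grayscale image (as a list of rows)."""
    pixels = bytes_to_pixel_stream(data)
    return pixels_to_image(pixels, choose_width(len(pixels)))


# Ten bytes stand in for a real file's contents.
sample = bytes(range(10))
image = file_bytes_to_image(sample)
for row in image:
    print(row)
```

For a real file, the bytes could be read with `open(path, "rb").read()`, and the resulting rows would then be handed to an image classifier, which is the part of the pipeline this sketch does not cover.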
Microsoft used a sample of 2.2 million infected Portable Executable (PE) file hashes as the basis of the research. Intel and Microsoft trained their DNN on 60% of the known malware samples, used another 20% to check and validate the DNN, and kept the remaining 20% for actual testing.
Microsoft’s recent efforts and investment in machine learning techniques might shape the future of malware detection. Based on the success of STAMINA, security researchers anticipate that deep learning will reduce the chances of digital threats getting through and will help keep your devices secure in the future.
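A 60/20/20 split like the one described can be sketched as follows. The helper name, the fixed shuffle seed, and the placeholder sample names are assumptions made for illustration; how the researchers actually partitioned their data is not described here.

```python
import random


def split_60_20_20(samples, seed=0):
    """Shuffle the samples, then split them 60% / 20% / 20%.

    Returns (training, validation, testing) lists.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n = len(shuffled)
    train_end = n * 60 // 100
    val_end = n * 80 // 100
    return shuffled[:train_end], shuffled[train_end:val_end], shuffled[val_end:]


# Placeholder identifiers standing in for file hashes (hypothetical data).
hashes = [f"sample-{i:03d}" for i in range(100)]
train_set, val_set, test_set = split_60_20_20(hashes)
print(len(train_set), len(val_set), len(test_set))
```

Keeping the test portion separate until the end is what makes the reported figures meaningful: the network is scored on files it never saw during training or validation.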