Google's Magika: Revolutionizing File-Type Identification for Enhanced Cybersecurity

Explore how Google's Magika is transforming file-type identification, revolutionizing cybersecurity for enhanced user protection.

 

Google has released Magika, an AI-powered file-type identification system that detects both binary and textual file formats. The tool is built around a custom deep-learning model, and Google presents it as a significant improvement in file identification that helps keep its users safe.

Google already uses Magika internally to route files in Gmail, Drive, and Safe Browsing to the appropriate security and content policy scanners. The tool runs on a CPU and identifies a file in a matter of milliseconds.
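The routing step can be pictured as a lookup from a detected file-type label to a scanner. The following is only a minimal sketch of that idea: the scanner functions, the label strings, and the `route` helper are hypothetical placeholders, not Google's internal pipeline or part of Magika.

```python
from typing import Callable, Dict

# Hypothetical scanners: each just returns its own name so the routing
# decision is visible. Real scanners would inspect the file contents.
def scan_document(data: bytes) -> str:
    return "document-scanner"

def scan_executable(data: bytes) -> str:
    return "executable-scanner"

def scan_script(data: bytes) -> str:
    return "script-scanner"

def scan_generic(data: bytes) -> str:
    return "generic-scanner"

# Assumed mapping from a file-type label to the scanner that handles it.
ROUTES: Dict[str, Callable[[bytes], str]] = {
    "pdf": scan_document,
    "docx": scan_document,
    "elf": scan_executable,
    "javascript": scan_script,
    "python": scan_script,
}

def route(label: str, data: bytes) -> str:
    """Send the file to the scanner registered for its label.

    Labels without a dedicated scanner fall back to the generic one,
    which is why a precise label matters: a misidentified file never
    reaches the specialized scanner.
    """
    scanner = ROUTES.get(label, scan_generic)
    return scanner(data)
```

In this sketch, `route("pdf", data)` reaches the document scanner, while an unrecognized label ends up at the generic fallback.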

Under the hood, Magika uses a custom, highly optimized deep-learning model, developed and trained with Keras, that weighs only about 1 MB. For inference, Magika uses ONNX (Open Neural Network Exchange), which keeps identification almost as fast as non-AI tools, even on a CPU. Magika was evaluated on a benchmark of one million files covering more than a hundred file types.

On that benchmark, the model and its training dataset outperformed other file-identification tools by approximately 20%. The gain is most visible on textual files, such as code and configuration files. According to Google, the higher accuracy lets it scan 11% more files with its specialized AI scanners for malicious documents and cuts the share of unidentified files to 3%.

Compared with Google's previous system, which relied on handcrafted rules, Magika improved file-type identification accuracy by 50%. Users who want to try it can run the Magika command line tool to identify the types of their own files.
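To see why handcrafted rules struggle with textual files in particular, consider a minimal sketch of rule-based identification using magic numbers (fixed byte prefixes). The signature table and the `identify_by_magic` helper below are illustrative assumptions, not Google's previous system and not Magika's method.

```python
# Illustrative signature table: (byte prefix, label).
# A real rule set is far larger; this is only a small sample.
MAGIC_SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"PK\x03\x04", "zip"),
    (b"\x7fELF", "elf"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]

def identify_by_magic(data: bytes) -> str:
    """Return the label of the first matching signature.

    If nothing matches, the only thing a prefix rule can still say is
    whether the content decodes as UTF-8 text or not.
    """
    for prefix, label in MAGIC_SIGNATURES:
        if data.startswith(prefix):
            return label
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown-binary"
    return "generic-text"
```

With these rules, a PDF header is recognized immediately, but a Python script and an INI-style configuration file both come back as `generic-text`, because neither starts with a fixed signature. Telling such files apart requires looking at the content itself, which is the gap a learned model is meant to close.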

Magika can also be tried through its web demo, or installed as a Python library and standalone command line tool with the standard command 'pip install magika'. The code and model are freely available on GitHub under the Apache 2.0 license.

Google also plans to integrate Magika with VirusTotal, where it will complement the platform's existing Code Insight feature, which uses generative AI to analyze and identify malicious code. Magika will pre-filter files before Code Insight analyzes them, which is intended to improve the platform's accuracy and efficiency.

The VirusTotal integration also reflects Google's commitment to contributing to the wider cybersecurity ecosystem. As Magika evolves and is integrated into existing security frameworks, it shows how much work continues to go into safeguarding user data and digital interactions.