Hugging Face, the popular online repository for generative AI models, hosts thousands of models containing hidden code that can steal information and compromise data security, security researchers say. Security startups Protect AI, HiddenLayer, and Wiz have warned for months that malicious models are being uploaded to Hugging Face's site, which now offers more than a million models for download. Protect AI CEO Ian Swanson said his company found over 3,000 malicious models on the platform earlier this year.
Some attackers have gone as far as setting up fake profiles on Hugging Face, posing as reputable companies like Meta or Facebook, to trick users into downloading their models. One fake model claiming to be from the genomics testing startup 23andMe had been downloaded thousands of times before its true nature was discovered; its hidden code silently hunted for AWS passwords, putting users at risk of having their cloud resources stolen. Hugging Face has since deleted such malicious models and integrated Protect AI's scanning tool into its platform to alert users to potential risks before they download.
The company has also verified the profiles of major companies like OpenAI and Nvidia on its platform to bolster security and trust. It began scanning files used to train machine learning models for unsafe code in November 2021, and hopes that its partnership with Protect AI and others will make sharing and adopting machine learning artifacts safer. The collaboration follows a joint warning issued in April by the United States' Cybersecurity and Infrastructure Security Agency and security agencies in Canada and Britain, urging businesses to scan pre-trained models for dangerous code.
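The article does not describe how these scanners work internally, but a common approach is to inspect a model file's pickle byte stream for imports of modules that can execute commands, without ever loading it. The sketch below is a hypothetical mini-scanner, far cruder than the commercial tools mentioned above; the `RISKY_MODULES` list and `Payload` class are illustrative assumptions, not anything from Hugging Face or Protect AI.

```python
import pickle
import pickletools

# Hypothetical deny-list; real scanners use far larger rule sets.
RISKY_MODULES = {"os", "posix", "nt", "subprocess", "builtins"}

def scan_pickle(data: bytes) -> list[str]:
    """Flag imports of risky modules in a pickle stream without unpickling it."""
    findings, strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-1 encode an import as "module name" in one opcode.
            module, _, name = arg.partition(" ")
            if module in RISKY_MODULES:
                findings.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocols 2+ push the module and attribute as strings first.
            if strings[-2] in RISKY_MODULES:
                findings.append(f"{strings[-2]}.{strings[-1]}")
        if isinstance(arg, str):
            strings.append(arg)
    return findings

# Stand-in for a booby-trapped model file: unpickling it would run a command.
class Payload:
    def __reduce__(self):
        return (__import__("os").system, ("echo compromised",))

report = scan_pickle(pickle.dumps(Payload()))
print(report)  # the system() import is flagged without executing anything
```

Static opcode inspection like this is why a scanner can warn users before download: the dangerous import is visible in the bytes even though the payload never runs.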
Hackers targeting Hugging Face typically insert malicious instructions into the model files that developers download, letting them hijack the victim's machine when the model is loaded or run. These are classic code-injection attacks, but hiding the payload inside a model makes them difficult to detect or trace back to the perpetrators. The need for heightened security in AI has become apparent as platforms like Hugging Face continue to grow. The platform, last valued at $4.5 billion, began in 2016 as a chatbot app aimed at teenagers and has since become a hub for machine learning research and collaboration.
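The article does not spell out the mechanism, but one well-known route is Python's pickle format, which older model-serialization tools use by default: any pickled object can nominate a callable to run at load time via `__reduce__`. The harmless sketch below (the `Payload` class and marker file are illustrative assumptions) shows how simply loading a file executes attacker-chosen code.

```python
import os
import pickle
import tempfile

# Marker file that will prove code ran; path is an illustrative stand-in.
marker = os.path.join(tempfile.mkdtemp(), "proof_of_execution")

class Payload:
    """Stand-in for a booby-trapped object hidden inside a saved model."""
    def __reduce__(self):
        # pickle calls this callable with these args at *load* time.
        # A real attack would exfiltrate credentials; this one just
        # creates a file to demonstrate that arbitrary code executed.
        return (os.system, (f"echo pwned > {marker}",))

blob = pickle.dumps(Payload())   # what the attacker uploads
pickle.loads(blob)               # what the victim's loader does
print(os.path.exists(marker))    # the command ran during deserialization
```

Nothing in the victim's code calls the payload explicitly; deserialization itself triggers it, which is why "just downloading and loading a model" is enough to be compromised.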
In response to these threats, Hugging Face has moved to harden its platform, scanning uploads for unsafe code and verifying the profiles of reputable companies. Its founders say they are committed to improving trust and safety for AI researchers and users alike, recognizing how much sensitive data and infrastructure is at stake. As the industry evolves, platforms like Hugging Face will need to keep prioritizing cybersecurity and collaborating with outside experts to address emerging threats.