An Empirical Study of Artifacts and Security Risks in the Pre-trained Model Supply Chain
This is a brief for the research paper “An Empirical Study of Artifacts and Security Risks in the Pre-trained Model Supply Chain”, published at SCORED 2022. The full paper is available here, with code and data available here. The paper was led by my student Wenxin Jiang. He also wrote this post, which I have lightly edited.
In this article, we use the word “PTM” as shorthand for “pre-trained model”.
Summary
Deep Neural Networks (DNNs) are widely used, from autonomous vehicles [1] to intrusion detection [2]. Anyone who has worked with this technology knows that building and training a DNN is challenging! Engineers can address some of these problems by reusing a PTM and fine-tuning it for their own tasks. To facilitate software reuse, engineers collaborate around model hubs: collections of PTMs and datasets organized by problem domain. Model hubs are now comparable in popularity and size to other software ecosystems (e.g., Npm, PyPI). However, the associated PTM supply chain has not yet been examined from a software engineering perspective.
In this work, we present an empirical study of artifacts and security features in 8 model hubs. We identify potential threat models and show that existing defenses are insufficient to ensure the security of PTMs. We compare the PTM and traditional supply chains and propose directions for further measurements and tools to increase the reliability of the PTM supply chain.
Our results introduce three types of model hubs (open, gated, and commercial), with varying security properties by type. We summarize two threat models and measure potential risks in the form of model discrepancies and maintainer reach. We observe two main differences between the PTM supply chain and the traditional software supply chain: versioning and security properties.
Our contributions are:
- We measure the artifacts in 8 model hubs, identify their typical structures, and depict the PTM supply chain.
- We identify the security features of different model hubs and summarize the threat models.
- We show the versioning and security risk differences between PTM supply chains and traditional supply chains.
Background
Re-use of Pre-Trained Models (PTMs)
Neural networks are expensive to develop and train, so it makes sense that software engineers would try to re-use them when possible. This figure shows different methods for PTM reuse, including model compression, transfer learning, dataset labeling, and knowledge distillation:
Initially, a PTM provider trains a DL model on a dataset to create a model checkpoint. This PTM can be reused for either the same task or a novel application.
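For instance, transfer learning typically reuses the checkpoint's learned features and retrains only a new task-specific head. Below is a minimal sketch using PyTorch/torchvision; the 10-class task and the dummy batch are placeholders for illustration, not part of the study.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a pre-trained checkpoint; the backbone's learned features are reused.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the backbone so only the new head is updated during fine-tuning.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for a hypothetical 10-class downstream task.
model.fc = nn.Linear(model.fc.in_features, 10)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch (stand-in for real data).
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()
```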
Security Risks on and with PTMs
Prior work shows that there can be adversarial attacks on and with PTMs (see previous figure). Some attacks directly target PTMs: they may modify a model's input/output behavior via backdoor/Trojan attacks, or add side effects by injecting malware into the PTM during training or inference. Other attacks, called data poisoning attacks, impact a model indirectly through its training dataset.
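As a conceptual illustration of the data-poisoning/backdoor category, the sketch below stamps a small trigger patch onto a fraction of a training batch and relabels those examples; a model trained on such data learns to associate the trigger with the attacker's chosen class. The patch size, poison fraction, and target class are arbitrary placeholders.

```python
import torch

def add_backdoor_trigger(images, labels, target_class=0, poison_fraction=0.1):
    """Stamp a small trigger patch onto a fraction of a training batch and
    relabel those examples with the attacker's target class. A model trained
    on such data learns to associate the trigger with that class."""
    poisoned_images = images.clone()
    poisoned_labels = labels.clone()
    n_poison = int(len(images) * poison_fraction)
    poisoned_images[:n_poison, :, -4:, -4:] = 1.0  # 4x4 bright patch in a corner
    poisoned_labels[:n_poison] = target_class
    return poisoned_images, poisoned_labels
```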
Beyond these PTM-specific attacks, PTMs involve a lot of traditional software components and concepts from re-use. Vulnerabilities derived from this aspect are called software supply chain vulnerabilities.
Our work builds on knowledge of the traditional software supply chain and examines how these vulnerabilities manifest in the PTM context.
Model hubs
PTM re-users gain access to PTMs in different ways. Some PTMs are simply posted to GitHub. Others are available for purchase or through an API. However, one of the most common channels echoes re-use practice in traditional software engineering: model hubs, which offer PTMs along with packaging and metadata such as authorship.
Hugging Face (HF) is the largest such model hub. At the time of writing, it hosted 60,904 public PTMs. As shown in Figure 1, the most popular PTMs on HF are downloaded at rates comparable to the popular packages in NPM and PyPI. Despite this heavy use, deep learning model hubs are still in their infancy (the earliest launched in 2018). There has been no systematic investigation of the artifacts and security risks in the PTM supply chain.
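In practice, reuse from a model hub takes only a few lines. This sketch, assuming the transformers library, pulls a published sentiment-analysis checkpoint from Hugging Face by repository name and runs it on one input; the model name is simply one popular example.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Pull a published checkpoint from Hugging Face by its repository name.
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Reuse the PTM directly for its original task (sentiment analysis).
inputs = tokenizer("Model hubs make reuse easy.", return_tensors="pt")
predicted_class = model(**inputs).logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_class])
```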
Research Questions
The characteristics and practices of secure PTM supply chains have not been studied before. Prior work either focuses on traditional software supply chain attacks, or only considers deep learning frameworks like PyTorch and TensorFlow rather than the models produced with them.
To characterize the PTM supply chain, we ask:
- What is the typical structure of model hubs?
- What practices are in place to improve security among users of the model hubs?
- What are the potential threats in model hubs?
Results & Discussion
Model hubs
To find existing model hubs, we searched a major search engine using the keywords "machine learning model hub" and "deep learning model hub". We then applied three criteria to the initial results: fitting the definition of a model hub, having an accessible website, and having accessible documentation. We identified 8 model hubs that fit these criteria.
Model hub types and contribution workflow
We divide these hubs into three types: open, gated, and commercial. The types are determined by how PTMs are contributed. For example, on Hugging Face (an open hub), anyone can upload and publish a PTM. On TensorFlow Hub (a gated hub), only PTMs approved by the TensorFlow team are published. In commercial model hubs, PTMs are offered only by internal engineers.
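To make the open contribution workflow concrete, here is a hedged sketch of publishing a checkpoint to Hugging Face with the huggingface_hub client (assuming an authenticated account; the repository and file names are placeholders).

```python
from huggingface_hub import HfApi

# Open contribution workflow: any authenticated account can create a public
# repository and publish a checkpoint. Names below are placeholders, not
# artifacts from the study.
api = HfApi()
api.create_repo(repo_id="your-username/my-model", exist_ok=True)
api.upload_file(
    path_or_fileobj="pytorch_model.bin",
    path_in_repo="pytorch_model.bin",
    repo_id="your-username/my-model",
)
```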
Measuring risks
We evaluated two potential risks: maintainer reach and model discrepancies.
Risk #1: Maintainer reach: If a maintainer’s credentials are compromised, the PTMs and datasets that they control can be modified maliciously. The more artifacts they control, the greater the risk to their users. The next figure shows that, within the Hugging Face model hub, a small number of maintainers have access to a disproportionate number of repositories. By compromising one of these accounts, attackers could influence hundreds of models and thus do harm to the PTM supply chain.
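A rough way to estimate maintainer reach, sketched below with the huggingface_hub client (assuming a recent version in which ModelInfo exposes an id attribute), is to count how many public model repositories each namespace controls. This illustrates the measurement idea; it is not the paper's tooling.

```python
from collections import Counter
from huggingface_hub import HfApi

# Estimate "maintainer reach": how many public model repositories each
# namespace (user or organization) controls. Iterating every public model
# can take a while.
api = HfApi()
reach = Counter()
for model in api.list_models():
    owner = model.id.split("/")[0] if "/" in model.id else "(unscoped)"
    reach[owner] += 1

# Accounts whose compromise would expose the most repositories.
for owner, n_models in reach.most_common(10):
    print(f"{owner}: {n_models} models")
```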
Risk #2: Model discrepancies: Ideally, a PTM’s documentation would correctly describe its performance. That way, a user could check the performance of the model they download, and determine whether the model was somehow corrupted (e.g. turned into an EvilNet or a BadNet) at some point during distribution.
We checked the correctness of the performance claims made in PTM documentation. First, we were surprised to find that many PTMs include no checkable claims at all. Of the models we examined, 8/53 object detection models, 4/26 image classification models, and 136/160 sentiment analysis models made claims we could validate. Among these, we were unable to reproduce the documented performance of a large number of PTMs, as shown in the next figure. Some models differed by more than 5% accuracy, even when they came from major technology companies (e.g., Facebook/Meta). These performance discrepancies can reduce users' awareness of a model's true behavior and make known PTM attacks harder to detect.
(For a cross-hub comparison, see the paper from my student Diego Montes at ESEC/FSE-IVR’22 here).
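Such a discrepancy check can be automated: re-evaluate the downloaded checkpoint on the benchmark named in its model card and compare the result to the documented number. The sketch below assumes a generic PyTorch classifier and dataloader; the function name, tolerance, and documented_accuracy argument are illustrative, not tooling from the paper.

```python
import torch

def check_documented_accuracy(model, dataloader, documented_accuracy, tolerance=0.01):
    """Re-measure accuracy on the benchmark named in the model card and
    compare it to the documented value. Returns the measured accuracy and
    whether it falls within the tolerance of the claim."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in dataloader:
            predictions = model(inputs).argmax(dim=-1)
            correct += (predictions == labels).sum().item()
            total += labels.numel()
    measured = correct / total
    return measured, abs(measured - documented_accuracy) <= tolerance
```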
Implications
We highlight future work in two directions:
Empirical study
The security characteristics of the PTM supply chain remain under-explored. We call for extending knowledge of traditional supply chain management to encompass DL software supply chains.
Automated tools
- Model audit: checking whether a PTM's actual behavior matches its documented claims is necessary to validate existing PTMs.
- Model scanning: scanning tools tailored to PTMs and integrated into model hubs are needed to improve hub security (see the sketch below).
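As one example of what a PTM-specific scanner might check: many checkpoints are distributed as pickle-based files, and unpickling can execute arbitrary code. The sketch below flags pickle opcodes that import or call objects, without ever loading the file; for PyTorch .pt archives, the embedded data.pkl member could be scanned the same way. The opcode set and function name are illustrative, and a practical scanner would also allowlist the benign globals that legitimate checkpoints use (e.g., torch tensor rebuild helpers).

```python
import pickletools

# Opcodes that import or call objects during unpickling; their presence in a
# checkpoint is worth flagging before the file is ever loaded.
SUSPICIOUS_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ"}

def scan_pickle(path):
    """Return (opcode, argument) pairs that could trigger code execution."""
    findings = []
    with open(path, "rb") as f:
        for opcode, arg, _pos in pickletools.genops(f):
            if opcode.name in SUSPICIOUS_OPCODES:
                findings.append((opcode.name, arg))
    return findings
```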
Conclusions
The PTM supply chain will be a critical component of software engineering over the next decade. We examined 8 model hubs, proposed three model hub types, and summarized the PTM supply chain. Our data show there is substantial room for improvement in securing these supply chains.
References
[1] Hironobu Fujiyoshi, Tsubasa Hirakawa, and Takayoshi Yamashita. 2019. Deep learning-based image recognition for autonomous driving. IATSS Research 43 (2019), 244–252.
[2] Min Du, Feifei Li, Guineng Zheng, and Vivek Srikumar. 2017. DeepLog: Anomaly detection and diagnosis from system logs through deep learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security (CCS).