March 11, 2026 · 13 min read

AI Supply Chain Attacks: How Poisoned Models and Packages Reach Production

AI Supply Chain Attacks: How Poisoned Models and Packages Reach Production

The software supply chain security crisis of the last several years - SolarWinds, Log4Shell, the XZ backdoor - trained organizations to scrutinize what software they import and where it comes from. The AI supply chain presents the same problem at larger scale and with less mature tooling.

AI supply chain attacks target the components that organizations don’t build themselves: pre-trained model weights downloaded from community repositories, Python packages that wrap ML frameworks, fine-tuning datasets scraped from public sources, third-party embeddings models, and plugins that extend AI agents. Each is a potential attack vector. Unlike traditional software supply chain attacks, the harm isn’t always immediately visible - a backdoored model may behave perfectly normally except when triggered by a specific input pattern that an attacker controls.


Anatomy of the AI Supply Chain

Understanding the attack surface requires mapping where external components enter the AI development and deployment process.

Layer 1: Foundation Models

The base model - GPT-4, Llama, Mistral, Gemini - is the foundation. For organizations using API-based models, this layer is managed by the provider and largely outside the organization’s direct control (though the provider’s security posture matters). For organizations using open-weight models, the model weights must be downloaded from a source, and that download chain is an attack surface.

Risk: A poisoned base model that behaves normally on most inputs but produces manipulated outputs for specific trigger inputs. This is not hypothetical - backdoored models in computer vision have been extensively demonstrated in academic research, and the techniques transfer to language models.

Layer 2: Fine-Tuning and Adaptation

Organizations fine-tune base models on proprietary or curated data. This introduces two attack surfaces:

Fine-tuning dataset: If any portion of the training data comes from external sources - scraped web content, third-party datasets, user-generated content - an attacker who can influence what content appears in those sources can influence model behavior.

Fine-tuning infrastructure: The training pipeline itself - including the ML frameworks used, the compute infrastructure, and the training orchestration - is an attack surface. Malicious code in a training dependency runs with the elevated privileges of the training process.

Layer 3: ML Packages and Dependencies

The Python ecosystem for machine learning is vast and poorly secured. Organizations install hundreds of ML-adjacent packages - data loading utilities, model evaluation libraries, tokenizers, quantization tools, serving frameworks - each of which is a potential attack vector.

PyPI ecosystem: The same typosquatting, dependency confusion, and account compromise attacks that affect general Python packages apply to ML packages. An ML-specific package downloaded by thousands of data scientists and ML engineers is a high-value target.

Conda ecosystem: Similar attack surface, with the additional complexity of multi-language package management and less mature security scanning tooling compared to PyPI.

Layer 4: Model Distribution Platforms

Hugging Face, Kaggle, Model Zoo, and similar platforms host hundreds of thousands of model checkpoints. Anyone can publish a model. Content moderation exists but is not comprehensive.

Model checkpoint security: The dominant format for sharing PyTorch models has historically been Python’s native serialization format, which allows arbitrary code execution on load. A model checkpoint can contain both valid model weights and malicious code that executes when the model is loaded.

Quantized model variants: Organizations frequently download quantized versions of popular models (GGUF, GPTQ, AWQ formats) to run locally with reduced compute requirements. These quantized versions are often community-produced conversions, not official releases - creating opportunity for substitution attacks.

Layer 5: Plugins, Tools, and MCP Servers

Modern AI agents extend their capabilities via tool integrations - REST API wrappers, database connectors, file system utilities, code execution environments. The Model Context Protocol (MCP) has emerged as a standard for this integration, and a growing ecosystem of third-party MCP servers is being published to package registries.

MCP server supply chain: An MCP server runs with the permissions granted to it - which may include filesystem access, network access, or access to other services. A malicious MCP server masquerading as a legitimate utility has direct access to these capabilities.

Layer 6: Inference and Serving Infrastructure

Model serving frameworks (vLLM, TGI, Triton), containerized deployment pipelines, and cloud-based inference endpoints each introduce dependencies and configuration surfaces.


Real-World Attack Cases

Case 1: Hugging Face Malicious Model Checkpoints

In 2024, security researchers discovered thousands of model checkpoints on Hugging Face that contained embedded code in their serialized format. When loaded using the default PyTorch loading function, these checkpoints executed arbitrary code on the loading machine. The malicious code ranged from benign proof-of-concept payloads to reverse shells.

What made this possible: The dominant serialization format in PyTorch is based on Python’s object serialization mechanism, which was designed for Python data structures - not as a secure format for untrusted model artifacts. Loading a model checkpoint in this format is equivalent to running arbitrary Python code.

Impact: Organizations that downloaded and loaded model checkpoints from Hugging Face without verifying their integrity were exposed to code execution on their training servers, development machines, or inference infrastructure - with whatever privileges the loading process held.

Remediation that works: The safetensors format, developed specifically to address this vulnerability, stores only tensor data and cannot execute code on load. Hugging Face has been promoting safetensors as the default format, but the ecosystem has not fully migrated. Always specify safetensors format when downloading model weights. Verify checksums against official releases.


Case 2: PyPI Typosquatting Targeting ML Engineers

Multiple documented cases of typosquatting attacks on ML-adjacent PyPI packages have targeted the specific package names that ML engineers commonly install:

  • torchvison (vs torchvision) - typosquatted package with data exfiltration payload
  • huggingface-hub vs hugging-face-hub - namespace variants
  • transformers variants targeting early CLI users

These packages typically install without error, function normally (by importing and re-exporting the legitimate package), and simultaneously execute malicious code - credential theft, environment variable exfiltration, cryptocurrency mining, or establishing persistence.

What makes ML engineers specifically vulnerable: ML projects often involve rapid environment setup from tutorials or notebooks. Copy-pasting pip install commands from untrusted sources, running setup commands from GitHub repositories without reviewing them, and working in shared Jupyter environments all increase exposure.

Remediation: Use dependency pinning with hash verification. Implement allowlists for approved packages in corporate environments. Scan new packages with tools like pip-audit before installation. Use private package mirrors with curated allowlists for production ML environments.


Case 3: Dependency Confusion in ML Infrastructure

Dependency confusion attacks exploit the way Python’s package managers resolve package names: if a package exists on a public registry (PyPI) with the same name as an internal private package, the package manager may install the public (attacker-controlled) version instead of the private (legitimate) version.

Several organizations have disclosed internal ML infrastructure components exposed to this attack. A private package named acme-ml-utils (for internal ML tooling) can be targeted by publishing acme-ml-utils on PyPI before the organization thinks to reserve the namespace. Any CI/CD pipeline or developer machine that installs this package from a default registry configuration installs the attacker’s package.

Remediation: Reserve namespace on public registries for all internal package names. Use explicit registry scoping in package manager configuration. Implement version pinning with integrity verification.


Case 4: Poisoned Fine-Tuning Datasets

Academic research has demonstrated data poisoning attacks on fine-tuned models where a small number of adversarially crafted examples (as few as 0.1% of the training corpus) reliably introduce backdoor behaviors. The poisoned model behaves normally on clean inputs but produces attacker-controlled outputs when a trigger is present.

Real-world relevance: Organizations fine-tuning models on curated datasets that include web-scraped content, open datasets, or community-contributed data are potentially exposed. The attack requires an adversary who can contribute content to the training corpus - which is feasible for publicly accessible data sources.

Detection challenge: Backdoored models are designed to look clean. Standard model evaluation on clean benchmarks will not detect the backdoor. Detecting data poisoning requires adversarial evaluation - specifically testing for behavioral anomalies in response to potential trigger patterns.


Building an AI SBOM

A Software Bill of Materials for AI systems - an AI SBOM - tracks all components that contribute to an AI system’s behavior, analogous to traditional SBOMs for software dependencies.

What an AI SBOM Should Include

Foundation model provenance:

  • Model family, version, and variant (e.g., Llama-3-70B-Instruct)
  • Source (official provider, Hugging Face repository, local fine-tune)
  • Download date and source URL
  • File hash (SHA-256) of all model weight files
  • Verification against official published checksums

Fine-tuning record:

  • Base model (as above)
  • Training dataset names, versions, and sources
  • Training framework and version
  • Training infrastructure (reproducible hash if using containerized training)
  • Fine-tune date and experimenter

ML package dependencies:

  • Full dependency graph including transitive dependencies (pip freeze or conda list)
  • Package hashes (pip install –require-hashes)
  • Source registry for each package
  • Date of dependency resolution

Plugin and tool inventory:

  • All MCP servers and plugins integrated with the agent
  • Version, source, and author for each
  • Permissions granted to each plugin
  • Date of integration

Embedding models:

  • Embedding model identity and version
  • Source and checksum
  • Date of deployment

AI SBOM Maintenance

The AI SBOM must be updated at three trigger points:

  1. Model updates - when base model, fine-tune, or embedding model changes
  2. Dependency updates - when ML packages are updated
  3. Plugin changes - when tools or MCP servers are added, updated, or removed

Automating SBOM generation in the ML CI/CD pipeline is the only practical approach at scale. Tools like cyclonedx-python can generate SBOMs from Python environments; custom tooling is typically required for model artifact tracking.


Practical Controls

Control 1: Artifact Integrity Verification

For every model artifact downloaded from an external source:

  1. Download the official checksum file from the model provider’s primary source (not the same source as the model - use the provider’s official website or a pinned release page)
  2. Verify SHA-256 of downloaded model files against the checksum
  3. Store verified checksums in your AI SBOM
  4. Re-verify at load time in production using a startup integrity check

For Hugging Face: Use the safetensors format exclusively. Verify against the model card’s listed checksums. Where possible, use the huggingface_hub library’s built-in checksum verification.


Control 2: Package Registry Hardening

For Python ML packages:

  • Use pip install --require-hashes with a pinned requirements.txt for all production environments
  • Audit requirements.txt changes in code review - treat dependency additions as security-relevant changes
  • Run pip-audit and safety check in CI/CD on every dependency update
  • Use a private package mirror (Artifactory, Nexus, AWS CodeArtifact) with allowlisted packages for production ML environments

Namespace reservation: Register your organization’s internal package names on PyPI proactively. Even if the packages are internal-only, namespace registration prevents dependency confusion attacks.


Control 3: Model Behavior Verification

For fine-tuned models before deployment:

  • Run a behavioral test suite that specifically tests for known backdoor triggers (common trigger tokens, unusual formatting, specific phrases that have been used in known backdoor research)
  • Test on adversarial inputs that probe for unexpected behavioral changes not present in the base model
  • Document baseline behavior in a model card and compare against this baseline after any fine-tuning update

Control 4: Plugin and MCP Server Vetting

Before integrating any third-party plugin or MCP server:

  • Review all source code - treat it like a privileged internal software component
  • Verify the author identity and repository provenance
  • Test in a sandboxed environment before granting production permissions
  • Apply least-privilege: grant only the permissions the plugin demonstrably requires
  • Pin the plugin version and treat updates as requiring re-review

Control 5: Training Data Auditing

For fine-tuning datasets that include external content:

  • Maintain a record of every data source (URL, dataset name, version, retrieval date)
  • Apply content filtering and anomaly detection to training data before use
  • Consider excluding content from sources you don’t control from fine-tuning datasets; use RAG for dynamic knowledge instead
  • If external content is used in fine-tuning, implement post-training behavioral testing to detect anomalies

The AI Supply Chain Security Maturity Model

Maturity LevelCapabilityIndicators
Level 1 (Ad hoc)No systematic supply chain controlsModel downloads from arbitrary sources, no checksum verification, no dependency pinning
Level 2 (Basic)Awareness and basic controlsChecksums verified manually, dependencies pinned, awareness of known risks
Level 3 (Managed)Systematic controls in CI/CDAutomated integrity verification, package scanning in CI, AI SBOM maintained
Level 4 (Advanced)Proactive supply chain securityPrivate model registry, behavioral testing for backdoors, MCP vetting program
Level 5 (Optimizing)Full supply chain assuranceThird-party audit of model provenance, continuous SBOM monitoring, supplier security requirements

Most organizations operating AI systems in 2026 are at Level 1 or Level 2. Level 3 should be the baseline for any organization operating AI systems with access to sensitive data or with business-critical functions.


Threat Intelligence for AI Supply Chain

Unlike traditional software supply chain intelligence (CVE databases, vendor advisories), AI supply chain threat intelligence is fragmented across multiple sources. Building an effective intelligence program requires monitoring:

Hugging Face Security Advisories: Hugging Face publishes malicious model reports when discovered. Subscribe to their security announcements. Their automated scanning with tools like ProtectAI’s ModelScan identifies some malicious checkpoints, but coverage is not comprehensive.

PyPI and Conda security feeds: The Python Software Foundation’s advisory database covers malicious PyPI packages. SocketDev and Phylum publish threat intelligence on supply chain attacks in the Python ecosystem, including ML-specific packages.

Academic research tracking: Many supply chain attack techniques are first demonstrated in academic papers before appearing in the wild. Tracking publications at venues like IEEE S&P, USENIX Security, and CCS on ML security and supply chain attacks gives advance warning of techniques that will eventually be weaponized.

AI/ML security communities: Communities like MLSecOps and MITRE ATLAS publish practical threat intelligence on AI-specific attacks, including supply chain vectors.

Vendor security bulletins: Follow security advisories from your ML framework vendors (PyTorch, TensorFlow, JAX), your MLOps platform vendors, and your model providers. Framework vulnerabilities can affect the security of your entire AI pipeline.

Threat Hunting in AI Supply Chain

Periodic threat hunting for supply chain indicators includes:

  • Searching your model registry for checkpoints using unsafe serialization formats
  • Auditing recently installed packages against PyPI package reputation databases
  • Checking your dependency tree for packages with unusually recent creation dates and high install counts (common attributes of typosquatting packages)
  • Reviewing MCP server code for recently added functionality that was not in previous releases

Incident Response for Supply Chain Compromise

If you suspect a supply chain compromise, the response differs from application-layer incidents:

Immediate containment: If the suspected compromise is a model weight or package, stop using the artifact immediately. If it was deployed to production, consider taking the affected service offline pending investigation. The risk of continued operation with a potentially compromised model is high.

Scope determination: Determine which systems loaded the potentially compromised artifact, and when. Every system that loaded a compromised model checkpoint or installed a malicious package is potentially compromised at the OS level (for serialization attacks) or the process level (for package attacks).

Forensic preservation: Preserve the suspicious artifact before removing it. Hash it and store the hash. This is evidence.

System integrity assessment: For any system that loaded a potentially compromised artifact, perform a standard host forensics process: check for persistence mechanisms, unusual network connections, new processes, file system modifications. Treat these systems as potentially compromised at the infrastructure level.

Clean rebuild: After confirming the scope, rebuild affected systems from known-good baselines. Do not “clean” a potentially compromised training or serving host - rebuild it.


Our AI Security Assessment service includes a supply chain component that reviews your model artifact integrity controls, ML package dependency practices, plugin vetting procedures, and AI SBOM maturity. Contact us to scope an assessment.

For supply chain monitoring in production - detecting anomalous model behavior that may indicate compromise - see secops.qa for continuous AI security operations coverage.

Know Your AI Attack Surface

Request a free AI Security Scorecard assessment and discover your AI exposure in 5 minutes.

Get Your Free Scorecard