The New Perimeter
Supply Chains and AI Environments
Hey everyone!
Today we're going to discuss something I've been tracking for a while now and working to build into our methodologies here at Arcanum. When providing penetration testing and Red Team services, and even AI-specific auditing or assessment services, it's important not to just sit in the security echo chamber, but to mimic exactly what the bad guys are doing. In this week's issue, that means targeting the supply chain and AI environments.
The AI development stack has become one of the most actively targeted surfaces in the industry, and I want to break down exactly how. At Arcanum we've been watching this shift in real time, and it's changed how we scope and run engagements. Here's the full picture.
The TL;DR
Package poisoning and dependency confusion are now top-of-mind, and AI has extended their reach. Teams pulling in AI middleware fast are exactly the environment these attacks are designed to exploit. We're incorporating both into red team scoping.
The AI-specific attack surface is its own thing. Skills marketplaces with no vetting, IDEs authenticated to frontier model APIs, agent frameworks holding cloud credentials. ClawHub, Vercel, and a string of tool CVEs show how it gets exploited.
One compromised AI environment can reach everything: API keys for every frontier model, cloud credentials, CI/CD tokens, and publishing access, all in one place.
At Arcanum, AI environments are now a legitimate assumed-breach starting point. The incident data proves what's reachable, and we've built this into how we scope red team and pen test engagements.
/ Part One: Supply Chain as Initial Access: Package Poisoning and Dependency Confusion
Supply chain attacks as an initial access vector aren't new. What's changed is the target profile. Teams building AI products move fast, pull in a lot of open-source middleware, and don't always pin versions or audit what they're importing. The packages sitting at the center of AI workflows (LLM gateways, training libraries, agent frameworks) carry enormous credential density. Attackers have done the math, and they're going after exactly this. At Arcanum we're now explicitly including supply chain compromise scenarios in red team and pen test scoping conversations, because clients aren't thinking about it yet and the real-world breach record says they should be.
Two distinct techniques show up here and they're worth understanding separately because the defense for each is different.
Package Poisoning
This is where an attacker compromises a legitimate package's publishing pipeline and pushes malicious versions under the real package name. Victims pull it in through normal dependency updates because it looks exactly like the package they trust. The campaign below is the clearest current example: TeamPCP and the Shai-Hulud cluster have been running a sustained operation against AI-specific packages since September 2025, specifically targeting the credential-dense middle layer of the AI dev stack.

The Bitwarden entry is the one I keep coming back to. A general credential stealer looks for passwords. This payload was enumerating AI coding tools by name and checking whether they were authenticated. That's targeted selection, not opportunistic compromise.
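To make "targeted selection" concrete, here's a minimal sketch of what that enumeration step looks like. The paths and variable names are illustrative guesses at common AI tool credential locations, not the actual payload's target list:

```python
import os
from pathlib import Path

# Illustrative only: plausible credential artifacts for common AI dev tools.
# These are assumptions about typical defaults, not the payload's real list.
AI_TOOL_CREDENTIALS = {
    "claude-code": Path.home() / ".claude",
    "cursor": Path.home() / ".cursor",
    "gcloud": Path.home() / ".config" / "gcloud" / "credentials.db",
    "npm-publish": Path.home() / ".npmrc",
}

def enumerate_ai_tools() -> dict[str, bool]:
    """Report which AI tools appear installed AND authenticated."""
    found = {}
    for tool, cred_path in AI_TOOL_CREDENTIALS.items():
        # "Authenticated" here just means a credential artifact exists on disk.
        found[tool] = cred_path.exists()
    # Environment variables are the other place frontier model keys concentrate.
    for var in ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY"):
        found[var] = var in os.environ
    return found

print(enumerate_ai_tools())
```

Run by a defender, this doubles as a quick exposure inventory. Run from a postinstall hook, it's the selection step before exfil. Same twenty lines.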
Dependency Confusion
In this attack you find an internal package name that a company uses privately but hasn't registered publicly on PyPI or npm, then you publish a malicious package under that name. Package managers that check public registries before private ones will pull yours. No need to compromise anything. Just find the gap and plant your package in it. It's particularly relevant for AI teams because a lot of organizations are spinning up internal AI tooling fast, under informal names that were never defensively registered. Scoping a dependency confusion test against an AI-heavy team almost always turns up unclaimed names worth flagging. We find SO MANY of these in repos on GitHub and in JavaScript. We also teach a whole module on finding these in other places in our bug hunters course ;)
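Here's a minimal sketch of the registry-gap check, assuming the requests library and a list of internal names you've scraped from requirements files or import statements (the names below are made up). PyPI's JSON API returns 404 for names nobody has claimed:

```python
import requests

def unregistered_on_pypi(internal_names: list[str]) -> list[str]:
    """Flag internal package names nobody has claimed on the public registry.

    A 404 from PyPI's JSON API means the name is unregistered and therefore
    plantable by anyone. These are the names to register defensively.
    """
    plantable = []
    for name in internal_names:
        resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
        if resp.status_code == 404:
            plantable.append(name)
    return plantable

# Hypothetical internal names for illustration.
print(unregistered_on_pypi(["acme-llm-gateway", "acme-agent-utils"]))
```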
(Sponsor)
New field guide to AI adoption for IT and security teams
Gartner predicts that 100% of IT ops work will be AI-assisted by 2030. But successful AI adoption requires more than just turning on new features...
In this guide, you’ll find:
Inspiration on what workflow automation with AI could look like for you from Vimeo, Canva, Udemy, and JAMF
A step-by-step guide to find the right AI-powered intelligent workflow platform for your organization
Human-in-the-loop best practices to ensure long-term success
/ Part Two: The AI-Specific Attack Surface: Skills, IDEs, and Agent Frameworks
This is where it gets AI-specific. Beyond the supply chain, the tools and ecosystems that are unique to AI development have their own attack surface, and it's not small. Skills marketplaces with no signing or vetting, IDEs that are permanently authenticated to frontier model APIs, agent frameworks sitting on top of cloud credentials and CI/CD tokens. We've started mapping this explicitly in pre-engagement scope conversations because clients aren't modeling it yet.
Skills Marketplaces: ClawHub
ClawHub is a marketplace for AI agent skills for the OpenClaw platform. In February 2026, Snyk found a malicious skill named "clawhub" had racked up 7,743 downloads before removal. After takedown, the attacker returned with "clawdhub1." The payload dropped a trojanized infostealer on Windows and an obfuscated shell script pulling a stage-2 payload on macOS. Skills on ClawHub run with OpenClaw's default broad permissions: filesystem access, email, shell execution. The permission model is exactly why it's worth planting a skill there.
Shoutout to @ZackKorman, who just started his own thing at embroidery.io. He's been doing great hands-on research here: built a live malicious MCP server demonstrating credential theft in practice, demonstrated a persistence attack where a malicious server rewrites its own mcp_config.json via a planted skill, and got Hermitclaw (an OpenClaw fork advertising sandbox isolation) to escape and exfil files. Follow him.
Third-Party AI Integrations: Vercel
The Vercel breach in April 2026 is the cleanest case study for how a third-party AI tool's OAuth grant becomes initial access. A Context.ai employee got infected with Lumma Stealer via game exploits in February 2026, and their Google Workspace credentials were in the stolen data. A Vercel employee had connected their enterprise account to Context.ai's AI Office Suite with broad Google Workspace permissions. The attacker walked that OAuth token into Vercel's Workspace and accessed internal environment variables and a limited set of customer credentials. A ShinyHunters persona then posted on BreachForums claiming Vercel source code, npm tokens, and GitHub tokens for $2 million. Vercel's investigation found no npm compromise; the limited credential exposure was confirmed. Vercel stewards Next.js (six million weekly downloads), which is why this got the attention it did. The foothold was an infection at a vendor. The connection was one OAuth grant from one employee.
AI Tool CVEs: The Tools Themselves Are the Vulnerability
We've hit some of these in client environments during pen tests this year. Most organizations aren't patching AI tooling because they don't think of it as attack surface.
LangChain "LangGrinch" (CVE-2025-68664, CVSS 9.3): Serialization injection in LangChain Core's dumps() and dumpd() APIs lets an attacker extract environment variable secrets or execute code via Jinja2 templates. Additional path traversal and SQL injection flaws followed in March 2026.
Claude Code CVE-2026-21852 (Patched): Planting a malicious ANTHROPIC_BASE_URL in a project's .claude/settings.json routes Claude Code's API traffic through an attacker-controlled server, capturing the developer's API key on clone-and-run with no other steps needed. Anthropic patched this by requiring directory trust confirmation before any API requests fire.
Cursor AI (CVSS 8.2): Any installed extension can read Cursor's local SQLite database, which stores API keys and session tokens in plaintext. Disclosed February 2026. No patch as of late April. Cursor has tens of thousands of daily active users running unvetted marketplace extensions.
MCP Protocol Design Flaw: OX Security disclosed an architectural RCE vulnerability in Anthropic's Model Context Protocol, touching 150M+ downloads and more than 7,000 publicly accessible servers (up to 200,000 estimated including internal deployments). Nine of 11 MCP registries they tested could be successfully poisoned. One compromised MCP server reaches every service it's connected to, because MCP servers aggregate credentials for multiple backends in a single process. Anthropic declined to modify the protocol architecture.
Hugging Face LeRobot (CVE-2026-25874, CVSS 9.3): Unauthenticated RCE via pickle.loads() on data received over unencrypted, unauthenticated gRPC connections. Rough irony: Hugging Face built Safetensors specifically because pickle is dangerous. A minimal demo of why sits right after this list.
Claude Code Source Code Leak: On March 31, 2026, Anthropic accidentally published the complete Claude Code source code via an unredacted npm source map, exposing 512,000+ lines of TypeScript. Threat actors used the confusion to distribute trojanized installers dropping Vidar infostealer and GhostSocks proxy backdoor. Accidental disclosure became active credential theft infrastructure within days.
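As promised, the pickle demo. This is a minimal, self-contained illustration of why pickle.loads() on attacker-controlled bytes is code execution rather than data parsing; the "payload" here just runs a harmless command:

```python
import os
import pickle

class Payload:
    # __reduce__ tells pickle how to "reconstruct" this object at load time.
    # An attacker returns any callable plus its arguments, and pickle.loads()
    # will invoke it. Deserialization IS execution.
    def __reduce__(self):
        return (os.system, ("id",))  # harmless stand-in for a real payload

malicious_bytes = pickle.dumps(Payload())

# The victim side: code receiving these bytes over an unauthenticated
# channel and deserializing them, LeRobot-style.
pickle.loads(malicious_bytes)  # runs `id` on the victim's machine
```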
The pattern: AI tools run with high trust, broad credential access, and hook-level execution on developer machines. A flaw in any one of them isn't just a flaw in that tool. It's a key to everything that tool can reach.
/ Part Three: The Blast Radius and Why This Changes Assumed Breach
Assumed breach testing has always started from a realistic initial access scenario: attacker has credentials, attacker is on the network, attacker owns an endpoint. Those are still valid. The last eight months of incident data make a strong case for adding a new one. A developer running a modern AI workflow sits on API keys for every frontier model the org uses, cloud credentials in their agent framework's environment, CI/CD tokens, publishing access, and AI middleware aggregating keys for a dozen external services in a single process. The Mercor breach made the blast radius concrete: entry was one compromised open-source library, exit was internal communications from a company whose clients include two of the largest AI labs in the world.
At Arcanum, we've been adding AI environments as a starting point in assumed breach scoping for exactly this reason. When "here" means a developer's machine running Claude Code, Cursor, and LiteLLM, what's reachable is almost everything the organization's AI stack touches. Clients are consistently surprised when we map this out because they think of their AI tooling as productivity software. We think of it as credential infrastructure. Both things are true at the same time, and only one of those framings shows up in most threat models. If you're running any meaningful volume of AI development and haven't scoped an assumed breach exercise from an AI environment starting point, that's a gap worth fixing.
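If you want to see your own blast radius, here's a minimal sketch of the mapping exercise. The locations are common defaults and illustrative, not an exhaustive or authoritative list; adjust for your stack:

```python
import os
from pathlib import Path

# Illustrative inventory of where credentials concentrate on a developer
# machine running a modern AI stack. Paths and variable names are common
# defaults, assumed for the sketch.
CREDENTIAL_SURFACE = {
    "frontier model keys (env)": ["ANTHROPIC_API_KEY", "OPENAI_API_KEY",
                                  "GEMINI_API_KEY", "OPENROUTER_API_KEY"],
    "cloud credentials (files)": ["~/.aws/credentials",
                                  "~/.config/gcloud/credentials.db"],
    "publishing access (files)": ["~/.npmrc", "~/.pypirc"],
    "CI/CD and VCS tokens (env)": ["GITHUB_TOKEN", "GITLAB_TOKEN"],
    "AI tooling state (files)": ["~/.claude", "~/.cursor"],
}

def map_blast_radius() -> None:
    """Print what a compromise of this machine could immediately reach."""
    for category, locations in CREDENTIAL_SURFACE.items():
        hits = []
        for loc in locations:
            if loc.startswith("~"):
                if Path(loc).expanduser().exists():
                    hits.append(loc)
            elif loc in os.environ:
                hits.append(loc)
        print(f"{category}: {hits or 'nothing found'}")

if __name__ == "__main__":
    map_blast_radius()
```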
/ Defensive Posture
Supply chain: Pin AI package versions. Don't float on latest. Audit requirements.txt and package.json for AI middleware specifically (a minimal audit sketch follows this list). Register any internal package names not yet claimed on PyPI or npm to close the dependency confusion gap defensively.
Marketplaces and integrations: Treat AI skill marketplaces like browser extension stores. Anything with filesystem, email, or shell access needs scrutiny before install. Run a grant audit across enterprise accounts and look specifically for AI productivity tools with broad Google Workspace, M365, or Slack permissions. One OAuth grant from one employee is a real threat model now.
Tool vulnerabilities: Inventory what AI tools are running on developer machines and what credentials each one can access. Patch LangChain. Watch for a Cursor patch. Audit your MCP server inventory and map what each one touches. Rotate API keys on any system that pulled a compromised package version.
Threat model: Add AI environments as a realistic assumed breach starting point. If you're scoping a red team or pen test and it doesn't include developer AI tooling as a potential beachhead, that's a gap. Map the credential blast radius from a compromised developer machine running your AI stack. The map will probably surprise you.
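And the audit sketch promised under the supply chain item, assuming a pip-style requirements.txt and an illustrative watchlist of AI middleware names (swap in your own):

```python
from pathlib import Path

# Illustrative watchlist: AI middleware worth pinning exactly.
AI_MIDDLEWARE = {"langchain", "langchain-core", "litellm", "openai", "anthropic"}

def unpinned_ai_packages(requirements: str = "requirements.txt") -> list[str]:
    """Flag AI middleware specs that float instead of pinning with ==."""
    flagged = []
    for line in Path(requirements).read_text().splitlines():
        spec = line.split("#")[0].strip()  # drop comments and whitespace
        if not spec:
            continue
        # Crude name extraction: strip version operators and extras.
        name = (spec.split("==")[0].split(">=")[0].split("~=")[0]
                    .split("[")[0].strip().lower())
        if name in AI_MIDDLEWARE and "==" not in spec:
            flagged.append(spec)
    return flagged

print(unpinned_ai_packages())
```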
/ Outro
I'm going to keep covering this space closely. Eight months ago the AI dev stack was barely on anyone's radar as an attack surface. Now it's getting hit from multiple angles by groups that clearly understand what's inside it. If you're building with AI or running assessments against organizations that are, this is yours to own.
Happy hacking!!
-Jason
(Sponsor)
Leave Threat Actors Hungry
Threat actors don't brute force their way in. They buy stolen credentials and walk through the front door.
Flare monitors 100M+ stealer logs, thousands of Telegram channels, and hundreds of dark web marketplaces to surface your exposed credentials before attackers can weaponize them. With Entra ID and Okta auto-remediation, compromised credentials are revoked the moment they appear. Stop leaving crumbs.

