May 20, 2026 · 10 min read

ChromaDB Runs Your Code Before It Checks If You're Authenticated, and 73% of Internet-Exposed Instances Still Have the Bug

CVE-2026-45829 lets an unauthenticated request fetch a model from Hugging Face and execute it on the ChromaDB host before the auth check ever fires. HiddenLayer reported it in February. The patch shipped two weeks ago. The exposure surface barely moved.

What Happened

On May 19, 2026, BleepingComputer reported on a maximum-severity flaw in ChromaDB, the open-source vector database that has become a default component of modern AI applications. The vulnerability is tracked as CVE-2026-45829, and the mechanism is unusually clean: ChromaDB authenticates the request only after it has already executed the user-supplied code.

HiddenLayer, the AI security firm that disclosed the bug, summarized it as well as anyone could: "The authentication is not missing, it is just in the wrong place. By the time it fires, the model has already been fetched and executed locally." The auth check still runs. It just runs too late to matter.

What ChromaDB Does

ChromaDB is a vector database. It stores embeddings, the numeric representations of text, images, or other content that AI models use to compare semantic similarity, and provides a query interface for the application layer to look up the most relevant matches. Every retrieval-augmented generation (RAG) pipeline, every semantic search feature, every "ask your documents" application built in the last two years has a vector database underneath it. ChromaDB is one of the two or three open-source projects most people pick for that role.
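
To make "look up the most relevant matches" concrete, here is a minimal sketch of similarity search in plain Python. It is not ChromaDB's implementation: the function names, the document ids, and the three-dimensional vectors are all made up for illustration, and a real vector database uses an index rather than a linear scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_match(query, store):
    """Return the id of the stored vector most similar to the query."""
    return max(store, key=lambda doc_id: cosine_similarity(query, store[doc_id]))

# Toy "embeddings" (hypothetical); real ones have hundreds of dimensions.
store = {
    "contract.txt": [0.9, 0.1, 0.0],
    "lab_results.txt": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]
best = top_match(query, store)  # id of the closest stored document
```

The point of the sketch is that the stored vectors carry the meaning of the documents, which is why access to them matters.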

What ChromaDB stores is highly sensitive by design. A vector database for a legal application has the embeddings of every document in the firm's matter system. A vector database for a healthcare application has the embeddings of patient records. A vector database for a corporate knowledge management system has the embeddings of internal strategy documents. The embeddings themselves are not human-readable, but they preserve enough of the original meaning that an adversary with a suitable inversion model can reconstruct content from them.

A vulnerability that lets an unauthenticated attacker run arbitrary code on the ChromaDB host gives the attacker direct access to the embeddings and the metadata, plus a foothold on whatever segment of the corporate network the database lives on.

The Auth Ordering Bug

CVE-2026-45829 is technically simple. The Python FastAPI version of ChromaDB exposes API endpoints that accept references to embedding models. When the user supplies a model name, the server fetches the model from Hugging Face and uses it to compute embeddings for the requested input. A model is not inert data: it can package executable Python code along with its weights, and loading it into the process runs that code.

The intended security property is that only authenticated users can call this endpoint. The actual security property is that authentication happens after the model has been fetched and loaded. The sequence:

The request hits the endpoint.
The endpoint extracts the model reference from the request body.
The endpoint downloads the model from Hugging Face.
The endpoint loads the model into the Python process. This step executes whatever code the model packages along with its weights.
The endpoint then checks whether the request was authenticated. If not, it returns a 401.

The attacker does not need the 401 response to be a 200. They need the load step to happen, and the load step does happen before the auth check. An attacker who publishes a malicious model to Hugging Face under their own account, then sends an unauthenticated request to a ChromaDB instance referencing that model, gets the ChromaDB host to execute their code. The 401 that comes back afterward is irrelevant.
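
The ordering problem described above can be sketched in a few lines. This is an illustration of the pattern only, not ChromaDB's code: every name here (`load_model`, `is_authenticated`, the two handlers, the token value) is a hypothetical stand-in, and the "load" step just records the model reference instead of fetching anything.

```python
# Hypothetical stand-ins; none of these names come from ChromaDB's codebase.
VALID_TOKENS = {"secret-token"}
loaded_models = []  # records every model reference the "server" loaded

def load_model(model_ref):
    """Stand-in for fetch-and-load; in the real bug this step runs model code."""
    loaded_models.append(model_ref)

def is_authenticated(token):
    return token in VALID_TOKENS

def handle_vulnerable(token, model_ref):
    load_model(model_ref)            # side effect happens first
    if not is_authenticated(token):
        return 401                   # too late: the model has already loaded
    return 200

def handle_fixed(token, model_ref):
    if not is_authenticated(token):
        return 401                   # reject before any side effect
    load_model(model_ref)
    return 200

# An unauthenticated request against each handler.
status_vuln = handle_vulnerable(None, "attacker/model")
loaded_after_vuln = list(loaded_models)

loaded_models.clear()
status_fixed = handle_fixed(None, "attacker/model")
loaded_after_fixed = list(loaded_models)
```

Both handlers return 401 to the unauthenticated caller. The difference is what happened on the server before the response: the vulnerable handler has already loaded the attacker's model, and the fixed one has loaded nothing.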

Hugging Face as the Delivery Channel

What makes the bug interesting beyond its immediate exploitation is its supply-chain shape. The malicious code does not have to be uploaded to the ChromaDB server. It does not have to be sent in the request body. It lives on Hugging Face, the public model repository, and ChromaDB obediently fetches it on demand.

This is the same delivery pattern that the OpenOSS Privacy Filter repo on Hugging Face used to ship a Rust infostealer earlier this month. Hugging Face has become the package manager of the AI world, and like every package manager before it, the security model rests on the assumption that authors are who they say they are and that nothing they publish is malicious. The platform's review surface is enormous, the moderation team is small, and the incentives for attackers to publish weaponized models are the same ones that brought us npm typosquatting and PyPI cryptominer drops.

For an attacker exploiting CVE-2026-45829, the operational flow is short. Publish a malicious model. Find an exposed ChromaDB instance. Send one HTTP request. The server fetches the model and runs it. The attacker has a shell on the database host.

The 73 Percent Number

HiddenLayer ran Shodan queries against the population of ChromaDB instances reachable from the public internet. Approximately 73 percent of them are running versions in the vulnerable range—1.0.0 through 1.5.8. The patch is in 1.5.9, which shipped two weeks before the public disclosure. The gap between patch availability and patch deployment is the security industry's perennial problem, and it is especially acute for AI infrastructure components.
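
For operators taking inventory, the version range above translates into a simple check. This is a minimal sketch that assumes version strings are plain numeric MAJOR.MINOR.PATCH values; the function names are illustrative, and pre-release or suffixed version strings would need extra handling.

```python
def parse_version(text):
    """Turn '1.5.8' into (1, 5, 8). Assumes plain numeric MAJOR.MINOR.PATCH."""
    return tuple(int(part) for part in text.split("."))

# Range taken from the article: 1.0.0 through 1.5.8 vulnerable, fixed in 1.5.9.
FIRST_VULNERABLE = (1, 0, 0)
FIRST_PATCHED = (1, 5, 9)

def is_vulnerable(version_text):
    """True if the version falls inside the reported vulnerable range."""
    return FIRST_VULNERABLE <= parse_version(version_text) < FIRST_PATCHED
```

Comparing tuples of integers avoids the string-comparison trap where "1.10.0" would sort before "1.5.9".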

Vector databases are deployed by AI teams, and AI teams have historically not been the teams that handle infrastructure patching. The result is a population of internet-exposed instances that nobody is responsible for keeping current. A ChromaDB pod spun up by a data scientist for a proof of concept three months ago is probably still running. It is probably still exposed. It is probably still on 1.5.7.

The 73 percent figure also masks the more practical concern: many of these instances are not just running vulnerable software but are also missing authentication entirely. ChromaDB ships with no auth by default. The user has to opt in by configuring a token. The same teams that have not patched are also the teams that have not configured auth. The attacker does not even have to exploit the ordering bug—they just connect, run queries, and walk away with the embeddings.

What Operators Should Do

If you operate a ChromaDB instance:

Patch to 1.5.9 or later. The fix moves the authentication check to the correct position in the request lifecycle.
Remove the database from the public internet. Vector databases should not be reachable from the open web. Put them behind a VPN, a private subnet, or at minimum a strict allow list.
Switch to the Rust frontend if you can. The Rust deployment was not affected by this bug. The Python FastAPI version is the one with the ordering issue.
Audit what models your instances have loaded recently. The exploit fetches arbitrary models from Hugging Face. If your ChromaDB instance has fetched models you do not recognize, treat it as compromised.
Rotate everything the host could access. Once an attacker has code execution on the database, they have whatever credentials, environment variables, and network access the database had.
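
The model audit in the list above can be approximated with a short script. This sketch rests on two assumptions that may not hold for your deployment: that fetched models land in a Hugging Face hub-style cache directory (by default `~/.cache/huggingface/hub`), and that the cache uses the usual `models--org--name` folder naming. The function names and the allow list are hypothetical.

```python
from pathlib import Path

def cached_model_ids(cache_dir):
    """List model ids found in a Hugging Face hub-style cache directory.

    Assumes folders are named 'models--org--name' (or 'models--name').
    """
    ids = []
    for entry in sorted(Path(cache_dir).iterdir()):
        if entry.is_dir() and entry.name.startswith("models--"):
            ids.append(entry.name[len("models--"):].replace("--", "/"))
    return ids

def unexpected_models(cache_dir, allowed):
    """Return cached model ids that are not on the operator's allow list."""
    return [model_id for model_id in cached_model_ids(cache_dir) if model_id not in allowed]

# Example call (path and allow list are assumptions about your setup):
# unexpected_models(Path.home() / ".cache" / "huggingface" / "hub",
#                   {"sentence-transformers/all-MiniLM-L6-v2"})
```

An empty result is not proof of safety, since an attacker with code execution could remove cache entries. A non-empty result is a strong reason to treat the host as compromised.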

The patching window for this one is also the disclosure window. HiddenLayer reported the bug in February. The patch shipped two weeks before public disclosure. Attackers have had access to the technical details since the disclosure, and the population of vulnerable systems is large enough that scanning for exposed ChromaDB instances is a profitable use of an afternoon. The race between defenders and attackers on this one started the moment the writeup went up.

The Broader AI Infrastructure Problem

ChromaDB is one of dozens of AI infrastructure components that became production-critical in the last 24 months. Vector databases, model-serving frameworks, embedding pipelines, agent orchestration platforms, RAG toolkits: all of them are open-source projects that were experimental in 2023 and load-bearing in 2026. Most of them ship with auth turned off by default. Most of them have not been through a formal security audit. Most of them are operated by teams that have never run production infrastructure before.

CVE-2026-45829 is the maximum-severity entry in a category that will produce a lot more entries. The Marimo Python AI notebook RCE was the same shape: AI tooling, no auth by default, internet-exposed, trivial exploitation. The LiteLLM supply-chain attack on Mercor was a different shape but the same lesson: the AI infrastructure stack is a soft target, and the consequences run all the way to the data the stack was built to handle.

The next twelve months of AI-focused breaches are going to be defined by these components. The patches exist. The deployment discipline is the gap.