The MCP Security Threat Model — Top 10 Risks Teams Miss
A working threat model for MCP servers, distilled from two dozen engagements. Not a checklist — a ranked list of the ten failure modes we see shipped to production, and what it costs to fix each one after launch.
Over the last eighteen months we’ve reviewed two dozen MCP servers in production or near it. They range from five-tool demos that got promoted to customer-facing status to multi-tenant surfaces serving hundreds of millions of invocations a week. The teams are thoughtful. The tools are competently built. And almost all of them have the same ten problems.
This essay is our working threat model. It is not a checklist. A threat model is a tool for thinking, and the moment it becomes a checklist, it stops doing its job. Use it as a prompt — for each risk, ask “is this us?” and then ask “would we know?”
The frame
An MCP server is, from a security perspective, three things at once: a public API, a prompt surface, and a deputy for the model that calls it. The interesting bugs live at the intersections.
The API frame gives us authorization, rate limits, and audit logging. The prompt frame gives us injection surfaces, untrusted input, and the blurry line between data and instructions. The deputy frame gives us the confused-deputy problem dressed up in new clothes: the agent acts on behalf of a user, but with the server’s privileges, and the handoff is where trust quietly escapes.
R1 · Tool-level data leaks
The first risk is also the easiest to verify and the most common to find. A tool’s docstring advertises one shape; its handler returns a superset. The agent, being a good guest, will only use the advertised fields — until an adversary prompts it otherwise, or a downstream logging path serializes the full response, or a different model version decides the extra fields are worth surfacing.
The fix is pedestrian: a response schema validated in the handler, failing closed. The cost is an afternoon. The cost of finding this in production is a breach post-mortem.
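A minimal sketch of what "validated in the handler, failing closed" could look like. The field names, the `ADVERTISED_FIELDS` table, and the `validate_response` helper are all hypothetical; a real server would more likely reuse whatever schema library it already depends on.

```python
from typing import Any

# Hypothetical advertised shape for one tool: field name -> required type.
ADVERTISED_FIELDS: dict[str, type] = {"id": str, "name": str, "plan": str}


class ResponseSchemaError(Exception):
    """Raised when a handler result does not match the advertised shape."""


def validate_response(payload: dict[str, Any], schema: dict[str, type]) -> dict[str, Any]:
    """Return the payload only if it matches the schema exactly; otherwise raise."""
    extra = set(payload) - set(schema)
    missing = set(schema) - set(payload)
    if extra or missing:
        # Fail closed: an unexpected field is an error, not something to pass through.
        raise ResponseSchemaError(f"extra={sorted(extra)} missing={sorted(missing)}")
    for field, expected in schema.items():
        if not isinstance(payload[field], expected):
            raise ResponseSchemaError(f"{field} is not {expected.__name__}")
    return payload
```

Raising on extra fields, rather than silently dropping them, is a deliberate choice in this sketch: the superset shows up in tests instead of in a log pipeline.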
R2 · Broken authorization at the tool boundary
Most MCP servers we review authorize at the session, not at the tool. A caller authenticates, gets a session, and then every tool trusts that the session is good enough. It isn’t.
The right mental model is that each tool is a privileged operation that needs its own authorization policy, expressed against the session principal. In practice: a read tool checks that the principal can see the requested resource. A write tool checks that the principal can make the requested change, and logs it. A dangerous tool — one that can pay money, send mail, or delete data — requires a fresh confirmation that the session is still user-driven.
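One way to express that model is a policy table keyed by tool name, checked before every dispatch. This is a sketch under assumptions: the `Principal` shape, the tool names, and the `confirmed` flag standing in for a fresh user confirmation are all illustrative, not part of any MCP SDK.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Principal:
    user_id: str
    readable: frozenset[str] = frozenset()
    writable: frozenset[str] = frozenset()


class Forbidden(Exception):
    pass


# A policy receives the principal, the target resource, and whether the user
# freshly confirmed this specific call.
Policy = Callable[[Principal, str, bool], bool]

POLICIES: dict[str, Policy] = {
    "read_doc": lambda p, rid, confirmed: rid in p.readable,
    "update_doc": lambda p, rid, confirmed: rid in p.writable,
    # Dangerous tool: permission alone is not enough, confirmation is required.
    "delete_doc": lambda p, rid, confirmed: rid in p.writable and confirmed,
}


def authorize(tool: str, principal: Principal, resource_id: str, confirmed: bool = False) -> None:
    policy = POLICIES.get(tool)
    # A tool with no policy is denied, so a valid session alone never suffices.
    if policy is None or not policy(principal, resource_id, confirmed):
        raise Forbidden(f"{principal.user_id} may not call {tool} on {resource_id}")
```

Denying tools that have no policy entry means a newly added tool is unreachable until someone writes its policy, which is the failure direction you want.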
R3 · Indirect prompt injection via ingested resources
Every resource the agent ingests is attacker-controlled unless proven otherwise. Tickets have bodies. Wiki pages have contents. Webhooks have payloads. File uploads have filenames.
The right assumption is that any of these can instruct the model. The mitigation is not to strip instructions out — you can’t, reliably — but to make the agent’s behavior invariant under adversarial input: a session that has ingested untrusted content cannot call write tools without a fresh user confirmation. That single rule kills the majority of what “prompt injection” actually means in production.
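The rule is small enough to sketch directly. The `Session` class, the `WRITE_TOOLS` set, and the tool names below are assumptions for illustration; the point is only that the taint flag is sticky and that confirmation is checked per call.

```python
from dataclasses import dataclass

WRITE_TOOLS = {"send_email", "close_ticket"}  # hypothetical tool names


class ConfirmationRequired(Exception):
    pass


@dataclass
class Session:
    tainted: bool = False  # set once any untrusted content enters the context

    def ingest(self, content: str, trusted: bool = False) -> str:
        # Content is treated as untrusted unless the caller says otherwise.
        if not trusted:
            self.tainted = True
        return content

    def check_call(self, tool: str, user_confirmed: bool = False) -> None:
        # Read tools pass; write tools on a tainted session need confirmation.
        if tool in WRITE_TOOLS and self.tainted and not user_confirmed:
            raise ConfirmationRequired(tool)
```

In this sketch a confirmation does not clear the taint. Each subsequent write call needs its own confirmation, which matches the "fresh" requirement in the rule.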
R4 · Unadvertised tools that remain callable
We find this in about half of engagements. A tool is omitted from the tools/list response but the server still resolves it if called by name. The author’s mental model was “hidden,” not “gone.”
List omission is not access control. Ever. Either unregister the tool or gate it behind a principal check. If it’s an administrative tool, the gate should require a distinct principal class — not just an elevated flag on a regular session.
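A sketch of a registry in which listing and calling go through the same principal-class check, so a tool cannot be hidden from the list yet still callable. `ToolRegistry` and the principal class names are hypothetical and do not correspond to any particular MCP framework.

```python
from typing import Callable


class ToolNotFound(Exception):
    pass


class ToolRegistry:
    def __init__(self) -> None:
        # name -> (handler, principal classes allowed to see and call it)
        self._tools: dict[str, tuple[Callable[..., object], frozenset[str]]] = {}

    def register(self, name: str, handler: Callable[..., object], allowed: set[str]) -> None:
        self._tools[name] = (handler, frozenset(allowed))

    def _visible(self, name: str, principal_class: str) -> bool:
        entry = self._tools.get(name)
        return entry is not None and principal_class in entry[1]

    def list_tools(self, principal_class: str) -> list[str]:
        return sorted(n for n in self._tools if self._visible(n, principal_class))

    def call(self, name: str, principal_class: str, **kwargs: object) -> object:
        # Same predicate as list_tools: unlisted means uncallable.
        if not self._visible(name, principal_class):
            raise ToolNotFound(name)
        return self._tools[name][0](**kwargs)
```

Returning the same "not found" error for missing and forbidden tools also avoids confirming to a caller that an administrative tool exists.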
R5 · No per-caller rate limiting on read tools
MCP servers usually sit in front of quota-metered downstream services. Without per-principal rate limits, one agent session can exhaust a daily search quota in minutes. This is rarely a direct security breach, but it’s reliably an availability incident. And it’s trivially weaponizable by a malicious prompt asking the agent to “enumerate everything about topic X.”
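A per-principal token bucket is one common way to cap this. The sketch below is in-memory and single-process, with an injectable clock so it can be tested; the class name and parameters are illustrative, and a multi-instance deployment would need shared state instead.

```python
import time
from typing import Callable


class PrincipalRateLimiter:
    """Token bucket per principal: `capacity` burst, `refill_per_sec` sustained."""

    def __init__(self, capacity: int, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        # principal id -> (tokens remaining, time of last update)
        self._buckets: dict[str, tuple[float, float]] = {}

    def allow(self, principal_id: str) -> bool:
        now = self.clock()
        tokens, last = self._buckets.get(principal_id, (float(self.capacity), now))
        # Refill for the elapsed time, never beyond capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.refill_per_sec)
        if tokens < 1:
            self._buckets[principal_id] = (tokens, now)
            return False
        self._buckets[principal_id] = (tokens - 1, now)
        return True
```

Keying the bucket on the principal rather than the session matters here: an agent that opens a new session per request should not get a fresh quota each time.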
R6 · Silent tools
A tool whose invocations are not logged with caller identity and arguments is a tool you cannot debug, audit, or defend. We find silent tools everywhere, usually because logging was added at the framework level after the tools were written, and the older tools quietly opted out by never inheriting the right middleware.
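A sketch of an audit wrapper plus a check that catches tools registered without it. The `audited` decorator, the `mcp.audit` logger name, and the handler signature (principal id first, then keyword arguments) are assumptions for illustration. Arguments may themselves contain secrets, so a real deployment would likely redact before logging.

```python
import functools
import json
import logging
from typing import Callable

# Handlers and level for this logger are assumed to be configured by the app.
audit_log = logging.getLogger("mcp.audit")


def audited(tool_name: str) -> Callable:
    """Log tool name, caller identity, and arguments before each invocation."""
    def decorator(handler: Callable) -> Callable:
        @functools.wraps(handler)
        def wrapper(principal_id: str, **arguments: object) -> object:
            audit_log.info(json.dumps(
                {"tool": tool_name, "principal": principal_id, "arguments": arguments},
                default=str,
            ))
            return handler(principal_id, **arguments)
        wrapper.audited = True  # marker used by unaudited_tools below
        return wrapper
    return decorator


def unaudited_tools(registry: dict[str, Callable]) -> list[str]:
    """Names of registered handlers that were never wrapped with @audited."""
    return sorted(n for n, h in registry.items() if not getattr(h, "audited", False))


@audited("search")
def search(principal_id: str, query: str) -> list[str]:
    return [query]


def legacy_export(principal_id: str) -> str:  # an older tool that missed the wrapper
    return "export"
```

Running `unaudited_tools` over the registry at startup, and refusing to boot if it returns anything, turns a silent tool into a failed deploy.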
R7 · Cross-tenant bleed in multi-tenant servers
The tenant on the session should be the tenant on the database query. In practice, we see tenant checks performed in one layer and tenant-free queries run in another. The pattern to look for is any tool handler that takes an identifier from the caller and resolves it without joining through the tenant.
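In query terms, the caller-supplied identifier and the session's tenant belong in the same WHERE clause. A minimal sketch using the standard-library `sqlite3` module; the `invoices` table and its columns are made up for the example.

```python
import sqlite3
from typing import Optional


def demo_db() -> sqlite3.Connection:
    """In-memory database with one invoice per tenant, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE invoices (id INTEGER PRIMARY KEY, tenant_id TEXT NOT NULL, amount INTEGER)"
    )
    conn.executemany("INSERT INTO invoices VALUES (?, ?, ?)", [(1, "t1", 100), (2, "t2", 250)])
    return conn


def get_invoice(conn: sqlite3.Connection, session_tenant_id: str, invoice_id: int) -> Optional[tuple]:
    # The tenant comes from the session, never from the caller's arguments,
    # and is part of the lookup itself rather than a check in another layer.
    return conn.execute(
        "SELECT id, tenant_id, amount FROM invoices WHERE id = ? AND tenant_id = ?",
        (invoice_id, session_tenant_id),
    ).fetchone()
```

Another tenant's invoice comes back as `None`, indistinguishable from a missing one, so the tool does not reveal which identifiers exist elsewhere.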
R8 · Sampling leaks
When the MCP server invokes sampling back into the client’s model, the request body can include context that doesn’t belong in the model provider’s logs. Most teams we review haven’t thought about this surface at all, and default to sending whatever the tool had on hand.
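One hedge against this is to build the sampling prompt from an explicit allowlist of context keys rather than from the tool's whole working state. The key names and the `build_sampling_prompt` helper below are hypothetical, and the sketch only assembles the prompt text; it does not construct a protocol message.

```python
# Hypothetical keys this tool has decided are safe to share with the model.
SAMPLING_ALLOWED_KEYS = frozenset({"ticket_title", "ticket_body"})


def build_sampling_prompt(instruction: str, tool_context: dict[str, str]) -> str:
    """Assemble prompt text from allowlisted context only; everything else is dropped."""
    shared = {k: v for k, v in tool_context.items() if k in SAMPLING_ALLOWED_KEYS}
    lines = [instruction] + [f"{key}: {shared[key]}" for key in sorted(shared)]
    return "\n".join(lines)
```

An allowlist fails in the safe direction: a field added to the tool's context later stays out of sampling requests until someone decides it belongs there.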
R9 · Token leakage through error messages
Classic, but still common. An upstream error contains a short-lived token; the MCP layer wraps it into a tool response verbatim; the model sees it; it lands in a user-visible chat log. The fix is a scrubbing layer on all outbound tool errors.
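A sketch of such a scrubbing layer using regular expressions. The patterns are illustrative and certainly not exhaustive; they cover bearer tokens, JWT-shaped strings, and `key=value` pairs with secret-looking names, and a real deployment would add patterns for its own upstreams.

```python
import re

# (pattern, replacement) pairs, applied in order. Illustrative, not exhaustive.
_SCRUB_RULES: list[tuple[re.Pattern[str], str]] = [
    # "Bearer <token>" in echoed Authorization headers.
    (re.compile(r"(?i)bearer\s+[A-Za-z0-9._~+/=-]+"), "[REDACTED]"),
    # JWT-shaped strings: three base64url segments starting with "eyJ".
    (re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"), "[REDACTED]"),
    # key=value or key: value where the key looks secret; the key name is kept.
    (re.compile(r"(?i)\b(api[_-]?key|token|secret)\b(\s*[=:]\s*)\S+"), r"\1\2[REDACTED]"),
]


def scrub_error(message: str) -> str:
    """Return the error message with anything token-shaped replaced."""
    for pattern, replacement in _SCRUB_RULES:
        message = pattern.sub(replacement, message)
    return message
```

Applying this at the single point where errors become tool responses, rather than in each handler, keeps older tools from opting out by accident.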
R10 · Shadow servers
The server nobody owns. A developer spun it up for a demo in Q2 against real data, it stuck, and now a production chat path depends on it. No oncall, no logging, no threat model, no one to page when it misbehaves.
The only real defense is inventory. An MCP server that is not on your inventory is a liability you are not paying attention to. An MCP server on your inventory is a system you can harden.
What to do Monday morning
Pick one surface. Inventory its tools. For each tool, answer three questions: who can call it, what does it return, and where is the log of the last hundred calls. If you can’t answer all three, that’s the work.
If you’d like a second pair of eyes on it, that’s the work we do. Thirty minutes, free, here.