Gated
The audit layer for MCP servers

The MCP servers that matter most are the ones nobody can scan.

They're behind your VPN, on a workstation, in a private subnet. Gated audits those — the same way it audits the public ones.

Early access. No spam, no launch countdown emails.

Audit targets: public + private

mcp.acme.com        Public endpoint              Audited
mcp.internal.acme   Private · behind the VPN     Audited
localhost:8131      Private · on a workstation   Audited
10.0.2.7:9000       Private · in a subnet        Audited

01 / why now

MCP servers ship faster than anyone audits them. A server goes from prototype to mounted-in-production in an afternoon — the review that would catch a leaking tool or a malformed error envelope never happens.

The agent on the other end is an untrusted component with broad tool access. It will call what it is offered. Every tool, resource, and prompt a server exposes is reachable surface, and most of it was never read by a human.

That surface also has a running cost. It is loaded at initialize, counted against the context window every session, and measured by no one. Shipping the server is the cheap part. Everything it quietly does afterward is not.

02 / what gated does

Gated points two engines at a server and merges what they find into one ranked report — with a reproduction for every finding.

01 The catalog

231 deterministic checks

Across security, conformance, quality, reliability, and cost. Each maps to one family and one severity, runs at a chosen intensity, and is identical on every run — the baseline you can put in CI.

02 The arena

An adversarial LLM

An agent that actively tries to break the server — not only to get in, but to surface flaws across all five families. It improvises the attacks a fixed checklist can't, then files them the same way.

03 / coverage

Five families. 231 checks.

Every check belongs to one family and runs from a declared intensity upward. How the catalog spreads across the two axes:

Family        passive  probe  explore  adversarial  Total
Security           24     44       21            1     90
Conformance        37     40        9  coming soon     86
Quality            21      1        1  coming soon     23
Reliability        10      3        6  coming soon     19
Cost                1     11        1  coming soon     13
All families       93     99       38            1    231

Counts grow as the catalog does. Browse the full catalog.

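Because each intensity also runs the levels before it, the column counts accumulate. The sketch below works through that arithmetic with the counts from the coverage table. The dictionary layout and the function name `checks_run_at` are illustrative assumptions, not part of Gated, and "coming soon" cells are counted as zero.

```python
# Per-family counts, keyed by the intensity at which each check first runs.
# Numbers are copied from the coverage table; "coming soon" is treated as 0.
CATALOG = {
    "Security":    {"passive": 24, "probe": 44, "explore": 21, "adversarial": 1},
    "Conformance": {"passive": 37, "probe": 40, "explore": 9,  "adversarial": 0},
    "Quality":     {"passive": 21, "probe": 1,  "explore": 1,  "adversarial": 0},
    "Reliability": {"passive": 10, "probe": 3,  "explore": 6,  "adversarial": 0},
    "Cost":        {"passive": 1,  "probe": 11, "explore": 1,  "adversarial": 0},
}

# Lowest to highest intensity.
LEVELS = ["passive", "probe", "explore", "adversarial"]


def checks_run_at(level: str) -> int:
    """Total checks executed at `level`, including every lower level."""
    included = LEVELS[: LEVELS.index(level) + 1]
    return sum(row[name] for row in CATALOG.values() for name in included)


for level in LEVELS:
    print(level, checks_run_at(level))
# passive 93
# probe 192
# explore 230
# adversarial 231
```

Under these assumptions, a Probe scan runs 192 checks rather than the 99 shown in its own column.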
04 / intensity

You choose how hard it looks.

Four strictly ordered levels. Each is a superset of the one before it, so Explore runs Passive and Probe too. The progression: what the server says about itself, then whether it does what it says, then what happens when you use the whole surface for real, then what a determined attacker can extract.
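One way to picture the ordering is an ordered enum plus a filter: a check declares the lowest intensity at which it runs, and a scan picks up every check at or below the chosen level. This is a minimal sketch, not Gated's implementation. `Intensity`, `Check`, `select`, and the check IDs are all hypothetical names.

```python
from dataclasses import dataclass
from enum import IntEnum


class Intensity(IntEnum):
    """Ordered so that comparisons express 'runs at this level or higher'."""
    PASSIVE = 1
    PROBE = 2
    EXPLORE = 3
    ADVERSARIAL = 4


@dataclass(frozen=True)
class Check:
    check_id: str             # hypothetical identifier
    family: str               # one of the five families
    min_intensity: Intensity  # lowest level at which the check runs


def select(checks: list, chosen: Intensity) -> list:
    """Return every check whose declared intensity is at or below `chosen`."""
    return [c for c in checks if c.min_intensity <= chosen]


# Illustrative catalog entries, not real Gated checks.
checks = [
    Check("schema-lint", "Quality", Intensity.PASSIVE),
    Check("boundary-inputs", "Security", Intensity.PROBE),
    Check("rate-limit-burst", "Reliability", Intensity.EXPLORE),
    Check("prompt-injection", "Security", Intensity.ADVERSARIAL),
]

print([c.check_id for c in select(checks, Intensity.EXPLORE)])
# ['schema-lint', 'boundary-inputs', 'rate-limit-burst']
```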

Passive

Validates everything the server says about itself — serverInfo, declared capabilities, and the full schemas of every tool, resource, and prompt — without ever making a real tool call. The “lint” intensity: safe on production at any time, safe in CI on every PR.
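A passive check of this kind can be sketched as a pure function over a tool definition, with no call to the tool itself. The example assumes the `name`, `description`, and `inputSchema` fields of an MCP tool listing. The function name `lint_tool` and the three rules are illustrative, not checks taken from Gated's catalog.

```python
def lint_tool(tool: dict) -> list[str]:
    """Inspect one declared tool definition and return a list of findings."""
    findings = []
    name = tool.get("name")
    if not isinstance(name, str) or not name:
        findings.append("tool has no name")
    if not tool.get("description"):
        findings.append(f"{name!r}: missing description")

    schema = tool.get("inputSchema")
    if not isinstance(schema, dict) or schema.get("type") != "object":
        findings.append(f"{name!r}: inputSchema is not a JSON Schema of type 'object'")
    else:
        # Every field listed as required should also be declared in properties.
        declared = schema.get("properties", {})
        for field in schema.get("required", []):
            if field not in declared:
                findings.append(f"{name!r}: required field {field!r} is not declared")
    return findings


# Hypothetical tool definition with two problems: no description,
# and a required field that is never declared.
tool = {
    "name": "search_orders",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query", "limit"],
    },
}

for finding in lint_tool(tool):
    print(finding)
```

Nothing here touches the target beyond reading what it already advertises, which is what makes this level safe to run on every PR.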

Probe

Starts touching the target, but only safely. Loads resources, calls non-destructive tools, and forces validation failures — negative numbers, out-of-range values, malformed inputs — to see how the server defends its boundaries. No destructive tool is ever called.
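Forced validation failures can be derived from the tool's own schema. The sketch below builds one invalid argument set per declared numeric bound; a probe would then send each one and expect a clean validation error. It handles only `integer` and `number` properties with `minimum` or `maximum`, and `boundary_violations` is a hypothetical helper, not a Gated API.

```python
def boundary_violations(schema: dict) -> list[dict]:
    """Build one out-of-range argument set per numeric bound in `schema`."""
    numeric = {
        key: prop
        for key, prop in schema.get("properties", {}).items()
        if prop.get("type") in ("integer", "number")
    }
    # Baseline values that satisfy each declared bound (or 0 if none is declared).
    baseline = {
        key: prop.get("minimum", prop.get("maximum", 0))
        for key, prop in numeric.items()
    }
    cases = []
    for key, prop in numeric.items():
        if "minimum" in prop:
            cases.append({**baseline, key: prop["minimum"] - 1})  # just below the floor
        if "maximum" in prop:
            cases.append({**baseline, key: prop["maximum"] + 1})  # just above the ceiling
    return cases


# Hypothetical input schema for a paginated tool.
schema = {
    "type": "object",
    "properties": {
        "limit": {"type": "integer", "minimum": 1, "maximum": 100},
    },
}

print(boundary_violations(schema))
# [{'limit': 0}, {'limit': 101}]
```

Each case violates exactly one bound, so a server that accepts it, or fails with something other than a validation error, points to a specific unguarded boundary.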

Explore

Goes all in on legitimate use. Calls every tool with LLM-generated arguments, walks pagination chains, bursts calls to probe rate-limit behavior, and opens many concurrent connections. Best on staging, or production with explicit opt-in.

Adversarial

An LLM-driven attack — prompt injection, tool poisoning, and sustained multi-step exploitation chains, equivalent to a senior tester actively trying to break the server. Requires explicit opt-in per scan.

FAQ

Straight answers. No fine print.

Missing a question? Email hello@gated.cc — we'd rather answer it directly.

Gated runs five families of checks against every MCP server you point it at.

Security: Can the server be coerced into doing something it shouldn't? Auth gaps, injection surfaces, token leaks, tools exposed without scope.

Conformance: Does the server follow the specs it claims to? MCP, OAuth, JSON-RPC, HTTP, TLS: all the contracts a stable building block depends on.

Quality: Is it pleasant to integrate against? Schema correctness, error shape, descriptive metadata, predictable tool behavior.

Reliability: Does it stay correct under load and partial failure? Timeouts, retries, idempotency, cancellation, behavior at the edges.

Cost: Will it bankrupt the team that adopts it? Payload bloat, chatty tools, unbounded responses, expensive defaults.

05 / early access

Get in before launch.

Join the waitlist for early access — and 300 free scans when we launch, enough to audit a real fleet of servers before you decide.

We email once, when your access is ready.