API Relay Audit — AI 中转站安全审计

What Does It Detect?

Threat taxonomy based on Liu et al., "Your Agent Is Mine" (arXiv:2604.08407)

Step 1-2

Infrastructure Recon

DNS, CDN, SSL certificate, management panel fingerprint, model list enumeration — understand what's behind the relay.

Step 3

Token Injection (AC-1)

Compares actual token usage against expected values. Hidden system prompt injection adds extra tokens — the delta reveals it.

Step 4 & 6

Prompt Extraction

3 attack vectors attempt to extract hidden system prompts: verbatim recall, translation trick, JSON continuation. Plus jailbreak resistance tests.

Step 5

Identity Substitution

24 keywords detect if "Claude" is actually GPT, DeepSeek, GLM, Qwen, or other models in disguise. Anchor phrases confirm true identity.

Step 7

Context Truncation

5 canary markers + binary search pinpoint the real context window boundary. Is your 200K context really 200K?

Step 8 (AC-1.a)

Tool-Call Rewriting

Checks if the relay silently modifies package install commands in responses — typosquatting supply-chain attacks at the proxy layer.

Step 9 (AC-2)

Error Response Leakage

7 deliberately broken requests probe for API key, env vars, file paths, and LiteLLM internals leaking in error responses.

Step 10-11

Stream Integrity & Web3

SSE event whitelist, usage monotonicity, thinking signature validity, model identity check. Plus Web3 signature-isolation probes (profile-gated).

How We Compare

Three tools, three approaches — pick the right one for your needs

Dimension	api-relay-audit	hvoy.ai	cctest.ai
Token Injection	✓	✕	✓
Prompt Extraction	✓	✕	✕
Identity Substitution	✓	✓	✕
Jailbreak Resistance	✓	✕	✕
Context Truncation	✓	✕	✕
Tool-Call Rewriting (AC-1.a)	✓	✕	✕
Error Response Leakage (AC-2)	✓	✕	✕
Stream Integrity (SSE)	✓	✓	✕
Web3 Injection	✓	✕	✕
Channel Fingerprint	Soon	✕	✓
Local Execution (Key stays local)	✓	✕	✕
Fully Open Source	✓	Partial	✕
Public Leaderboard	✕	✓	✕
Structured Audit Report	✓	✕	✕

FAQ

What is an API relay / proxy?

An API relay (also called "中转站" in Chinese) is a third-party service that sits between you and an AI provider like Anthropic or OpenAI. You send your requests to the relay, and it forwards them to the upstream provider. Relays exist because some users can't access AI APIs directly (geo restrictions, payment issues), or because relays offer cheaper per-token pricing by pooling API keys. The problem: a malicious relay can inject hidden instructions, swap the model, truncate your context, or even steal your credentials from error responses.

Is it safe to enter my API Key?

api-relay-audit runs entirely on your machine. Your API Key is only sent to the relay URL you specify — the same relay you're already using. No data is sent to us, no telemetry, no analytics. The standalone version is a single Python file with zero dependencies (just stdlib + curl) — you can read every line of code before running it. This is fundamentally different from web-based tools that ask you to enter your key into a webpage.

What's the difference between this and hvoy.ai / cctest.ai?

Three tools, three approaches. hvoy.ai excels at information aggregation — its public leaderboard of 40+ relays is the fastest way to check a relay's reputation. cctest.ai offers one-click convenience and a unique channel fingerprint (protobuf signature parsing). api-relay-audit covers 13 audit dimensions (vs ~5 for hvoy, 5 for cctest), runs locally, and is fully open-source. Only api-relay-audit detects tool-call rewriting (AC-1.a), error response credential leakage (AC-2), context truncation, and Web3 injection. Pick based on your needs: quick lookup → hvoy; one-click check → cctest; full security audit → api-relay-audit. Full comparison →

What does "token injection" mean?

When you send a message to an AI API, the relay may secretly prepend a hidden system prompt. This injects extra tokens into your request — for example, 3200 tokens of instructions telling the model to behave as "Kiro" instead of Claude. You pay for these hidden tokens, and worse, they can override your own instructions. api-relay-audit detects this by comparing actual token counts against expected values: if you sent 100 tokens but the model reports using 3300, there are ~3200 tokens of injection. Similar to the "掺水率" (dilution rate) concept from hvoy.ai, but measured as an absolute token delta for precision.

What is AC-1.a tool-call rewriting?

AC-1.a (from the threat taxonomy in arXiv:2604.08407) is a supply-chain attack where the relay modifies package names in the model's response. For example, the model says pip install requests==2.31.0, but the relay changes it to a typosquatted package. If you're using Claude Code or an AI coding agent that automatically installs packages, this is a real attack vector. api-relay-audit sends pinned package-install commands and compares received text character-by-character to detect any modification.

What does "inconclusive" mean in the tri-state verdict?

Every audit step returns one of three states: clean (no anomaly detected), anomaly (issue confirmed), or inconclusive (the test couldn't determine a result — e.g., the relay blocked the probe or returned an error). This is important because "couldn't test" and "confirmed safe" are very different things. A relay that blocks all probes gets "inconclusive," not "clean." Numeric scoring systems (0-100) blur this distinction — a blocked test might score 50, which looks like "okay" when it should mean "suspicious."

Does this work with non-Claude models?

Most steps (1-9) work with any model through any relay. Step 10 (stream integrity) is Anthropic-specific because it validates Anthropic's SSE event schema, usage fields, and thinking signatures. Step 11 (Web3 injection) is model-agnostic but profile-gated. If your relay uses OpenAI format, the tool auto-detects and adapts the request format. Stream integrity will return "inconclusive" for non-Anthropic formats.

AI API Relay Security Audit

Audit Report Demo