Want to experiment (otherwise skip the code)? Below is the exact agent code running in a test-SaaS back-end. Interact with it on the /chat
route and try to break it yourself ( for instructions: https://github.com/sgxgsx/vulnerable_ai_agent).
|
|
THE HIDDEN PARAMETER THAT BREAKS THE RULES
While reviewing Anthropic’s SDK, I noticed several undocumented request fields.
They look like innocent extras - until you see these lines:
|
|
|
|
Because extra_json
takes precedence, we can overwrite the system
parameter for Anthropic (or messages
for OpenAI) and hijack execution flow. It’s a simple, vulnerable pattern with no obvious warning to developers who assume these are just “extra” parameters.
To exploit this vulnerability, two prerequisites must be met:
- An adversary must be able to set an arbitrary key and value—mass assignment via GET/POST parameters, headers, or cookies works well.
- To affect an end user, an attacker must also be able to set those keys on the user’s behalf—again via GET/POST parameters, redirects, or
postMessage
calls.
Impact varies widely. If attackers can set keys on behalf of end users, or if the agent exposes dangerous tools that execute code or leak data, the risk skyrockets.
For example, in the sample code which you reviewed, the agent exposes a call
function that executes shell commands. Once an attacker replaces the system prompt that forbids calling it, remote code execution is possible.
|
|
The chances of encountering this pattern are relatively low, as it’s not a common use of the API. However, I’ve come across a few codebases that let users supply custom extra keys and values. It’s only a matter of time before an application emerges that enables an attacker to exploit this behaviour from source to sink.
Real code from GitHub: Safe vs. Unsafe
I searched GitHub for projects that use extra_body
, ignoring pure CLI apps. Few people share full web-integrated agents, but these two snippets illustrate good and bad practice:
- Safe approach — AstrBot validates every key before merging.
https://github.com/AstrBotDevs/AstrBot/blob/9147cab75bacf5d7f81548d8b209faac13a77d32/astrbot/core/provider/sources/openai_source.py#L103…
- Unsafe approach — MaxKB merges everything without filtering.
https://github.com/1Panel-dev/MaxKB/blob/7ce66a7bf3ddf123a648e23db920eeaeb00cbe42/apps/setting/models_provider/base_model_provider.py#L109…
How to fix it
- Whitelist keys you accept; reject the rest.
- Strip unknown keys if you cannot whitelist.
- Treat
extra_body
as untrusted input.
Key takeaways
- Always review undocumented features :)
Timeline
- 16 Mar 2025 - Reported to Anthropic
- 16 Mar 2025 - Realized OpenAI was vulnerable too; reported to OpenAI
- Some back-and-forth with both vendors
- 21 Mar 2025 - Anthropic accepted the submission, issued a $1000 bounty and is exploring remediation.
- 02 Apr 2025 - OpenAI replied: “works as expected”…
- 28 May 2025 - Anthropic added a warning to the documentation: https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#undocumented-request-params
- 03 Jun 2025 - Full public disclosure so developers can patch their code.