Secure Tool Calling for LLMs: Function Calling Risks and Runtime Controls
The Security Problem Starts the Moment the Model Can Do Something
There is a clean line in AI security between systems that generate text and systems that can act. Tool calling crosses that line.
Once a model can:
- send email
- modify a ticket
- search a private knowledge base
- create a refund
- rotate infrastructure state
the conversation layer is no longer the only security boundary that matters. The runtime around the tool invocation becomes the real control plane.
Why Tool Calling Fails in Practice
Most implementations start from the happy path. A tool is defined, the schema is valid, the model calls it correctly, and the app executes the request.
What gets missed are the harder questions:
- should the model be allowed to call this tool at all?
- should it be allowed to call it without confirmation?
- does the tool enforce its own authorization?
- are arguments validated independently from the model output?
- is there a dry-run mode for risky actions?
If those questions are still undecided, the tool is not production ready.
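One way to force those answers is to record them per tool and refuse to register any tool that leaves a field undecided. A minimal sketch, with hypothetical type and field names:

```typescript
// Hypothetical per-tool policy: every field must be decided before rollout.
interface ToolPolicy {
  allowed: boolean;                          // may the model call this tool at all?
  requiresConfirmation: boolean;             // must a human confirm before execution?
  toolEnforcesOwnAuthz: boolean;             // does the tool check authorization itself?
  validateArgs: (args: unknown) => boolean;  // validation independent of model output
  supportsDryRun: boolean;                   // is there a preview mode for risky actions?
}

// Example: a refund tool with every question answered explicitly.
const refundPolicy: ToolPolicy = {
  allowed: true,
  requiresConfirmation: true,
  toolEnforcesOwnAuthz: true,
  validateArgs: (args) => typeof args === "object" && args !== null,
  supportsDryRun: true,
};
```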
A Tool Definition Is Not a Security Policy
Anthropic's agent guidance makes an important point: tool definitions and documentation need careful engineering because the model depends on them to act correctly. That is true, but it is only half the job.
Good tool docs reduce mistakes. They do not replace runtime enforcement.
Risk Tiers Help More Than Long Discussions
One practical way to design safe tool use is to classify tools by impact:
| Risk Tier | Example Tools | Default Control |
|---|---|---|
| Low | search docs, summarize ticket, read feature flags | allow with logging |
| Medium | draft email, create issue, update internal notes | allow with output review |
| High | transfer money, delete data, modify permissions | require confirmation or human approval |
If a team cannot agree on a tool's tier, that is usually a sign the tool should not be delegated yet.
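Tiers work best when they live in code next to the tool registry, so the default control is applied mechanically instead of remembered. A minimal sketch, with illustrative tool names:

```typescript
type RiskTier = "low" | "medium" | "high";
type Control = "allow_with_logging" | "output_review" | "human_approval";

// Hypothetical registry: every tool must be assigned a tier before it can be delegated.
const toolRiskTiers: Record<string, RiskTier | undefined> = {
  searchDocs: "low",
  summarizeTicket: "low",
  draftEmail: "medium",
  createIssue: "medium",
  refundPayment: "high",
  deleteCustomer: "high",
};

const defaultControls: Record<RiskTier, Control> = {
  low: "allow_with_logging",
  medium: "output_review",
  high: "human_approval",
};

function defaultControl(toolName: string): Control {
  const tier = toolRiskTiers[toolName];
  if (tier === undefined) {
    // An unclassified tool is a decision nobody has made yet; refuse to run it.
    throw new Error(`Tool "${toolName}" has no risk tier`);
  }
  return defaultControls[tier];
}
```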
Unsafe Pattern
```typescript
if (toolCall.name === "deleteCustomer") {
  await deleteCustomer(toolCall.arguments.customerId);
}
```
This trusts:
- the model's choice of tool
- the model's selected arguments
- the app's assumption that a valid schema equals safe intent
None of those is enough on its own.
Safer Pattern
```typescript
function requiresApproval(toolName: string) {
  return new Set(["deleteCustomer", "refundPayment", "changeRole"]).has(toolName);
}

async function executeTool(toolName: string, args: Record<string, unknown>, actorId: string) {
  validateToolArguments(toolName, args);
  await authorizeToolUse(toolName, actorId, args);
  if (requiresApproval(toolName)) {
    return { status: "pending_approval" };
  }
  return runTool(toolName, args);
}
```
This is still simple, but it treats schema validation, authorization, and approval as separate decisions.
Runtime Controls That Matter
1. Tool-Specific Authorization
Do not assume the agent's top-level identity is enough. Each tool should check whether the caller may perform that action.
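A minimal sketch of what that check can look like, filling in the authorizeToolUse step from the pattern above. The permissionStore here is hypothetical; swap in whatever authorization backend you already run.

```typescript
// Hypothetical permission backend; names are illustrative.
declare const permissionStore: {
  isAllowed(actorId: string, action: string): Promise<boolean>;
};

// Checked on every call, for this actor and this tool, not once for "the agent".
async function authorizeToolUse(toolName: string, actorId: string, args: Record<string, unknown>): Promise<void> {
  const permitted = await permissionStore.isAllowed(actorId, `tool:${toolName}`);
  if (!permitted) {
    throw new Error(`Actor ${actorId} is not authorized to call ${toolName}`);
  }
  // Resource-level checks belong here too: not just "may this actor call deleteCustomer",
  // but "may they act on the specific customerId in args".
}
```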
2. Confirmation for Destructive or External Actions
Sending data outside the organization or deleting internal state should not happen silently because the model felt confident.
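One way to enforce this is to park the call and require an explicit human step to resume it. A sketch, with hypothetical store and helper names:

```typescript
// Hypothetical stores and helpers; names are illustrative.
declare const pendingActions: {
  create(action: { toolName: string; args: Record<string, unknown>; actorId: string }): Promise<string>;
  take(approvalId: string): Promise<{ toolName: string; args: Record<string, unknown>; actorId: string }>;
};
declare function notifyApprover(notice: { approvalId: string; toolName: string }): void;
declare function runTool(toolName: string, args: Record<string, unknown>): Promise<unknown>;

// The model's call is parked; nothing external happens until a human resumes it.
async function requestApproval(toolName: string, args: Record<string, unknown>, actorId: string) {
  const approvalId = await pendingActions.create({ toolName, args, actorId });
  notifyApprover({ approvalId, toolName }); // e.g. a review-queue entry or chat notification
  return { status: "awaiting_confirmation", approvalId };
}

async function approve(approvalId: string, approverId: string) {
  const action = await pendingActions.take(approvalId); // removed from the queue so it cannot run twice
  console.log(`Tool ${action.toolName} approved by ${approverId}`); // real systems write this to the audit log
  return runTool(action.toolName, action.args);
}
```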
3. Argument Validation Outside the Model
The model can propose an argument. The application must validate it.
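For example, a schema library such as zod can validate proposed arguments before they ever reach the tool. A sketch; the refund schema and limits are illustrative:

```typescript
import { z } from "zod";

// Validation lives in the application, not in the prompt or the tool description.
const refundArgs = z.object({
  orderId: z.string().uuid(),
  amountCents: z.number().int().positive().max(50_000), // illustrative business limit
  reason: z.string().max(500),
});

function validateRefundArgs(args: unknown) {
  const result = refundArgs.safeParse(args);
  if (!result.success) {
    throw new Error(`Rejected refund arguments: ${result.error.message}`);
  }
  return result.data; // typed, validated values, independent of whatever the model emitted
}
```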
4. Dry Run Paths
The ability to preview "what would happen" is one of the easiest ways to reduce tool misuse.
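A sketch of a preview path, assuming each risky tool can report what it would change without changing it:

```typescript
// Hypothetical: risky tools expose plan() alongside execute().
interface PreviewableTool {
  plan(args: Record<string, unknown>): Promise<string[]>;   // human-readable list of intended effects
  execute(args: Record<string, unknown>): Promise<unknown>;
}

async function previewThenRun(tool: PreviewableTool, args: Record<string, unknown>, confirmed: boolean) {
  const effects = await tool.plan(args); // e.g. ["would delete customer 4821", "would cancel 2 subscriptions"]
  if (!confirmed) {
    return { status: "preview", effects }; // show the user what would happen, change nothing
  }
  return tool.execute(args);
}
```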
5. Full Audit Trails
For every tool call you want to know (a record sketch follows this list):
- what the user asked
- why the model selected the tool
- what arguments were proposed
- what was executed
- whether approval was required or bypassed
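One way to capture all of this is a single record written per invocation. A sketch, with hypothetical field names:

```typescript
// Hypothetical audit record written for every tool invocation.
interface ToolAuditRecord {
  requestId: string;
  userPrompt: string;                       // what the user asked
  modelRationale?: string;                  // why the model selected the tool, if the provider exposes it
  toolName: string;
  proposedArgs: Record<string, unknown>;    // what the model proposed
  executedArgs?: Record<string, unknown>;   // what actually ran, after validation
  approvalRequired: boolean;
  approvedBy?: string;                      // unset means no approval step happened
  outcome: "executed" | "rejected" | "pending_approval";
  timestamp: string;                        // ISO 8601
}
```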
The Hidden Problem: Over-Broad Tools
Sometimes the issue is not the runtime. It is the tool itself.
A tool like `adminAction(command: string)` is almost impossible to secure because it collapses too many decisions into one primitive (a concrete contrast is sketched after this list). The best AI tool interfaces are usually:
- narrow
- well documented
- specific to one action
- hard to use incorrectly
That is not just good developer experience. It is security design.
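To make the contrast concrete, compare an over-broad definition with a narrow one. The names and schemas below are illustrative, not any particular provider's format:

```typescript
// Over-broad: one tool, arbitrary commands, impossible to tier or validate meaningfully.
const adminAction = {
  name: "adminAction",
  description: "Run an administrative command",
  parameters: { command: { type: "string" } },
};

// Narrow: one action, specific arguments, easy to classify as high risk and validate.
const disableUserLogin = {
  name: "disableUserLogin",
  description: "Temporarily disable login for a single user account pending review",
  parameters: {
    userId: { type: "string", description: "Internal user ID" },
    reason: { type: "string", description: "Why login is being disabled" },
  },
};
```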
What to Red-Team
Try these cases:
- prompt injection causes the model to choose a higher-risk tool
- model submits valid JSON with unsafe values
- tool description is ambiguous and the model selects the wrong action
- user asks for a harmless task and the model overreaches into state-changing behavior
- approval flows are skipped because the app mishandles retries or streaming events
Tool Calling Checklist
- define tool risk tiers before rollout
- authorize every tool independently
- validate arguments outside the model output
- require approval for destructive or external actions
- prefer narrow tools over generic admin tools
- log tool selection, arguments, and execution outcomes
- include dry-run or preview paths for risky actions
Sources and Further Reading
Related Reading on SecureCodeReviews
- API Security for AI Agents: Securing MCP, Function Calling & Tool Use
- How to Secure AI Agents: Identity & Access Management for Agentic AI
- OWASP Top 10 for Agentic AI 2026: Complete Security Guide
Final Takeaway
Tool calling should be treated the way mature teams treat cloud automation: every action needs a permission model, an audit trail, and a clear escalation path. The model can suggest a tool. It should never be the only thing standing between a user prompt and a high-impact action.