Secure Tool Calling for LLMs: Function Calling Risks and Runtime Controls
The Security Problem Starts the Moment the Model Can Do Something
There is a clean line in AI security between systems that generate text and systems that can act. Tool calling crosses that line.
Once a model can:
- send email
- modify a ticket
- search a private knowledge base
- create a refund
- rotate infrastructure state
the conversation layer is no longer the only security boundary that matters. The runtime around the tool invocation becomes the real control plane.
Why Tool Calling Fails in Practice
Most implementations start from the happy path. A tool is defined, the schema is valid, the model calls it correctly, and the app executes the request.
What gets missed are the harder questions:
- should the model be allowed to call this tool at all?
- should it be allowed to call it without confirmation?
- does the tool enforce its own authorization?
- are arguments validated independently from the model output?
- is there a dry-run mode for risky actions?
If those questions are still undecided, the tool is not production ready.
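One way to force those answers is to record them per tool and refuse to register any tool that leaves a field undecided. A minimal sketch, with hypothetical type and field names:

```typescript
// Hypothetical per-tool policy: every field must be decided before rollout.
interface ToolPolicy {
  allowed: boolean;                          // may the model call this tool at all?
  requiresConfirmation: boolean;             // must a human confirm before execution?
  toolEnforcesOwnAuthz: boolean;             // does the tool check authorization itself?
  validateArgs: (args: unknown) => boolean;  // validation independent of model output
  supportsDryRun: boolean;                   // is there a preview mode for risky actions?
}

// Example: a refund tool with every question answered explicitly.
const refundPolicy: ToolPolicy = {
  allowed: true,
  requiresConfirmation: true,
  toolEnforcesOwnAuthz: true,
  validateArgs: (args) => typeof args === "object" && args !== null,
  supportsDryRun: true,
};
```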
A Tool Definition Is Not a Security Policy
Anthropic's agent guidance makes an important point: tool definitions and documentation need careful engineering because the model depends on them to act correctly. That is true, but it is only half the job.
Good tool docs reduce mistakes. They do not replace runtime enforcement.
Risk Tiers Help More Than Long Discussions
One practical way to design safe tool use is to classify tools by impact:
| Risk Tier | Example Tools | Default Control |
|---|---|---|
| Low | search docs, summarize ticket, read feature flags | allow with logging |
| Medium | draft email, create issue, update internal notes | allow with output review |
| High | transfer money, delete data, modify permissions | require confirmation or human approval |
If a team cannot agree on a tool's tier, that is usually a sign the tool should not be delegated yet.
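Tiers work best when they live in code next to the tool registry, so the default control is applied mechanically instead of remembered. A minimal sketch, with illustrative tool names:

```typescript
type RiskTier = "low" | "medium" | "high";
type Control = "allow_with_logging" | "output_review" | "human_approval";

// Hypothetical registry: every tool must be assigned a tier before it can be delegated.
const toolRiskTiers: Record<string, RiskTier | undefined> = {
  searchDocs: "low",
  summarizeTicket: "low",
  draftEmail: "medium",
  createIssue: "medium",
  refundPayment: "high",
  deleteCustomer: "high",
};

const defaultControls: Record<RiskTier, Control> = {
  low: "allow_with_logging",
  medium: "output_review",
  high: "human_approval",
};

function defaultControl(toolName: string): Control {
  const tier = toolRiskTiers[toolName];
  if (tier === undefined) {
    // An unclassified tool is a decision nobody has made yet; refuse to run it.
    throw new Error(`Tool "${toolName}" has no risk tier`);
  }
  return defaultControls[tier];
}
```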
Unsafe Pattern
```typescript
if (toolCall.name === "deleteCustomer") {
  await deleteCustomer(toolCall.arguments.customerId);
}
```
This trusts:
- the model's choice of tool
- the model's selected arguments
- the app's assumption that a valid schema equals safe intent
None of those is enough on its own.
Safer Pattern
```typescript
function requiresApproval(toolName: string) {
  return new Set(["deleteCustomer", "refundPayment", "changeRole"]).has(toolName);
}

async function executeTool(toolName: string, args: Record<string, unknown>, actorId: string) {
  validateToolArguments(toolName, args);
  await authorizeToolUse(toolName, actorId, args);
  if (requiresApproval(toolName)) {
    return { status: "pending_approval" };
  }
  return runTool(toolName, args);
}
```
This is still simple, but it treats schema validation, authorization, and approval as separate decisions.
Runtime Controls That Matter
1. Tool-Specific Authorization
Do not assume the agent's top-level identity is enough. Each tool should check whether the caller may perform that action.
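A minimal sketch of what that check can look like, filling in the authorizeToolUse step from the pattern above. The permissionStore here is hypothetical; swap in whatever authorization backend you already run.

```typescript
// Hypothetical permission backend; names are illustrative.
declare const permissionStore: {
  isAllowed(actorId: string, action: string): Promise<boolean>;
};

// Checked on every call, for this actor and this tool, not once for "the agent".
async function authorizeToolUse(toolName: string, actorId: string, args: Record<string, unknown>): Promise<void> {
  const permitted = await permissionStore.isAllowed(actorId, `tool:${toolName}`);
  if (!permitted) {
    throw new Error(`Actor ${actorId} is not authorized to call ${toolName}`);
  }
  // Resource-level checks belong here too: not just "may this actor call deleteCustomer",
  // but "may they act on the specific customerId in args".
}
```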
2. Confirmation for Destructive or External Actions
Sending data outside the organization or deleting internal state should not happen silently because the model felt confident.
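One way to enforce this is to park the call and require an explicit human step to resume it. A sketch, with hypothetical store and helper names:

```typescript
// Hypothetical stores and helpers; names are illustrative.
declare const pendingActions: {
  create(action: { toolName: string; args: Record<string, unknown>; actorId: string }): Promise<string>;
  take(approvalId: string): Promise<{ toolName: string; args: Record<string, unknown>; actorId: string }>;
};
declare function notifyApprover(notice: { approvalId: string; toolName: string }): void;
declare function runTool(toolName: string, args: Record<string, unknown>): Promise<unknown>;

// The model's call is parked; nothing external happens until a human resumes it.
async function requestApproval(toolName: string, args: Record<string, unknown>, actorId: string) {
  const approvalId = await pendingActions.create({ toolName, args, actorId });
  notifyApprover({ approvalId, toolName }); // e.g. a review-queue entry or chat notification
  return { status: "awaiting_confirmation", approvalId };
}

async function approve(approvalId: string, approverId: string) {
  const action = await pendingActions.take(approvalId); // removed from the queue so it cannot run twice
  console.log(`Tool ${action.toolName} approved by ${approverId}`); // real systems write this to the audit log
  return runTool(action.toolName, action.args);
}
```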
3. Argument Validation Outside the Model
The model can propose an argument. The application must validate it.
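For example, a schema library such as zod can validate proposed arguments before they ever reach the tool. A sketch; the refund schema and limits are illustrative:

```typescript
import { z } from "zod";

// Validation lives in the application, not in the prompt or the tool description.
const refundArgs = z.object({
  orderId: z.string().uuid(),
  amountCents: z.number().int().positive().max(50_000), // illustrative business limit
  reason: z.string().max(500),
});

function validateRefundArgs(args: unknown) {
  const result = refundArgs.safeParse(args);
  if (!result.success) {
    throw new Error(`Rejected refund arguments: ${result.error.message}`);
  }
  return result.data; // typed, validated values, independent of whatever the model emitted
}
```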
4. Dry Run Paths
The ability to preview "what would happen" is one of the easiest ways to reduce tool misuse.
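A sketch of a preview path, assuming each risky tool can report what it would change without changing it:

```typescript
// Hypothetical: risky tools expose plan() alongside execute().
interface PreviewableTool {
  plan(args: Record<string, unknown>): Promise<string[]>;   // human-readable list of intended effects
  execute(args: Record<string, unknown>): Promise<unknown>;
}

async function previewThenRun(tool: PreviewableTool, args: Record<string, unknown>, confirmed: boolean) {
  const effects = await tool.plan(args); // e.g. ["would delete customer 4821", "would cancel 2 subscriptions"]
  if (!confirmed) {
    return { status: "preview", effects }; // show the user what would happen, change nothing
  }
  return tool.execute(args);
}
```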
5. Full Audit Trails
For every tool call you want to know (a record sketch follows this list):
- what the user asked
- why the model selected the tool
- what arguments were proposed
- what was executed
- whether approval was required or bypassed
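One way to capture all of this is a single record written per invocation. A sketch, with hypothetical field names:

```typescript
// Hypothetical audit record written for every tool invocation.
interface ToolAuditRecord {
  requestId: string;
  userPrompt: string;                       // what the user asked
  modelRationale?: string;                  // why the model selected the tool, if the provider exposes it
  toolName: string;
  proposedArgs: Record<string, unknown>;    // what the model proposed
  executedArgs?: Record<string, unknown>;   // what actually ran, after validation
  approvalRequired: boolean;
  approvedBy?: string;                      // unset means no approval step happened
  outcome: "executed" | "rejected" | "pending_approval";
  timestamp: string;                        // ISO 8601
}
```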
The Hidden Problem: Over-Broad Tools
Sometimes the issue is not the runtime. It is the tool itself.
A tool like `adminAction(command: string)` is almost impossible to secure because it collapses too many decisions into one primitive (a concrete contrast is sketched after this list). The best AI tool interfaces are usually:
- narrow
- well documented
- specific to one action
- hard to use incorrectly
That is not just good developer experience. It is security design.
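To make the contrast concrete, compare an over-broad definition with a narrow one. The names and schemas below are illustrative, not any particular provider's format:

```typescript
// Over-broad: one tool, arbitrary commands, impossible to tier or validate meaningfully.
const adminAction = {
  name: "adminAction",
  description: "Run an administrative command",
  parameters: { command: { type: "string" } },
};

// Narrow: one action, specific arguments, easy to classify as high risk and validate.
const disableUserLogin = {
  name: "disableUserLogin",
  description: "Temporarily disable login for a single user account pending review",
  parameters: {
    userId: { type: "string", description: "Internal user ID" },
    reason: { type: "string", description: "Why login is being disabled" },
  },
};
```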
What to Red-Team
Try these cases:
- prompt injection causes the model to choose a higher-risk tool
- model submits valid JSON with unsafe values
- tool description is ambiguous and the model selects the wrong action
- user asks for a harmless task and the model overreaches into state-changing behavior
- approval flows are skipped because the app mishandles retries or streaming events
Tool Calling Checklist
- define tool risk tiers before rollout
- authorize every tool independently
- validate arguments outside the model output
- require approval for destructive or external actions
- prefer narrow tools over generic admin tools
- log tool selection, arguments, and execution outcomes
- include dry-run or preview paths for risky actions
Sources and Further Reading
Related Reading on SecureCodeReviews
- API Security for AI Agents: Securing MCP, Function Calling & Tool Use
- How to Secure AI Agents: Identity & Access Management for Agentic AI
- OWASP Top 10 for Agentic AI 2026: Complete Security Guide
Final Takeaway
Tool calling should be treated the way mature teams treat cloud automation: every action needs a permission model, an audit trail, and a clear escalation path. The model can suggest a tool. It should never be the only thing standing between a user prompt and a high-impact action.