Tool Use

Tool use is the capability that lets an LLM call external functions, APIs, databases, or services mid-response — fetching fresh data, running calculations, or taking actions in the world. Instead of being locked to its training data, a tool-using model can reach outside itself when the task demands it.

Why It Matters

A bare LLM has three hard limits: a frozen knowledge cutoff, no access to private data, and no ability to act. Tool use removes all three. With tools, the same model can answer "what's the current MRR?", send an email, book a flight, query a database, or execute code. Tool use is the foundation of every modern AI agent, every "copilot," and almost every production LLM application above the chatbot layer.

How It Works

1. Tool definitions: The caller provides the model a list of available tools, each with a name, description, and parameter schema.

2. Model decides to call: When the user's request needs a tool, the model outputs a structured tool call (JSON matching the schema) instead of regular text.

3. Runtime executes the tool: Your code receives the tool call, runs the actual function, and returns the result.

4. Result goes back to the model: The model sees the result as part of its context and continues the response — either answering the user, or calling another tool.

5. Loop until done: Multi-tool tasks chain tool calls until the model produces a final text response.
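The five steps above can be sketched as a small runnable loop. The "model" here is a stub that emits one tool call and then a final answer; in production this would be an LLM API response, and the `ToolCall` shape is illustrative, not any real SDK's type.

```python
# Runnable sketch of the tool-use loop. `stub_model` stands in for the LLM;
# ToolCall is an assumed, illustrative shape, not a real provider type.
import json
from dataclasses import dataclass

@dataclass
class ToolCall:
    name: str
    arguments: str  # JSON string, as models emit

def get_weather(city: str) -> dict:
    """Stand-in for a real weather API (step 3 would hit the network)."""
    return {"city": city, "temp_c": 18, "conditions": "cloudy"}

TOOLS = {"get_weather": get_weather}

def stub_model(messages):
    """Pretend LLM: call the tool once, then answer from its result."""
    tool_msgs = [m for m in messages if m["role"] == "tool"]
    if not tool_msgs:
        return ToolCall("get_weather", json.dumps({"city": "Paris"}))
    result = json.loads(tool_msgs[-1]["content"])
    return f"It's {result['temp_c']}C and {result['conditions']} in {result['city']}."

def run_loop(messages):
    # Step 5: keep going until the model produces final text.
    while True:
        out = stub_model(messages)
        if isinstance(out, ToolCall):                             # step 2: model decides
            result = TOOLS[out.name](**json.loads(out.arguments)) # step 3: runtime executes
            messages.append({"role": "tool",                      # step 4: result goes back
                             "content": json.dumps(result)})
        else:
            return out                                            # final text response

answer = run_loop([{"role": "user", "content": "Weather in Paris?"}])
```

The key design point is that the model never executes anything itself: it only emits structured requests, and your runtime owns the dispatch table and the loop's exit condition.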

Tool Use vs Function Calling

The two terms are mostly interchangeable, with a subtle shift:

  • Function calling: The original framing — the model outputs JSON arguments for a single named function.
  • Tool use: The broader framing — tools can be functions, APIs, computer actions, or MCP servers, and the model orchestrates many in sequence.

Anthropic uses "tool use"; OpenAI historically used "function calling" and now says "tools." Both describe the same underlying capability.

Kinds of Tools

Retrieval: Fetch documents, search the web, query a database, look up a record.

Computation: Run Python, do math, convert units, parse a file.

Action: Send an email, create a calendar event, post to Slack, update a CRM.

Code execution: A sandboxed interpreter the model can write and run code in.

Computer use: Click, type, and read the screen — the most general tool.

Model-to-model: Delegate to another specialized model (e.g., image generation).

Designing Good Tool Schemas

Clear, short descriptions: The description is how the model decides when to call the tool. Make it unambiguous.

Narrow parameter types: Prefer enums and constrained strings over free text — cuts hallucinated arguments.

Idempotent where possible: If the model might retry, a second call should not double-send the email.

Return structured results: Give the model JSON back, not free text — it reasons about structure better.

Error responses tell the model what to do: "Error: city not found, try a different spelling" is more useful than "500."
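Putting those guidelines together, here is an illustrative tool definition and handler. The exact wire format varies by provider; this JSON-Schema-style shape is typical, and the `lookup_order` tool and its fields are invented for the example.

```python
# Hypothetical tool illustrating the schema guidelines: a specific description,
# constrained parameters, structured results, and an actionable error.
lookup_order_tool = {
    "name": "lookup_order",
    "description": "Look up a customer order by ID. Use when the user asks "
                   "about order status, shipping, or refunds.",
    "input_schema": {
        "type": "object",
        "properties": {
            # Constrained string instead of free text.
            "order_id": {"type": "string", "pattern": "^ORD-[0-9]{6}$"},
            # Enum instead of free text cuts hallucinated arguments.
            "detail": {"type": "string", "enum": ["summary", "full"]},
        },
        "required": ["order_id"],
    },
}

ORDERS = {"ORD-000123": {"status": "shipped", "eta": "2024-06-01"}}

def lookup_order(order_id: str, detail: str = "summary") -> dict:
    """Idempotent read: retrying is always safe. Returns structured JSON."""
    order = ORDERS.get(order_id)
    if order is None:
        # Error tells the model what to do next, not just that it failed.
        return {"error": "order_not_found",
                "hint": "Check the ID format: ORD- followed by six digits."}
    return order if detail == "full" else {"status": order["status"]}
```

Note that the error case returns a normal structured result rather than raising: the model can read the hint, correct the argument, and retry within the loop.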

Common Mistakes

Too many tools: Past ~20–30 tools, models start picking the wrong one. Group related tools or route through a smaller selection.

Vague descriptions: "utility" doesn't tell the model when to call it. Be specific.

No error handling: Tool failures break the loop. Always return a structured error the model can react to.

Ignoring latency: Every tool call adds a round trip. Parallelize independent calls; batch where possible.
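A minimal sketch of that parallelization, assuming two independent I/O-bound tool handlers (the handlers and their 0.1 s delays are stand-ins for real network calls):

```python
# Run independent tool calls concurrently instead of serially.
# fetch_weather / fetch_news are hypothetical stand-ins for real tool handlers.
import asyncio

async def fetch_weather(city: str) -> dict:
    await asyncio.sleep(0.1)  # simulated network round trip
    return {"city": city, "temp_c": 18}

async def fetch_news(topic: str) -> dict:
    await asyncio.sleep(0.1)  # simulated network round trip
    return {"topic": topic, "headlines": ["example headline"]}

async def run_parallel_calls():
    # Neither call depends on the other's result, so total latency is
    # roughly one round trip instead of two.
    return await asyncio.gather(
        fetch_weather("Paris"),
        fetch_news("AI"),
    )

weather, news = asyncio.run(run_parallel_calls())
```

This only works when the model emits the calls in one turn and the calls are truly independent; a call whose arguments depend on an earlier result still has to wait.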

Skipping guardrails: Action-taking tools (send email, transfer money) need human-in-the-loop or strict scoping.