What is Tool Use in AI?

The primitive that turned chatbots into agents.

Tool use is the umbrella term for an AI model invoking external tools — APIs, code execution environments, file systems, web browsers, databases — to accomplish tasks beyond generating text. It encompasses function calling (the API primitive), computer use (clicking/typing in a GUI), code execution, and web browsing, and is the foundational capability that separates a chatbot from an agent.

Updated Apr 2026
In depth

Tool use is the broader concept that function calling is a specific implementation of. When people talk about an 'agent', they almost always mean 'an LLM with tool use plus a loop' — the model decides which tool to use, executes it, observes the result, and decides what to do next.

The tools an LLM can use fall into five categories:

  • APIs: HTTP calls to external services — send email, fetch weather, query a CRM, post to Slack.
  • Code execution: running Python, JavaScript, or shell commands in a sandboxed environment; useful for calculations, data analysis, and scripting.
  • File system access: reading and writing files, traversing directories — the primitive underneath coding agents like Cursor and Claude Code.
  • Web browsing: fetching web pages, clicking links, filling forms — either via API (Perplexity-style) or via actual browser automation (BrowserBase, Playwright-driven agents).
  • Computer use: the most general form — controlling a full OS, clicking pixels, typing in any app — currently offered by Anthropic's Claude computer-use API and OpenAI's Operator.

Implementation-wise, tool use relies on function calling as the transport. Every tool, regardless of what it does under the hood, is described to the model as a function with a JSON Schema. When the model wants to use the tool, it outputs a function call. Your runtime intercepts that call, executes the real tool, and returns the result. The model then continues reasoning with the new information. The loop — reason, call tool, observe, reason again — is what gives agents their iterative problem-solving ability.

Tool use matters because it collapses two major limitations of raw LLMs. First, knowledge freshness: an LLM trained in 2024 doesn't know today's stock price, but a model with a web-search tool does. Second, action-taking: an LLM alone can only output text, but an LLM with tools can actually send the email, create the task, deploy the code. This is the unlock that turned 'AI assistants' from curiosities into productive workforce members.

Quality of tool use varies dramatically across models. Claude 4.5 and GPT-5 handle complex multi-step tool-use scenarios well — they know when to use which tool, pass arguments correctly, and recover from errors. Smaller models often misuse tools: using file-write when file-read was appropriate, passing the wrong argument types, or getting stuck in retry loops. The cost pattern is important, too: tool use multiplies inference cost because each tool call is a full LLM round trip. A task with 20 tool calls costs roughly 20× a single-turn answer.

Tycoon's AI employees are tool-use agents by construction. Astra (the AI CEO) has tools for assigning tasks, querying metrics, posting messages, reading memory, and invoking specialist agents. An AI CMO has tools for publishing to Ghost, checking Google Analytics, and posting to social platforms via Composio. Every time an AI employee 'does' something rather than 'says' something, that's a tool-use call underneath.
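The function-calling transport works roughly like this. A minimal sketch, with an illustrative `get_weather` tool and simplified field names (each provider's exact wire format differs):

```python
import json

# A tool described to the model as a function with a JSON Schema.
# Field names here are illustrative, not any vendor's exact format.
get_weather_tool = {
    "name": "get_weather",
    "description": "Fetch current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def execute_tool(name, arguments):
    """Runtime side: intercept the model's function call, run the real tool."""
    if name == "get_weather":
        # A real implementation would call a weather API here.
        return {"city": arguments["city"], "temp_c": 18, "condition": "cloudy"}
    raise ValueError(f"Unknown tool: {name}")

# Pretend the model emitted this function call as structured output:
model_call = {"name": "get_weather", "arguments": {"city": "Berlin"}}
result = execute_tool(model_call["name"], model_call["arguments"])
print(json.dumps(result))
```

The result JSON goes back into the conversation as a tool result, and the model keeps reasoning from there.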

Examples

  • Claude 4.5 with the tools parameter — pass any number of custom tool definitions, including web_search, code_execution, and computer_use (Anthropic's built-ins)
  • GPT-5 with the tools parameter and the built-in code interpreter, file search, and web browsing capabilities
  • Claude Code CLI — uses Read, Write, Edit, Bash, Grep, Glob, WebSearch, WebFetch as tools; the agent loop picks the right one per step
  • Cursor Agent mode — tools for reading and editing files in your codebase, running commands, and searching the internet
  • ChatGPT 'advanced data analysis' — tool use for Python code execution on uploaded files
  • Tycoon AI employees — each role has a scoped tool set (AI CMO: Ghost, GA4, LinkedIn API; AI CTO: GitHub, deployment APIs; AI COO: Stripe, project management)
  • Composio — a catalog of 250+ pre-built tools (Slack, Notion, Salesforce, GitHub) exposed to LLMs through a unified tool-use interface

Frequently asked questions

Is tool use the same as function calling?

Tool use is the broader concept; function calling is the most common API-level implementation. Every function call is tool use, but tool use also includes higher-level primitives like computer use (controlling a full computer) and code execution (running code the model writes). In casual conversation people use them interchangeably. In docs, 'tool use' is the umbrella; 'function calling' is one specific pattern underneath.

How is tool use different from RAG?

RAG is specifically about retrieving documents and putting them in the prompt before generation. Tool use is a general pattern where the model decides to invoke any external capability — which may or may not include retrieval. You can think of RAG as a specific tool ('retrieve_relevant_docs') that happens to be so common it gets its own name. Agentic RAG, where the model uses a retrieval tool iteratively, is the bridge between the two concepts.
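Seen this way, retrieval is just one tool definition among many. A toy sketch, with a hypothetical retrieve_relevant_docs function and an in-memory corpus standing in for a real vector store:

```python
# RAG as a tool: a hypothetical retrieve_relevant_docs function the model
# could invoke like any other tool. The corpus and keyword matching are toy
# stand-ins for a real embedding-based vector store.
CORPUS = {
    "refunds": "Refunds are processed within 5 business days.",
    "shipping": "Standard shipping takes 3-7 business days.",
}

def retrieve_relevant_docs(query: str) -> list[str]:
    """Naive keyword retrieval; real systems rank by embedding similarity."""
    q = query.lower()
    return [text for keyword, text in CORPUS.items() if keyword in q]

# The runtime would feed this back to the model as a tool result,
# exactly like any other tool output.
docs = retrieve_relevant_docs("What is your refunds policy?")
print(docs)
```

In agentic RAG, the model calls this tool repeatedly with refined queries instead of receiving one fixed retrieval up front.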

What are the safety concerns with tool use?

Three main categories. (1) Action risk — a misbehaving agent with access to email, payments, or production systems can cause real damage; mitigated by sandboxed execution, dry-run modes, and human approval for high-stakes actions. (2) Prompt injection — a tool result (like a fetched web page) can contain instructions the model follows as if they came from the user; mitigated by treating tool outputs as untrusted data and never executing instructions from them. (3) Cost runaway — agents can enter loops that generate huge bills; mitigated by hard iteration limits and budget caps. Every production tool-use system needs all three layers.
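Two of those layers can be sketched directly in the agent runtime: approval gating for action risk and hard caps for cost runaway. All names and thresholds below are illustrative:

```python
# Guardrail sketch: action risk (human approval for high-stakes tools)
# and cost runaway (iteration and budget caps). Names and thresholds
# are illustrative, not any product's actual defaults.
MAX_ITERATIONS = 25
BUDGET_USD_CAP = 5.00
HIGH_STAKES = {"send_email", "issue_refund", "deploy"}

def run_guarded(next_call, approve, cost_of):
    """next_call returns the model's next tool call (or None when done);
    approve asks a human; cost_of estimates the call's inference cost."""
    spent = 0.0
    for _ in range(MAX_ITERATIONS):
        call = next_call()
        if call is None:
            return "done"
        if call["name"] in HIGH_STAKES and not approve(call):
            return "blocked: needs human approval"
        spent += cost_of(call)
        if spent > BUDGET_USD_CAP:
            return "stopped: budget cap exceeded"
    return "stopped: iteration limit reached"
```

Prompt injection has no equivalent one-liner; the mitigation is architectural: treat every tool result as untrusted data, never as instructions.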

Which tools should I give an AI agent?

The principle of least privilege applies. Give an agent the minimum tool set required for its scope. A marketing agent doesn't need file-system write access. A support agent doesn't need to issue refunds without approval. Start with read-only tools, observe how the agent uses them, and grant write/action tools one at a time as you build confidence. Tycoon implements this through role-scoped tool sets and an explicit autonomy slider — you decide exactly which tools each AI employee can use unilaterally.
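A minimal sketch of role-scoped tool sets, assuming hypothetical role and tool names (not Tycoon's actual configuration):

```python
# Least privilege via role-scoped tool sets; roles and tool names are
# hypothetical. The agent runtime checks this before executing any call.
ROLE_TOOLS = {
    "marketing": {"read_analytics", "draft_post", "publish_post"},
    "support":   {"read_tickets", "reply_ticket"},  # no refund tool
}

def is_allowed(role: str, tool: str) -> bool:
    return tool in ROLE_TOOLS.get(role, set())

print(is_allowed("marketing", "read_analytics"))  # in the marketing scope
print(is_allowed("marketing", "write_file"))      # no file-system access
```

Expanding an agent's scope then becomes an explicit, reviewable change to its entry in the table rather than an implicit capability.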

Can I build tool use without a framework?

Yes, and it's simpler than most people assume. With any modern LLM API that supports function calling, a tool-use loop is about 40 lines of code: define your tools as JSON Schemas, call the LLM, execute the tool if the response contains a tool call, append the result to the message list, and loop until the model stops requesting tools. Frameworks like LangChain and CrewAI add orchestration, memory, and multi-agent coordination on top, but for single-agent scenarios raw API calls are cleaner and more debuggable.
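A minimal sketch of that loop, with a stubbed call_llm standing in for a real provider SDK call (message shapes and field names are illustrative):

```python
import json

def call_llm(messages, tools):
    """Stand-in for a real LLM API call via a provider SDK.
    Returns either a tool call or a final text answer."""
    # Stub: request the tool once, then answer from its result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_time", "arguments": {}}}
    return {"text": "It is 12:00."}

# Tool registry: name -> callable. Real tools would hit APIs, files, etc.
TOOLS = {"get_time": lambda args: {"time": "12:00"}}

def agent_loop(user_prompt, max_steps=10):
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_steps):
        response = call_llm(messages, TOOLS)
        if "tool_call" not in response:   # no more tool calls: we're done
            return response["text"]
        call = response["tool_call"]
        result = TOOLS[call["name"]](call["arguments"])  # execute the tool
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("iteration limit reached")

print(agent_loop("What time is it?"))
```

Swapping the stub for a real SDK call and the lambda for real tool implementations is essentially the whole job; everything else is the same loop.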

Run your one-person company.

Hire your AI team in 30 seconds. Start for free.

Free to start · No credit card required · Set up in 30 seconds