09 - Tool Use and Function Calling
1. What is Function Calling?
What: A capability where LLMs can output structured tool invocations instead of plain text. The model decides when to call a function, what arguments to pass, and how to incorporate the result into its response.
```
┌──────┐     ┌─────┐     ┌──────────────────┐     ┌──────────────────┐
│ User │────▶│ LLM │────▶│ Tool Call (JSON) │────▶│ Execute Function │
└──────┘     └─────┘     └──────────────────┘     └──────────────────┘
    ▲           ▲                                          │
    │           └───────────── tool result ◀───────────────┘
    └── Response (uses tool result in final answer)
```

Important: The model doesn't execute functions; it outputs a structured request. Your application code executes the function and feeds results back to the model.
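Concretely, a "tool call" from the model is just structured data. In the Anthropic API, for example, it arrives as a tool-use content block shaped roughly like the dict below (the `id` value here is made up; real ids are assigned by the API):

```python
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_01A2B3C4",  # hypothetical id; the API assigns real ones
    "name": "search_products",
    "input": {"query": "wireless headphones", "category": "electronics"}
}

# Your application dispatches on "name", passes "input" as arguments,
# and returns the result in a tool_result block keyed by the same id.
```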
2. Tool Schemas (JSON Schema)
What: Tools are defined using JSON Schema, telling the model what functions are available, what parameters they accept, and what they do.
```python
# OpenAI format
tools = [{
    "type": "function",
    "function": {
        "name": "search_products",
        "description": "Search for products in the catalog by name or category",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query string"
                },
                "category": {
                    "type": "string",
                    "enum": ["electronics", "clothing", "books"],
                    "description": "Product category filter"
                },
                "max_price": {
                    "type": "number",
                    "description": "Maximum price in USD"
                }
            },
            "required": ["query"]
        }
    }
}]
```

```python
# Anthropic/Claude format
tools = [{
    "name": "search_products",
    "description": "Search for products in the catalog by name or category",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search query"},
            "category": {"type": "string", "enum": ["electronics", "clothing", "books"]},
            "max_price": {"type": "number"}
        },
        "required": ["query"]
    }
}]
```

Schema best practices:
- Write clear, specific `description` fields; the model relies on these heavily
- Use `enum` for constrained values
- Mark truly required fields in the `required` array
- Keep parameter names descriptive and conventional
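Because the schema is plain JSON Schema, you can check the model's arguments before executing a tool. The helper below is a minimal hand-rolled sketch covering just `required` and `enum` (in production you would typically reach for a full validator such as the `jsonschema` package):

```python
def validate_args(schema: dict, args: dict) -> list[str]:
    """Return a list of validation errors (empty list means valid)."""
    errors = []
    props = schema.get("properties", {})
    # Required fields must be present
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"Missing required field: {field}")
    for name, value in args.items():
        spec = props.get(name)
        if spec is None:
            errors.append(f"Unknown field: {name}")
            continue
        # Enum values must be one of the allowed options
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}")
    return errors

schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "category": {"type": "string", "enum": ["electronics", "clothing", "books"]},
    },
    "required": ["query"],
}

errors = validate_args(schema, {"query": "laptop", "category": "toys"})
# errors -> ["category must be one of ['electronics', 'clothing', 'books']"]
```

Feeding these errors back to the model as a tool result (rather than raising) lets it correct its own call.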
3. Orchestration Loop
What: The core pattern for tool-using LLMs. The application runs a loop: send messages to the model, check if it wants to call tools, execute them, and feed results back.
```python
import anthropic

client = anthropic.Anthropic()

def run_agent(user_message: str, tools: list):
    messages = [{"role": "user", "content": user_message}]
    while True:
        # Step 1: Call the model
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=4096,
            tools=tools,
            messages=messages
        )

        # Step 2: Check if model wants to use tools
        if response.stop_reason == "end_turn":
            # Model is done; extract text response
            return response.content[0].text

        if response.stop_reason == "tool_use":
            # Step 3: Execute each tool call
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(result)
                    })
            # Step 4: Feed results back and continue loop
            messages.append({"role": "assistant", "content": response.content})
            messages.append({"role": "user", "content": tool_results})

def execute_tool(name: str, args: dict):
    if name == "search_products":
        return search_products(**args)
    elif name == "get_weather":
        return get_weather(**args)
    raise ValueError(f"Unknown tool: {name}")
```

The loop pattern:
```
User message
     │
     ▼
┌──▶ LLM call
│        │
│   Stop reason?
│     ├── "end_turn" → Return text response (done)
│     └── "tool_use" → Execute tool(s)
│                          │
│   Append tool results to messages
└──────────────────────────┘
```

4. Error Handling
What: Tools can fail. The model needs to know about failures so it can retry, try alternatives, or inform the user.
```python
import json

# Return errors as tool results; don't throw
def execute_tool_safely(name: str, args: dict) -> str:
    try:
        result = execute_tool(name, args)
        return json.dumps({"success": True, "data": result})
    except ValueError as e:
        return json.dumps({"success": False, "error": f"Invalid input: {e}"})
    except TimeoutError:
        return json.dumps({"success": False, "error": "Request timed out"})
    except Exception as e:
        return json.dumps({"success": False, "error": f"Unexpected error: {e}"})
```

Anthropic API error flag:
```python
tool_results.append({
    "type": "tool_result",
    "tool_use_id": block.id,
    "content": "Error: API rate limit exceeded",
    "is_error": True  # tells the model this is an error
})
```

Best practices:
- Always return errors as tool results, never crash the loop
- Include actionable error messages ("Invalid date format, expected YYYY-MM-DD")
- Set a maximum loop iteration limit to prevent infinite tool-calling
- Log tool calls and results for debugging
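The iteration limit from the list above can be sketched as a thin wrapper around the loop. Here `call_model` is a stand-in for whatever client call your loop makes, and the response dicts are simplified placeholders, not a real API shape:

```python
def run_agent_capped(call_model, messages: list, max_iterations: int = 10):
    """Run the tool loop, but bail out after max_iterations model calls."""
    for _ in range(max_iterations):
        response = call_model(messages)
        if response["stop_reason"] == "end_turn":
            return response["text"]
        # ... execute tools here, then feed results back ...
        messages.append({"role": "user", "content": response.get("tool_results", [])})
    raise RuntimeError(f"Agent exceeded {max_iterations} iterations without finishing")
```

Raising (or returning a fallback answer) after the cap prevents a confused model from calling tools forever and burning tokens.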
5. Parallel Tool Calls
What: Some models can request multiple tool calls in a single turn. This is useful when tools are independent and can run concurrently.
```python
# Model response might contain multiple tool_use blocks:
# [
#   {"type": "text", "text": "Let me check both..."},
#   {"type": "tool_use", "name": "get_weather", "input": {"city": "NYC"}},
#   {"type": "tool_use", "name": "get_weather", "input": {"city": "London"}}
# ]
import asyncio

async def execute_tools_parallel(tool_calls):
    tasks = []
    for call in tool_calls:
        tasks.append(execute_tool_async(call.name, call.input))
    results = await asyncio.gather(*tasks, return_exceptions=True)

    tool_results = []
    for call, result in zip(tool_calls, results):
        if isinstance(result, Exception):
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": call.id,
                "content": f"Error: {result}",
                "is_error": True
            })
        else:
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": call.id,
                "content": str(result)
            })
    return tool_results
```

When models use parallel calls:
- Independent data lookups ("What's the weather in NYC and London?")
- Multiple search queries
- Batch operations
When models use sequential calls:
- Results of one call inform the next
- Conditional logic ("If X, then do Y")
- Multi-step workflows
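The `execute_tool_async` helper used above is left undefined. If your tool implementations are synchronous, one way to make them awaitable (an assumption, not the only approach) is `asyncio.to_thread`, which runs the blocking call in a worker thread so independent calls can overlap:

```python
import asyncio
import time

def slow_lookup(city: str) -> str:
    """A synchronous tool implementation (simulated I/O)."""
    time.sleep(0.1)
    return f"weather for {city}"

async def execute_tool_async(name: str, args: dict):
    # Run the blocking tool in a thread so concurrent calls don't serialize
    if name == "get_weather":
        return await asyncio.to_thread(slow_lookup, **args)
    raise ValueError(f"Unknown tool: {name}")

async def main():
    # gather() preserves the order of the awaitables it was given
    return await asyncio.gather(
        execute_tool_async("get_weather", {"city": "NYC"}),
        execute_tool_async("get_weather", {"city": "London"}),
    )

print(asyncio.run(main()))  # ['weather for NYC', 'weather for London']
```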
6. Tool Design Patterns
Granular vs coarse tools:
Too granular (anti-pattern):

- `open_database_connection()`
- `execute_query(sql)`
- `close_connection()`

Model must orchestrate low-level steps; error-prone.

Too coarse (anti-pattern):

- `do_everything(task_description)`

Model can't express nuanced requests; inflexible.

Right level:

- `search_users(query, filters)`
- `get_user_details(user_id)`
- `update_user(user_id, fields)`

Each tool does one meaningful operation.

Confirmation pattern:
```python
# For destructive operations, use a two-step pattern:

# Step 1: Preview tool
tools = [{
    "name": "preview_delete",
    "description": "Shows what would be deleted WITHOUT actually deleting",
    ...
}]

# Step 2: Confirm tool (only after preview)
tools.append({
    "name": "confirm_delete",
    "description": "Actually performs the deletion. Must call preview_delete first.",
    ...
})
```

Context window management:
- Keep tool results concise; large results consume tokens
- Summarize or truncate large datasets before returning
- Use pagination for list operations
- Return only fields the model needs
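One concrete way to apply these points is to clip every tool result before appending it to the conversation. The helper below is a sketch (the function name and limit are arbitrary, not from any library):

```python
def clip_tool_result(result: str, max_chars: int = 2000) -> str:
    """Truncate a tool result, telling the model that content was cut."""
    if len(result) <= max_chars:
        return result
    omitted = len(result) - max_chars
    return result[:max_chars] + f"\n[... truncated, {omitted} characters omitted]"
```

Stating explicitly that content was truncated matters: without the marker, the model may treat a cut-off listing as complete and reason from missing data.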
7. Comparison: OpenAI vs Anthropic vs Open Source
| Feature | OpenAI | Anthropic (Claude) | Open Source (Ollama) |
|---|---|---|---|
| Schema format | `parameters` | `input_schema` | Varies by model |
| Parallel calls | Yes | Yes | Model-dependent |
| Streaming | Tool call chunks | Tool use events | Varies |
| Forced tool use | `tool_choice: {"type": "function", "function": {"name": "X"}}` | `tool_choice: {"type": "tool", "name": "X"}` | Not standardized |
| Auto tool choice | `tool_choice: "auto"` | `tool_choice: {"type": "auto"}` | Varies |
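As a concrete illustration of the forced-tool-use row, these are the request payload shapes as of recent SDK versions (they do change between versions, so check your provider's current docs):

```python
# OpenAI: force the model to call a specific function
openai_kwargs = {
    "tool_choice": {"type": "function", "function": {"name": "search_products"}},
}

# Anthropic: force the model to call a specific tool
anthropic_kwargs = {
    "tool_choice": {"type": "tool", "name": "search_products"},
}
```

Forcing a tool is useful for extraction-style tasks where you always want structured output; for open-ended agents, `"auto"` lets the model decide when a tool is actually needed.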