10 - AI Agents and Orchestration
1. What is an AI Agent?
What: An AI agent is an LLM-powered system that can autonomously plan and execute multi-step tasks by using tools, observing results, and deciding next actions. Unlike simple tool-calling (one-shot), agents loop until the task is complete.
```
Simple tool use: User → LLM → Tool → Response (one step)
Agent:           User → LLM → Tool → Observe → Think → Tool → ... → Response
                        └───────── autonomous loop ─────────┘
```
Key properties of agents:
- Autonomy: Decide which tools to use and when
- Planning: Break complex tasks into steps
- Memory: Track progress and context across steps
- Observation: Interpret tool results and adapt
2. ReAct (Reasoning + Acting)
What: The foundational agent pattern. The model alternates between reasoning about what to do and taking actions.
Thought: I need to find the user's order status. Let me search by email.
Action: search_orders(email="user@example.com")
Observation: Found order #1234, status: shipped, tracking: XYZ123
Thought: I have the order. Now I need to get tracking details.
Action: get_tracking(tracking_id="XYZ123")
Observation: Package in transit, expected delivery: March 7
Thought: I have all the information needed to respond.
Answer: Your order #1234 has been shipped and is expected to arrive March 7.

Implementation:
```python
REACT_SYSTEM_PROMPT = """You are a helpful assistant with access to tools.
For each step:
1. Think about what you need to do next
2. Use a tool if needed
3. Observe the result
4. Repeat until you can give a final answer
Always think before acting."""

def react_agent(query, tools, max_steps=10):
    # `llm` and `execute_tools` are assumed helpers: an LLM client and a
    # function that runs the requested tool calls and formats the results.
    messages = [{"role": "user", "content": query}]
    for step in range(max_steps):
        response = llm.create(
            system=REACT_SYSTEM_PROMPT,
            messages=messages,
            tools=tools,
        )
        if response.stop_reason == "end_turn":
            return response.content[0].text  # Final answer
        # Execute tools and feed the results back for the next iteration
        messages.append({"role": "assistant", "content": response.content})
        tool_results = execute_tools(response.content)
        messages.append({"role": "user", "content": tool_results})
    return "Max steps reached without completing task."
```
3. Plan-and-Execute
What: Instead of interleaving thinking and acting, first create a full plan, then execute steps sequentially. Better for complex multi-step tasks.
Planning phase:
"To deploy the application, I need to:
1. Run tests
2. Build the Docker image
3. Push to registry
4. Update Kubernetes deployment
5. Verify health check"
Execution phase:
Step 1: run_tests() → All 42 tests passed
Step 2: build_docker(tag="v1.2.3") → Image built
Step 3: push_image("v1.2.3") → Pushed to registry
Step 4: update_k8s(image="v1.2.3") → Deployment updated
Step 5: health_check() → 200 OK
✓ Plan complete.

When to use each:
| Pattern | Best For |
|---|---|
| ReAct | Exploratory tasks, unclear goals, simple workflows |
| Plan-and-Execute | Well-defined multi-step tasks, complex workflows |
| Hybrid | Plan first, then ReAct within each step |
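The two phases above can be sketched as a single function. This is a minimal sketch, not a production implementation: `llm_call` is a hypothetical stand-in for a real LLM client, and `tools` is assumed to be a dict mapping tool names to Python functions.

```python
import json

def plan_and_execute(goal, tools, llm_call):
    # Planning phase: ask the model once, up front, for an ordered step list.
    plan = json.loads(llm_call(
        f"Goal: {goal}. Available tools: {list(tools)}. "
        'Reply with a JSON list of steps, each {"tool": ..., "args": {...}}.'
    ))
    # Execution phase: run each step sequentially, recording results.
    results = []
    for i, step in enumerate(plan, 1):
        output = tools[step["tool"]](**step["args"])
        results.append(f"Step {i}: {step['tool']} -> {output}")
    return results
```

Unlike ReAct, the model is consulted once before any action; a real implementation would re-plan when a step fails rather than continue blindly.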
4. LangChain
What: The most popular framework for building LLM applications and agents. Provides abstractions for chains, agents, tools, memory, and retrieval.
```python
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.tools import tool
from langchain_core.prompts import ChatPromptTemplate

# Define tools
@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return web_search(query)  # assumes a web_search helper

@tool
def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    return str(eval(expression))  # simplified; never eval untrusted input

# Create agent
llm = ChatOpenAI(model="gpt-4")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_tool_calling_agent(llm, [search_web, calculator], prompt)
executor = AgentExecutor(agent=agent, tools=[search_web, calculator])
result = executor.invoke({"input": "What's the population of France times 2?"})
```
LangChain components:
| Component | Purpose |
|---|---|
| LLMs/ChatModels | Model wrappers (OpenAI, Anthropic, etc.) |
| Prompts | Template management |
| Chains | Sequential LLM calls |
| Agents | Autonomous tool-using loops |
| Tools | Function interfaces |
| Memory | Conversation/session state |
| Retrievers | RAG integration |
LangGraph (newer): Graph-based agent framework from LangChain for more complex, stateful workflows with cycles and conditional branching.
5. LlamaIndex
What: Framework focused on connecting LLMs with data. Best for RAG pipelines, knowledge bases, and data-augmented applications (where LangChain is more general-purpose).
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load and index documents
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is the refund policy?")
print(response)
```
Key abstractions:
| Concept | What |
|---|---|
| Documents | Raw data (PDFs, web pages, databases) |
| Nodes | Chunks of documents with metadata |
| Indices | Data structures for retrieval (vector, list, tree, keyword) |
| Query Engines | End-to-end query → retrieve → synthesize pipeline |
| Agents | LlamaIndex's agent framework for tool use |
LangChain vs LlamaIndex:
- LangChain: General-purpose, agent-focused, more flexibility
- LlamaIndex: Data/RAG-focused, more opinionated, easier for data pipelines
- Many projects use both together
6. Memory Patterns
What: Agents need to remember context across interactions. Different memory strategies for different needs.
Conversation memory:
- Short-term: full conversation history kept in the context window. Pro: perfect recall. Con: bounded by the context window.
- Sliding window: keep only the last N messages. Pro: bounded size. Con: loses old context.
- Summary memory: an LLM summarizes older messages. Pro: compressed history. Con: lossy; the summary may miss details.
- Token buffer: keep messages up to a token limit, dropping the oldest first. Pro: predictable size. Con: arbitrary cutoff.

Long-term memory:
```python
# Vector-based memory: store and retrieve past interactions.
# `embed` is an assumed embedding helper.
class VectorMemory:
    def __init__(self, vector_db):
        self.db = vector_db

    def store(self, interaction: str, metadata: dict):
        embedding = embed(interaction)
        self.db.upsert(embedding, metadata)

    def recall(self, query: str, top_k=5):
        query_embedding = embed(query)
        return self.db.query(query_embedding, top_k=top_k)
```
Working memory (scratchpad):
- Agent maintains a scratchpad of intermediate results
- Updated after each tool call
- Helps track multi-step task progress
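The bounded conversation-memory strategies above can be sketched in a few lines. These are illustrative helpers, not from any framework, and token counts are approximated here by word count; a real system would use the model's tokenizer.

```python
def sliding_window(messages, n=4):
    """Keep only the last n messages."""
    return messages[-n:]

def token_buffer(messages, max_tokens=50):
    """Keep the newest messages that fit within the token budget,
    dropping the oldest first."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest-to-oldest
        size = len(msg["content"].split())  # crude stand-in for token count
        if total + size > max_tokens:
            break
        kept.append(msg)
        total += size
    return list(reversed(kept))  # restore chronological order
```

Both keep memory bounded; the window is simpler, while the token buffer gives a predictable prompt size regardless of message length.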
7. Multi-Agent Systems
What: Multiple specialized agents collaborating on complex tasks, each with their own tools and expertise.
```
┌─────────────────────────────────────────┐
│          ORCHESTRATOR AGENT             │
│    (Routes tasks, manages workflow)     │
│                                         │
│  ┌──────────────┐   ┌──────────────┐    │
│  │   Research   │   │    Coding    │    │
│  │    Agent     │   │    Agent     │    │
│  │ (web search, │   │ (file edit,  │    │
│  │   reading)   │   │  run code)   │    │
│  └──────────────┘   └──────────────┘    │
│                                         │
│  ┌──────────────┐   ┌──────────────┐    │
│  │    Review    │   │    Deploy    │    │
│  │    Agent     │   │    Agent     │    │
│  │ (code review,│   │   (CI/CD,    │    │
│  │   testing)   │   │    infra)    │    │
│  └──────────────┘   └──────────────┘    │
└─────────────────────────────────────────┘
```
Patterns:
| Pattern | How | Example |
|---|---|---|
| Orchestrator | Central agent delegates to specialists | Project manager routing tasks |
| Pipeline | Agents pass work sequentially | Research → Write → Review → Publish |
| Debate | Agents argue for/against, third decides | Red team / blue team security |
| Consensus | Multiple agents propose, vote on best | Ensemble code solutions |
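As an illustration of the pipeline pattern from the table, the sketch below passes work through a sequence of agents, each reduced to a plain function from text to text; real agents would be LLM-backed. The per-agent trace is one small answer to the debugging challenge noted below.

```python
def run_pipeline(agents, task):
    """Run each named agent on the previous agent's output, in order.
    Returns the final result plus a per-agent trace for debugging."""
    result, trace = task, []
    for name, agent in agents:
        result = agent(result)
        trace.append((name, result))  # record each hand-off
    return result, trace
```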
Challenges:
- Communication overhead between agents
- Error propagation across agent boundaries
- Debugging multi-agent interactions
- Cost (multiple LLM calls per step per agent)
- Coordination and deadlock prevention