10 - AI Agents and Orchestration


1. What is an AI Agent?

What: An AI agent is an LLM-powered system that can autonomously plan and execute multi-step tasks by using tools, observing results, and deciding next actions. Unlike simple tool-calling (one-shot), agents loop until the task is complete.

Simple tool use:     User β†’ LLM β†’ Tool β†’ Response (one step)

Agent:               User β†’ LLM β†’ Tool β†’ Observe β†’ Think β†’ Tool β†’ ... β†’ Response
                           └──────── autonomous loop β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key properties of agents:

  • Autonomy: Decide which tools to use and when
  • Planning: Break complex tasks into steps
  • Memory: Track progress and context across steps
  • Observation: Interpret tool results and adapt
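The autonomous loop above can be sketched in plain Python with a stubbed model. Here `mock_llm`, `lookup_weather`, and the message shapes are illustrative stand-ins, not a real provider API; the point is only the tool → observe → think cycle that repeats until the model produces a final answer.

```python
# Minimal agent loop with a stubbed model, showing the
# Tool -> Observe -> Think -> Tool cycle until the task completes.
# mock_llm and lookup_weather are illustrative stand-ins.

def lookup_weather(city: str) -> str:
    return f"Sunny in {city}"

TOOLS = {"lookup_weather": lookup_weather}

def mock_llm(messages):
    """Pretend model: requests a tool once, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "lookup_weather", "args": {"city": "Paris"}}
    observation = messages[-1]["content"]
    return {"answer": f"Weather report: {observation}"}

def agent_loop(query: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": query}]
    for _ in range(max_steps):
        decision = mock_llm(messages)
        if "answer" in decision:            # model decided it is done
            return decision["answer"]
        tool = TOOLS[decision["tool"]]      # model picked a tool
        observation = tool(**decision["args"])
        messages.append({"role": "tool", "content": observation})
    return "Gave up after max_steps."

print(agent_loop("What's the weather in Paris?"))
# -> Weather report: Sunny in Paris
```

The `max_steps` cap is the same safety valve used in real agent loops: it bounds cost and prevents an agent that never reaches `end_turn` from looping forever.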

2. ReAct (Reasoning + Acting)

What: The foundational agent pattern. The model alternates between reasoning about what to do and taking actions.

Thought: I need to find the user's order status. Let me search by email.
Action:  search_orders(email="user@example.com")
Observation: Found order #1234, status: shipped, tracking: XYZ123

Thought: I have the order. Now I need to get tracking details.
Action:  get_tracking(tracking_id="XYZ123")
Observation: Package in transit, expected delivery: March 7

Thought: I have all the information needed to respond.
Answer:  Your order #1234 has been shipped and is expected to arrive March 7.

Implementation:

python
REACT_SYSTEM_PROMPT = """You are a helpful assistant with access to tools.
For each step:
1. Think about what you need to do next
2. Use a tool if needed
3. Observe the result
4. Repeat until you can give a final answer

Always think before acting."""

def react_agent(query, tools, max_steps=10):
    messages = [{"role": "user", "content": query}]

    for step in range(max_steps):
        response = llm.create(
            system=REACT_SYSTEM_PROMPT,  # the prompt defined above
            messages=messages,
            tools=tools
        )

        # No tool call requested: the model is done
        if response.stop_reason == "end_turn":
            return response.content[0].text  # Final answer

        # Append the assistant turn, run the requested tools,
        # and feed the observations back as the next turn
        messages.append({"role": "assistant", "content": response.content})
        tool_results = execute_tools(response.content)  # your tool dispatcher
        messages.append({"role": "user", "content": tool_results})

    return "Max steps reached without completing task."

3. Plan-and-Execute

What: Instead of interleaving thinking and acting, first create a full plan, then execute steps sequentially. Better for complex multi-step tasks.

Planning phase:
  "To deploy the application, I need to:
   1. Run tests
   2. Build the Docker image
   3. Push to registry
   4. Update Kubernetes deployment
   5. Verify health check"

Execution phase:
  Step 1: run_tests() β†’ All 42 tests passed
  Step 2: build_docker(tag="v1.2.3") β†’ Image built
  Step 3: push_image("v1.2.3") β†’ Pushed to registry
  Step 4: update_k8s(image="v1.2.3") β†’ Deployment updated
  Step 5: health_check() β†’ 200 OK

  βœ“ Plan complete.
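The two phases above can be sketched as a small skeleton. `plan_with_llm` is a stand-in for a planning LLM call (here it returns a fixed plan), and `STEP_REGISTRY` maps each step name to a hypothetical executor; real step functions would run tests, build images, and so on.

```python
# Plan-and-execute sketch: plan once up front, then run steps in order.
# plan_with_llm stands in for a planning LLM call; the registry entries
# are hypothetical executors for each step.

def plan_with_llm(goal: str) -> list[str]:
    # A real implementation would prompt the model to emit a step list.
    return ["run_tests", "build_docker", "push_image"]

STEP_REGISTRY = {
    "run_tests": lambda: "All tests passed",
    "build_docker": lambda: "Image built",
    "push_image": lambda: "Pushed to registry",
}

def plan_and_execute(goal: str) -> list[str]:
    plan = plan_with_llm(goal)            # planning phase
    results = []
    for step in plan:                     # execution phase
        outcome = STEP_REGISTRY[step]()
        results.append(f"{step}: {outcome}")
        if "fail" in outcome.lower():     # stop (or re-plan) on failure
            break
    return results

for line in plan_and_execute("Deploy the application"):
    print(line)
```

Separating the phases makes failures easy to handle: on a failed step the agent can halt, retry, or hand the partial results back to the planner for a revised plan.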

When to use each:

| Pattern | Best For |
|---|---|
| ReAct | Exploratory tasks, unclear goals, simple workflows |
| Plan-and-Execute | Well-defined multi-step tasks, complex workflows |
| Hybrid | Plan first, then ReAct within each step |

4. LangChain

What: The most popular framework for building LLM applications and agents. Provides abstractions for chains, agents, tools, memory, and retrieval.

python
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.tools import tool
from langchain_core.prompts import ChatPromptTemplate

# Define tools
@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    return web_search(query)

@tool
def calculator(expression: str) -> str:
    """Evaluate a math expression."""
    return str(eval(expression))  # simplified; eval is unsafe on untrusted input

# Create agent
llm = ChatOpenAI(model="gpt-4")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}")
])

agent = create_tool_calling_agent(llm, [search_web, calculator], prompt)
executor = AgentExecutor(agent=agent, tools=[search_web, calculator])

result = executor.invoke({"input": "What's the population of France times 2?"})

LangChain components:

| Component | Purpose |
|---|---|
| LLMs/ChatModels | Model wrappers (OpenAI, Anthropic, etc.) |
| Prompts | Template management |
| Chains | Sequential LLM calls |
| Agents | Autonomous tool-using loops |
| Tools | Function interfaces |
| Memory | Conversation/session state |
| Retrievers | RAG integration |

LangGraph (newer): Graph-based agent framework from LangChain for more complex, stateful workflows with cycles and conditional branching.


5. LlamaIndex

What: Framework focused on connecting LLMs with data. Best for RAG pipelines, knowledge bases, and data-augmented applications (where LangChain is more general-purpose).

python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load and index documents
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is the refund policy?")
print(response)

Key abstractions:

| Concept | What |
|---|---|
| Documents | Raw data (PDFs, web pages, databases) |
| Nodes | Chunks of documents with metadata |
| Indices | Data structures for retrieval (vector, list, tree, keyword) |
| Query Engines | End-to-end query β†’ retrieve β†’ synthesize pipeline |
| Agents | LlamaIndex's agent framework for tool use |

LangChain vs LlamaIndex:

  • LangChain: General-purpose, agent-focused, more flexibility
  • LlamaIndex: Data/RAG-focused, more opinionated, easier for data pipelines
  • Many projects use both together

6. Memory Patterns

What: Agents need to remember context across interactions. Different memory strategies for different needs.

Conversation memory:

Short-term: Full conversation history in context window
  Pro: Perfect recall   Con: Context window limit

Sliding window: Keep last N messages
  Pro: Bounded size   Con: Loses old context

Summary memory: LLM summarizes older messages
  Pro: Compressed history   Con: Lossy, summary may miss details

Token buffer: Keep messages up to token limit, drop oldest
  Pro: Predictable size   Con: Arbitrary cutoff
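The sliding-window strategy is the simplest to implement. This is a minimal sketch assuming messages are `{"role", "content"}` dicts; a token-buffer variant would trim by a running token count instead of a message count.

```python
# Sliding-window memory sketch: keep only the last N messages.
# A token-buffer variant would trim by token count instead.
from collections import deque

class SlidingWindowMemory:
    def __init__(self, max_messages: int = 6):
        # deque with maxlen drops the oldest entries automatically
        self.buffer = deque(maxlen=max_messages)

    def add(self, role: str, content: str):
        self.buffer.append({"role": role, "content": content})

    def context(self) -> list[dict]:
        """Messages to send with the next LLM call."""
        return list(self.buffer)

memory = SlidingWindowMemory(max_messages=4)
for i in range(6):
    memory.add("user", f"message {i}")
print([m["content"] for m in memory.context()])
# -> ['message 2', 'message 3', 'message 4', 'message 5']
```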

Long-term memory:

python
# Vector-based memory: store and retrieve past interactions.
# Assumes an embed() function and a vector DB client exposing
# upsert/query methods (any vector store wrapper would do).
class VectorMemory:
    def __init__(self, vector_db):
        self.db = vector_db

    def store(self, interaction: str, metadata: dict):
        embedding = embed(interaction)
        self.db.upsert(embedding, metadata)

    def recall(self, query: str, top_k=5):
        """Return the top_k past interactions most similar to the query."""
        query_embedding = embed(query)
        return self.db.query(query_embedding, top_k=top_k)

Working memory (scratchpad):

  • Agent maintains a scratchpad of intermediate results
  • Updated after each tool call
  • Helps track multi-step task progress
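A scratchpad can be as simple as an append-only log that gets rendered into the next prompt. This is a sketch with hypothetical step names taken from the ReAct trace earlier in the section.

```python
# Working-memory scratchpad sketch: the agent records each tool call's
# result so later steps (and the final answer) can reference them.
class Scratchpad:
    def __init__(self):
        self.entries = []

    def note(self, step: str, result: str):
        self.entries.append(f"{step}: {result}")

    def render(self) -> str:
        """Formatted progress log to splice into the next prompt."""
        return "\n".join(self.entries)

pad = Scratchpad()
pad.note("search_orders", "order #1234, status: shipped")
pad.note("get_tracking", "in transit, ETA March 7")
print(pad.render())
```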

7. Multi-Agent Systems

What: Multiple specialized agents collaborating on complex tasks, each with their own tools and expertise.

┌───────────────────────────────────────┐
│          ORCHESTRATOR AGENT           │
│  (Routes tasks, manages workflow)     │
│                                       │
│  ┌──────────────┐  ┌──────────────┐   │
│  │  Research    │  │  Coding      │   │
│  │  Agent       │  │  Agent       │   │
│  │  (web search,│  │  (file edit, │   │
│  │   reading)   │  │   run code)  │   │
│  └──────────────┘  └──────────────┘   │
│                                       │
│  ┌──────────────┐  ┌──────────────┐   │
│  │  Review      │  │  Deploy      │   │
│  │  Agent       │  │  Agent       │   │
│  │  (code review│  │  (CI/CD,     │   │
│  │   testing)   │  │   infra)     │   │
│  └──────────────┘  └──────────────┘   │
└───────────────────────────────────────┘

Patterns:

| Pattern | How | Example |
|---|---|---|
| Orchestrator | Central agent delegates to specialists | Project manager routing tasks |
| Pipeline | Agents pass work sequentially | Research β†’ Write β†’ Review β†’ Publish |
| Debate | Agents argue for/against, third decides | Red team / blue team security |
| Consensus | Multiple agents propose, vote on best | Ensemble code solutions |
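The pipeline pattern is the easiest to sketch. Here each "agent" is a stand-in function (a real system would wrap an LLM call with its own tools and prompt); the work product simply flows from one stage to the next.

```python
# Pipeline pattern sketch: each "agent" is a stand-in function that
# transforms the work product and hands it to the next stage.
def research_agent(topic: str) -> str:
    return f"notes on {topic}"

def writer_agent(notes: str) -> str:
    return f"draft based on {notes}"

def reviewer_agent(draft: str) -> str:
    return f"approved: {draft}"

PIPELINE = [research_agent, writer_agent, reviewer_agent]

def run_pipeline(task: str) -> str:
    artifact = task
    for agent in PIPELINE:        # work passes sequentially between agents
        artifact = agent(artifact)
    return artifact

print(run_pipeline("LLM agents"))
# -> approved: draft based on notes on LLM agents
```

Because each stage only sees its predecessor's output, errors propagate silently down the chain; real pipelines usually add validation between stages.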

Challenges:

  • Communication overhead between agents
  • Error propagation across agent boundaries
  • Debugging multi-agent interactions
  • Cost (multiple LLM calls per step per agent)
  • Coordination and deadlock prevention
