How to Create Agents using OpenAI Agent SDK

Now that you understand the basics of the OpenAI Agent SDK, let’s dive deep into creating sophisticated agents with advanced features and configurations.

Advanced Agent Configuration

Model Selection and Parameters

You can customize which model your agent uses and how it behaves:

import os

from agents import Agent, Runner
from openai import OpenAI

# Configure the OpenAI client.
# Read the key from the environment rather than hard-coding it: embedding
# secrets in source is unsafe even in example code.
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

agent = Agent(
    name="AdvancedAssistant",
    instructions="You are an expert assistant with advanced reasoning capabilities",
    model="gpt-4o",       # Specify model
    temperature=0.7,      # Control creativity
    max_tokens=1000,      # Limit response length
    client=client,        # Custom client configuration
)

Custom Instructions and Personality

Create agents with specific personalities and expertise:

# The multi-line prompt doubles as the agent's system instructions; keep it
# in a named constant so the Agent() call stays readable.
_REVIEWER_INSTRUCTIONS = """You are a senior software engineer specializing in code reviews.
Your responsibilities:
- Review code for bugs, security issues, and best practices
- Suggest improvements and optimizations
- Explain your reasoning clearly
- Be constructive and educational in your feedback
Always structure your reviews with:
1. Overall assessment
2. Specific issues found
3. Suggestions for improvement
4. Positive aspects of the code"""

code_reviewer = Agent(
    name="CodeReviewer",
    instructions=_REVIEWER_INSTRUCTIONS,
    model="gpt-4o",
)

Working with Sessions

Sessions enable persistent conversations and context management:

from agents import Agent, Runner, Session

# A single Session object threads conversational context through every call
# that receives it.
session = Session()

tutor_agent = Agent(
    name="MathTutor",
    instructions="You are a patient math tutor. Build on previous conversations."
)

# First interaction
result1 = Runner.run_sync(
    tutor_agent,
    "I'm struggling with quadratic equations",
    session=session,
)

# Follow-up: the agent sees the earlier exchange through the shared session.
result2 = Runner.run_sync(
    tutor_agent,
    "Can you give me a practice problem?",
    session=session,
)

# Inspect the accumulated conversation history.
print(f"Messages in session: {len(session.messages)}")
for message in session.messages:
    print(f"{message.role}: {message.content}")

Session Management Best Practices

from agents import Session
import json
# Save session for later use
def save_session(session: "Session", filename: str) -> None:
    """Save session to a JSON file for persistence.

    Only role/content pairs are written; any SDK-internal message metadata
    is intentionally dropped.
    """
    session_data = {
        'messages': [
            {'role': msg.role, 'content': msg.content}
            for msg in session.messages
        ]
    }
    # Explicit encoding so the file round-trips identically on all platforms.
    with open(filename, 'w', encoding='utf-8') as f:
        json.dump(session_data, f)


def load_session(filename: str) -> "Session":
    """Load a session previously written by save_session().

    NOTE(review): rebuilding messages is SDK-version dependent and left
    unimplemented here, so the returned Session is empty. TODO: fill in
    once the target SDK's message-append API is confirmed.
    """
    with open(filename, 'r', encoding='utf-8') as f:
        session_data = json.load(f)
    session = Session()
    # Reconstruct session messages
    for msg_data in session_data['messages']:
        # Add messages back to session
        pass  # Implementation depends on SDK version
    return session

Advanced Tool Creation

Tools with Complex Parameters

Create tools that handle complex data structures:

from typing import List, Dict, Optional
from pydantic import BaseModel
class TaskItem(BaseModel):
    """A single to-do item; pydantic validates field types on construction."""

    title: str
    description: str
    priority: str  # expected values: "high" / "medium" / "low"
    due_date: Optional[str] = None  # optional date string, e.g. "2024-01-31"
def create_task_list(tasks: "List[TaskItem]") -> str:
    """Create a formatted, numbered task list from task items.

    Each entry shows title and priority, an optional due date, and the
    description on the following line; entries are separated by blank lines.
    Returns "No tasks provided." for an empty input.
    """
    if not tasks:
        return "No tasks provided."
    formatted_tasks = []
    # enumerate(..., 1) gives human-friendly 1-based numbering.
    for i, task in enumerate(tasks, 1):
        task_str = f"{i}. {task.title} ({task.priority} priority)"
        if task.due_date:
            task_str += f" - Due: {task.due_date}"
        task_str += f"\n {task.description}"
        formatted_tasks.append(task_str)
    return "\n\n".join(formatted_tasks)
def analyze_task_workload(tasks: "List[TaskItem]") -> "Dict[str, int]":
    """Analyze workload distribution by priority (case-insensitive).

    The known levels ("high"/"medium"/"low") are always present in the
    result. Bug fix: an unexpected priority value now gets its own bucket
    instead of raising KeyError as the original did.
    """
    priority_counts = {"high": 0, "medium": 0, "low": 0}
    for task in tasks:
        key = task.priority.lower()
        priority_counts[key] = priority_counts.get(key, 0) + 1
    return priority_counts
# Register both helpers as callable tools; the SDK derives their schemas
# from the function signatures and docstrings.
task_manager = Agent(
    name="TaskManager",
    instructions="""You help users manage their tasks effectively.
Use the tools to create formatted task lists and analyze workloads.""",
    tools=[create_task_list, analyze_task_workload],
)

Error Handling in Tools

Implement robust error handling in your tools:

import requests
from typing import Optional
def fetch_api_data(url: str, timeout: int = 10) -> str:
    """Fetch JSON data from *url*, returning a human-readable status string.

    Never raises: every failure mode is translated into an "Error: ..."
    message so a calling agent can relay it to the user verbatim.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx into HTTPError below
        return f"Successfully fetched data: {response.json()}"
    except requests.exceptions.Timeout:
        return "Error: Request timed out. Please try again later."
    except requests.exceptions.ConnectionError:
        return "Error: Unable to connect to the API. Check your internet connection."
    except requests.exceptions.HTTPError as e:
        # e.response is set whenever raise_for_status() raised.
        return f"Error: HTTP {e.response.status_code} - {e.response.reason}"
    except Exception as e:
        # Last-resort catch so the tool always returns a string.
        return f"Error: Unexpected error occurred - {str(e)}"
# Expose the resilient fetcher as this agent's only tool.
api_agent = Agent(
    name="APIAgent",
    instructions="Help users fetch and analyze API data safely.",
    tools=[fetch_api_data],
)

Streaming Responses

For real-time applications, use streaming to get responses as they’re generated:

from agents import Agent, Runner


def stream_agent_response():
    """Print an agent's reply incrementally as response chunks arrive."""
    agent = Agent(
        name="StreamingAssistant",
        instructions="Provide detailed, step-by-step explanations"
    )
    # Stream the response chunk by chunk instead of waiting for completion.
    for chunk in Runner.run_stream(
        agent,
        "Explain how machine learning works in simple terms"
    ):
        if chunk.content:
            # flush=True pushes partial output to the terminal immediately.
            print(chunk.content, end='', flush=True)
        # Handle tool calls in streaming: surface invocations as they happen.
        if chunk.tool_calls:
            print(f"\n[Tool called: {chunk.tool_calls[0].function.name}]")


# Run streaming example
stream_agent_response()

Streaming with Session Management

from agents import Session


def streaming_conversation():
    """Interactive REPL: stream each reply while a session keeps context."""
    session = Session()
    agent = Agent(
        name="ConversationBot",
        instructions="Engage in natural conversation, building on context"
    )
    while True:
        user_input = input("\nYou: ")
        # 'quit'/'exit' (any casing) ends the loop.
        if user_input.lower() in ['quit', 'exit']:
            break
        print("Bot: ", end='')
        for chunk in Runner.run_stream(agent, user_input, session=session):
            if chunk.content:
                print(chunk.content, end='', flush=True)
        print()  # New line after response


# Start interactive conversation
# streaming_conversation()

Tracing and Debugging

Enable comprehensive tracing for debugging and monitoring:

from agents import Agent, Runner
from agents.tracing import enable_tracing

# Enable tracing
enable_tracing()


# Bug fix: the original referenced analyze_code / suggest_fixes without ever
# defining them (NameError). Minimal stub tools make the snippet runnable.
def analyze_code(code: str) -> str:
    """Placeholder tool: report problems found in *code*."""
    return f"Analysis of: {code}"


def suggest_fixes(code: str) -> str:
    """Placeholder tool: propose fixes for *code*."""
    return f"Suggested fixes for: {code}"


debug_agent = Agent(
    name="DebugAgent",
    instructions="Help debug code issues",
    tools=[analyze_code, suggest_fixes],
)

# Run with tracing enabled
result = Runner.run_sync(
    debug_agent,
    "This Python code has a bug: print('Hello World')"
)

# Tracing information is automatically captured
print(f"Trace ID: {result.trace_id}")
print(f"Total tokens used: {result.usage.total_tokens}")
print(f"Tools called: {len(result.tool_calls)}")

Custom Tracing Handlers

from agents.tracing import TraceHandler


class CustomTraceHandler(TraceHandler):
    """Trace hooks that echo agent lifecycle events to stdout."""

    def on_agent_start(self, agent_name: str, input_text: str):
        # Truncate long prompts so log lines stay readable.
        print(f"🤖 Agent {agent_name} starting with: {input_text[:50]}...")

    def on_tool_call(self, tool_name: str, arguments: dict):
        print(f"🔧 Calling tool: {tool_name} with {arguments}")

    def on_agent_complete(self, agent_name: str, output: str):
        print(f"✅ Agent {agent_name} completed")


# Use custom tracing
custom_handler = CustomTraceHandler()
enable_tracing(handler=custom_handler)

Guardrails Implementation

Implement sophisticated guardrails for safety and quality:

from typing import List
import re
def content_safety_check(message: str) -> bool:
    """Return False if *message* matches any sensitive-content pattern.

    Patterns cover credential words, exploit language, and personal data
    markers; matching is case-insensitive via a single lower() pass.
    """
    unsafe_patterns = [
        r'\b(password|secret|api[_-]?key)\b',
        r'\b(hack|exploit|vulnerability)\b',
        r'\b(personal[_-]?info|ssn|credit[_-]?card)\b'
    ]
    # Lowercase once so the patterns only need lowercase alternatives.
    message_lower = message.lower()
    for pattern in unsafe_patterns:
        if re.search(pattern, message_lower):
            return False
    return True
def response_quality_check(response: str) -> bool:
    """Ensure a response meets minimal quality standards.

    Rejects responses under 10 non-whitespace-padded characters and
    responses containing more than five question marks.
    """
    if len(response.strip()) < 10:
        return False
    if response.count('?') > 5:  # Too many questions
        return False
    return True
def input_validation(user_input: str) -> bool:
    """Validate user input before processing.

    Rejects inputs over 5000 characters and inputs that are empty or
    whitespace-only.
    """
    if len(user_input) > 5000:  # Too long
        return False
    if not user_input.strip():  # Empty input
        return False
    return True
# All three checks run as guardrails around each exchange.
secure_agent = Agent(
    name="SecureAssistant",
    instructions="Provide helpful assistance while maintaining security",
    guardrails=[content_safety_check, response_quality_check, input_validation],
)

Context Management

Handle large contexts and memory efficiently:

from agents import Agent, Runner, Session
class ContextManager:
    """Caps a session's message count so long chats stay under token limits."""

    def __init__(self, max_messages: int = 20):
        # Number of non-system messages to retain after a trim.
        self.max_messages = max_messages

    def trim_session(self, session: "Session") -> "Session":
        """Keep system messages plus the most recent non-system messages.

        Bug fix vs. the original: system messages are excluded from the
        "recent" slice, so a system message falling inside the last-N
        window is no longer kept twice.
        """
        if len(session.messages) > self.max_messages:
            system_msgs = [msg for msg in session.messages if msg.role == 'system']
            other_msgs = [msg for msg in session.messages if msg.role != 'system']
            session.messages = system_msgs + other_msgs[-self.max_messages:]
        return session
# Use context management: trim the window before every turn so the session
# never grows past the configured cap.
context_manager = ContextManager(max_messages=15)
session = Session()

long_conversation_agent = Agent(
    name="LongConversationBot",
    instructions="Maintain context in long conversations",
)

for i in range(100):  # Simulate long conversation
    user_input = f"Message {i}: Tell me something interesting"
    session = context_manager.trim_session(session)
    result = Runner.run_sync(
        long_conversation_agent,
        user_input,
        session=session,
    )

Production Best Practices

Configuration Management

import os
from dataclasses import dataclass
from typing import Optional
@dataclass
class AgentConfig:
    """Runtime configuration for an agent, overridable via environment vars."""

    model: str = "gpt-4o"
    temperature: float = 0.7
    max_tokens: int = 1000
    timeout: int = 30        # seconds per request
    max_retries: int = 3
    api_key: Optional[str] = None  # stays None when OPENAI_API_KEY is unset

    @classmethod
    def from_env(cls) -> 'AgentConfig':
        """Build a config from AGENT_* / OPENAI_API_KEY env vars with defaults."""
        return cls(
            model=os.getenv('AGENT_MODEL', 'gpt-4o'),
            temperature=float(os.getenv('AGENT_TEMPERATURE', '0.7')),
            max_tokens=int(os.getenv('AGENT_MAX_TOKENS', '1000')),
            timeout=int(os.getenv('AGENT_TIMEOUT', '30')),
            max_retries=int(os.getenv('AGENT_MAX_RETRIES', '3')),
            api_key=os.getenv('OPENAI_API_KEY')
        )
# Use configuration: read settings from the environment once, then fan them
# out into the agent's keyword arguments.
config = AgentConfig.from_env()

production_agent = Agent(
    name="ProductionAgent",
    instructions="Production-ready assistant",
    model=config.model,
    temperature=config.temperature,
    max_tokens=config.max_tokens,
)

Monitoring and Logging

import logging
from datetime import datetime
from agents import Agent, Runner
# Set up logging
# Root-logger configuration, done once at import time: INFO level with
# timestamped "time - logger name - level - message" lines.
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
# Module-scoped logger named after this module, per stdlib convention.
logger = logging.getLogger(__name__)
class MonitoredAgent:
    """Wraps an Agent with request counting, timing, and error logging."""

    def __init__(self, agent: "Agent"):
        self.agent = agent
        self.request_count = 0  # total run() invocations
        self.error_count = 0    # run() invocations that raised

    def run(self, message: str, session=None):
        """Run the wrapped agent, logging duration; re-raises any failure."""
        self.request_count += 1
        start_time = datetime.now()
        try:
            logger.info(f"Processing request {self.request_count}: {message[:50]}...")
            result = Runner.run_sync(self.agent, message, session=session)
            duration = (datetime.now() - start_time).total_seconds()
            logger.info(f"Request completed in {duration:.2f}s")
            return result
        except Exception as e:
            self.error_count += 1
            logger.error(f"Request failed: {str(e)}")
            raise  # propagate after recording the failure

    def get_stats(self):
        """Return request/error totals; max(..., 1) avoids division by zero."""
        return {
            'total_requests': self.request_count,
            'total_errors': self.error_count,
            'error_rate': self.error_count / max(self.request_count, 1)
        }
# Use monitored agent: wrap a plain agent so each request is counted/timed.
base_agent = Agent(name="MonitoredBot", instructions="Be helpful")
monitored_agent = MonitoredAgent(base_agent)

Testing Your Agents

import unittest
from agents import Agent, Runner
class TestAgentBehavior(unittest.TestCase):
    """Smoke tests for basic agent responses and tool dispatch."""

    def setUp(self):
        self.agent = Agent(
            name="TestAgent",
            instructions="You are a helpful test assistant"
        )

    def test_basic_response(self):
        result = Runner.run_sync(self.agent, "Say hello")
        # Case-insensitive check: model output casing is not guaranteed.
        self.assertIn("hello", result.final_output.lower())

    def test_tool_usage(self):
        def test_tool(value: int) -> int:
            return value * 2

        agent_with_tool = Agent(
            name="ToolAgent",
            instructions="Use the test tool when needed",
            tools=[test_tool]
        )
        result = Runner.run_sync(agent_with_tool, "Double the number 5")
        self.assertEqual(len(result.tool_calls), 1)
        self.assertEqual(result.tool_calls[0].function.name, "test_tool")


# Run tests
if __name__ == '__main__':
    unittest.main()

Next Steps

You now have the knowledge to create sophisticated agents with advanced features. In the final part of this series, we’ll put everything together to build a real-world multi-agent system for creating AI courses.

Key Takeaways

  • Advanced configuration allows fine-tuning agent behavior and model selection
  • Sessions enable persistent conversations with automatic context management
  • Streaming provides real-time responses for interactive applications
  • Guardrails ensure safety and quality in production environments
  • Proper monitoring and testing are essential for production deployment
  • Context management prevents token limit issues in long conversations

Ready to see these concepts applied in a complex real-world scenario? Let’s move on to Part 3!

Share Feedback