You’ve probably noticed by now that we have a lot of agentic Python frameworks. To list a few: n8n, Pydantic AI, CrewAI, the OpenAI Agents SDK, Microsoft’s AutoGen, LangChain, LangGraph, Google’s Agent Development Kit (ADK), and Amazon’s Strands, among others. Having played with a few of them, I can assure you they are all good frameworks; picking one often comes down to personal taste, the framework’s fan base in the open-source community, and, on occasion, some key feature one of them does better. Let’s dig into the OpenAI Agents SDK in this blog.
What makes something “Agentic”?
There are probably many different “definitions” out there, but I will try to explain as clearly as I can. When I first heard the term “Agentic,” I was skeptical. It sounded like marketing speak (probably still is). However, after spending some time with these frameworks and hearing experts discuss them, I now have a better appreciation of what it is.
You can use any of the many frameworks to create a deterministic single- or multi-agent workflow. In this mode, we explicitly wire the agents together and orchestrate the entire process. This is perfectly valid and is probably what most teams are doing today. The key here is that the agent workflow is not given full autonomy to reason and wire itself up as it sees fit. At the other end of the spectrum, an agentic system can reason through problems, make plans, and execute them using tools, all without human hand-holding. We simply throw the different agents, the tools, and pointers to any MCP servers into a box and give the program a goal. It then decides how to do the work, and in what order, until it reaches the goal.
Let’s say we set the goal for an AI travel Agent, “Book me a 5-night vacation to NYC in December 2025 for 2 adults. Include a weekend in the plan. I need flights from SFO to NYC and prefer non-stop flights on United Airlines. My total budget per person is $1500. I want to stay within a 15-minute walk of Times Square and do not need a rental car. Would like you to book me a 4 hr max walking tour of NYC for any of the days when the weather is good (except the last day).”
We are used to doing this type of travel planning ourselves or sharing this with a human travel agent to book the trip. An agentic system will take this as the goal, search travel sites, read multiple sources for city tourist information, search for flights, search for hotel rooms, check weather conditions, and then synthesize the information and actually book the entire trip for us. It’s the difference between having a conversation and having a capable assistant that does it all.
Understanding the OpenAI Agent SDK
Let’s dig into some examples with the OpenAI Agents SDK. For the purposes of this blog, you need to be aware of the following pillars:
Agents are your AI workers. Think of them as experts with specific skills and access to certain tools. You give them instructions, and they figure out how to accomplish tasks.
Tools are the functions your agents can call. These are your bridges to the outside world, such as APIs, databases, file systems, whatever you need. The SDK makes it incredibly easy to turn any Python function into a tool that your agent can use.
Guardrails can be attached to either the input or the output of LLMs to ensure alignment with the moderation policies you may wish to enforce.
Runners are the execution engines. They handle all the messy details of running your agent, including calling tools, managing conversation flow, and addressing errors. You just tell them what to do, and they handle the rest.
Hello World with the OpenAI Agents SDK
Setting Up Your Environment
First things first, you’ll need UV if you don’t already have it. UV is a fast Python package manager that makes dependency management a breeze:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
Next, you’ll need to set up your environment variables. Copy the .env.example file to a new .env file and fill in the values. You will need to create an OpenAI API account and a Pushover account at https://pushover.net/.
```
# OpenAI API Configuration
OPENAI_API_KEY=your_openai_api_key_here

# Pushover Notification Settings (for receiving alerts when users interact)
PUSHOVER_TOKEN=your_pushover_app_token_here
PUSHOVER_USER=your_pushover_user_key_here

# RSS Feed URLs (optional - if not set, default example feeds will be used)
BLOG_RSS_URL=https://yourblog.com/feed/
PODCAST_RSS_URL=https://yourpodcast.com/rss
```
Install everything:
```bash
uv sync
```
Hello World Setup
Here’s what this looks like in practice:
```python
from agents import Agent, function_tool, Runner
import asyncio
from datetime import datetime


# Define a tool
@function_tool
def getcurrentdatetime() -> dict:
    """Returns the current date and time in human readable format"""
    now = datetime.now()
    date_time = now.strftime("%A, %B %d, %Y at %I:%M:%S %p")
    print("inside tool")
    return {"date_time": date_time}


# Create an agent
agent = Agent(
    name="My Agent",
    instructions="You are a helpful assistant",
    tools=[getcurrentdatetime],
    model="gpt-4o-mini",
)


# Run the agent
async def main():
    result = await Runner.run(agent, "Who are you? And what is the current date and time?")
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```
That’s it. No complex configuration, no boilerplate, no magic. Just define your tools, create your agent, and run it.
Building an AI-powered Chat Agent
Let’s build a slightly more complex example with a personal assistant that can answer questions about your professional background, retrieve your latest blog posts or podcast episodes, and even handle user interactions via a browser chat interface (using Gradio). It’s a good example of how the SDK handles more complexity without getting in your way. The chat interface can:
- Answer questions about your background and experience intelligently
- Discuss your latest blog posts and podcast episodes
- Recognize when it doesn’t know something and escalate to you
- Capture leads by collecting visitor contact information
- Notify you in real time when interesting conversations happen
Architecture Overview
The application is built around three core components:
1. The Main Agent (Chat Agent)
At the heart of the system is a conversational agent powered by OpenAI’s GPT-4o-mini model. This agent personalizes responses with context about you through a carefully constructed system prompt that includes a summarized version of your resume (resume.pdf in the root folder).
The agent is initialized with a clear set of instructions: act as you, stay professional, answer questions about your background, and use specific tools when needed. When someone asks about your experience at a particular company or your technical skills, the agent draws from this embedded context to provide accurate, personalized responses.
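To make this concrete, here is a sketch of how such a system prompt might be assembled. The function name `build_system_prompt` and the template wording are my own assumptions for illustration, not the repo’s actual code:

```python
# Hypothetical sketch of assembling the chat agent's system prompt.
# build_system_prompt and the template text are assumptions, not the
# actual implementation from the repo.

def build_system_prompt(name: str, resume_summary: str) -> str:
    """Combine persona instructions with the summarized resume."""
    return (
        f"You are acting as {name}. Answer questions about {name}'s "
        f"professional background in a professional, friendly tone.\n\n"
        f"## Background\n{resume_summary}\n\n"
        "If you cannot answer a question from the context above, "
        "use the record_unknown_question tool instead of guessing."
    )

prompt = build_system_prompt("John Doe", "10 years of Python and cloud experience.")
```

The key design point is that the summarized resume is baked into the instructions once at startup, so every turn of the conversation carries that context without re-reading the PDF.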
2. Specialized Sub-Agents
Rather than trying to make a single agent handle everything, the system uses a specialized Content Summarizer Agent for a specific preprocessing task. Before the chat agent ever starts, this sub-agent processes your resume PDF to create a concise, clean summary.
This sub-agent has its own focused instructions: remove duplicate information, normalize whitespace, preserve all important details, and maintain readability. By delegating this specialized task to a purpose-built agent, the main chat agent receives well-formatted context without having to deal with quirks in raw PDF text extraction.
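As a rough illustration of that cleanup step, a stdlib-only function might normalize whitespace and drop consecutive duplicate lines like this (the function name and exact rules are my assumptions; the real project delegates this to the summarizer agent’s LLM):

```python
import re

# Illustrative sketch of the kind of cleanup applied to raw PDF text:
# collapse runs of whitespace and drop consecutive duplicate lines.
# The name and exact rules are assumptions, not the repo's actual code.

def clean_extracted_text(raw: str) -> str:
    """Normalize whitespace and remove consecutive duplicate lines."""
    lines: list[str] = []
    for line in raw.splitlines():
        line = re.sub(r"\s+", " ", line).strip()
        if line and (not lines or line != lines[-1]):
            lines.append(line)
    return "\n".join(lines)
```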
3. Function Tools as Capabilities
The real power of the Agents SDK comes from its function tools. These are Python functions decorated with @function_tool that extend the agent’s capabilities beyond conversation with the LLM. I built the following tools:
- RSS Feed Retrievers: Two tools fetch and parse blog and podcast RSS feeds. When someone asks, “What’s your latest blog post?” the agent can dynamically retrieve current information rather than relying on stale training data or responding with no data. The tools handle both standard RSS and Atom formats, extracting titles, links, descriptions, and podcast-specific metadata like episode duration.
- Unknown Question Recorder: This tool is invoked when the agent encounters a question it can’t answer in its current context. Instead of hallucinating or giving a generic “I don’t know” response, it sends you a push notification with the exact question.
- User Details Recorder: When conversations get interesting, and visitors want to connect, the agent can collect their contact information and immediately notify you via Pushover.
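Under some assumptions about the feed format, the core of an RSS retriever tool might look like the stdlib-only sketch below. In the actual project this logic lives in the `_get_rss_feed()` helper and is exposed to the agent via `@function_tool`; Atom support and podcast metadata are omitted here for brevity:

```python
import xml.etree.ElementTree as ET

# Minimal sketch of the RSS parsing a retriever tool might perform.
# Only standard RSS 2.0 <item> elements are handled; Atom feeds and
# podcast-specific metadata are left out of this illustration.

def parse_rss_items(xml_text: str, limit: int = 5) -> list[dict]:
    """Extract title/link/description from a standard RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
        if len(items) >= limit:
            break
    return items
```

The agent never sees this parsing code; it only sees the tool’s name, docstring, and return value, and decides on its own when a question warrants a fetch.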
The Code Structure
```
openaiagentsdk/
├── me_chat.py                        # Main application entry point
│   └── MeChat class
│       ├── Loads resume PDF
│       ├── Calls Content Summarizer Agent
│       ├── Builds system prompt with context
│       ├── Initializes Chat Agent with tools
│       └── Launches Gradio interface
├── my_agents/
│   ├── __init__.py
│   └── content_summarizer_agent.py   # Resume preprocessing agent
│       ├── summarize_content tool
│       └── Agent(instructions, tools, model)
├── tools/
│   ├── __init__.py
│   ├── rss_retriever_tool.py         # Content fetching tools
│   │   ├── @function_tool get_blog_rss_feed()
│   │   ├── @function_tool get_podcast_rss_feed()
│   │   └── _get_rss_feed() helper
│   └── push_notification_tool.py     # Notification tools
│       ├── @function_tool record_unknown_question()
│       ├── @function_tool record_user_details()
│       └── _send_push_notification() helper
├── utils/
│   └── pushover.py                   # Pushover API wrapper
├── input_guardrails.py               # Input validation guardrails
└── resume.pdf                        # Your resume content
```
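As an aside, the Pushover wrapper in utils/pushover.py essentially boils down to POSTing a form-encoded body to Pushover’s messages endpoint. The sketch below only builds that body; the function name is my own invention, while the endpoint URL and the token/user/message fields come from Pushover’s public API:

```python
from urllib.parse import urlencode

# Pushover's documented messages endpoint (from the public Pushover API).
PUSHOVER_URL = "https://api.pushover.net/1/messages.json"

# Illustrative helper; the real wrapper would POST this body to
# PUSHOVER_URL with urllib or requests. Name is an assumption.

def build_pushover_payload(token: str, user: str, message: str) -> bytes:
    """URL-encode the form body Pushover's messages endpoint expects."""
    return urlencode({"token": token, "user": user, "message": message}).encode()
```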
Each agent has a single job and performs it well. The main chat agent, in me_chat.py, orchestrates everything, but when it needs to fetch blog or podcast information, it delegates to the RSS retriever tool. When it needs to summarize content, it calls the summarizer Agent. We let the Chat agent decide when to call the RSS retriever or the push notification tool.
How It All Wires Together
The initialization sequence reveals how these components orchestrate:
- The application loads configuration from environment variables (API keys, RSS feed URLs, Pushover credentials)
- The resume PDF is loaded using PyPDF, then passed to the Content Summarizer Agent via the Runner.run() async interface. This returns a clean, condensed version of your professional background.
- The summarized resume is injected into a carefully crafted system prompt that defines the agent’s persona, tone, and behavior guidelines. This prompt instructs the agent to stay in character as me, be professional, and know when to use its tools.
- The leading Chat Agent is created with its instructions and tools. The OpenAI Agents SDK handles the complex orchestration of deciding when to call these tools versus when to respond conversationally.
- utils/input_guardrails.py defines three input guardrails that are then attached to the Chat Agent in me_chat.py. In our example, all user inputs are automatically validated through them:
- Content Moderation: Uses OpenAI Moderation API to block inappropriate content (hate speech, harassment, violence, etc.).
- Length Validation: Ensures messages don’t exceed 10,000 characters.
- Format Validation: Checks for valid UTF-8 encoding and non-empty inputs.
- Gradio Interface: A simple chat interface is launched using Gradio, which handles all the web UI complexity.
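The length and format checks above can be sketched as a plain validation function. In the real project these are wrapped as SDK input guardrails attached to the agent; the names below are illustrative assumptions, not the repo’s actual code:

```python
MAX_LENGTH = 10_000  # matches the 10,000-character limit described above

# Illustrative stand-in for the length/format guardrails; the real
# project wraps equivalent checks as OpenAI Agents SDK input guardrails.

def validate_user_input(message: str) -> tuple[bool, str]:
    """Return (ok, reason) for a candidate chat message."""
    if not message or not message.strip():
        return False, "empty input"
    if len(message) > MAX_LENGTH:
        return False, "message exceeds 10,000 characters"
    try:
        message.encode("utf-8")
    except UnicodeEncodeError:
        return False, "invalid UTF-8"
    return True, "ok"
```

The content-moderation guardrail is different in kind: it calls the OpenAI Moderation API rather than running a local check, so it is not shown here.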
Running the Application
- Install dependencies using uv (a fast Python package manager) or standard pip
- Create an account at https://pushover.net/
- Configure environment variables for OpenAI API access and Pushover credentials
- Add your resume PDF to the project root directory as resume.pdf (or use the sample I mocked up from ChatGPT)
- Customize the configuration constants (your name, model choice, server port). Defaults should work with the name set to John Doe.
- Run `uv run python me_chat.py`
- To test the moderation guardrail, submit an input you consider inappropriate, and the guardrail will block it.
When the application starts, it loads your resume, summarizes it, and initializes the agent with guardrails and tools. It then launches a Gradio web interface at localhost:7860. Initialization takes less than a minute, with most of the time spent reading the PDF and generating the AI summary.
The Git code repository for this blog is at https://github.com/thomasma/openaiagentsdk