Learning to Build Agents

Last week I took Google’s Five-Day AI Agents Intensive Course, a program created to introduce students to the foundations of AI agents and the choices involved in designing and deploying them. I joined because I wanted to understand not only the technical aspects, but also how a product manager should think when planning agent-based solutions.

Throughout the week, I realised that agents behave very differently from traditional software, and that this shift requires a new way of thinking about goals, evaluation, safety, behaviour, and product strategy.

In this post I share the concepts I found useful and how I plan to apply them when designing agents.

Introduction to Agents

An agent is not just a system that responds to a prompt. It is a loop that thinks, takes actions through tools, observes the results and then continues until it reaches a goal. It does not follow a predetermined script. It decides its next step based on the information available at that moment.
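The loop described above can be sketched in a few lines. This is a minimal illustration rather than how any particular framework implements it: `scripted_model`, `search_docs` and the step limit are hypothetical stand-ins, and a real agent would call a language model where the scripted function sits.

```python
from typing import Callable

# Hypothetical tool: a plain function the agent is allowed to call.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_docs": lambda query: f"3 pages mention '{query}'",
}

def scripted_model(goal: str, observations: list[str]) -> dict:
    """Stand-in for a real model call: chooses the next step from what it has seen so far."""
    if not observations:
        return {"action": "search_docs", "input": goal}
    return {"action": "finish", "input": f"Answer based on: {observations[-1]}"}

def run_agent(goal: str, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):                      # hard cap so the loop always ends
        step = scripted_model(goal, observations)   # think
        if step["action"] == "finish":
            return step["input"]
        tool = TOOLS[step["action"]]                # act
        observations.append(tool(step["input"]))    # observe
    return "Stopped: step limit reached"

print(run_agent("reset password"))
```

The step limit is a design choice worth noticing: because the agent decides its own next step, the loop needs an explicit stopping rule in addition to the goal.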

For a product manager, this matters because designing an agent is not only about defining a user flow. It also requires thinking about the environment in which the agent will operate: what information it receives, what tools it can access, and how we guide its behaviour so the system works as expected.

Tools: Giving Agents the Ability to Act

Two ideas are essential for anyone building agents.

The first one is tool use. The model provides the reasoning, but tools allow the agent to act. A tool can be an API call, a search operation, a database query, a code function or even a request for human confirmation. Without tools, an agent cannot do anything beyond generating text. Once you give it tools, it can complete tasks and move toward real outcomes.
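To make the idea of a tool concrete, here is a small sketch of a tool registry. Everything in it is an assumption made for illustration: the `Tool` class, the two order-related tools and the `confirm` callback are hypothetical names, not part of the course material or of any specific library.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str                  # what the model reads when choosing a tool
    run: Callable[[dict], str]
    needs_confirmation: bool = False  # ask a human before running

def lookup_order(args: dict) -> str:
    return f"Order {args['order_id']} is in transit"

def refund_order(args: dict) -> str:
    return f"Refunded order {args['order_id']}"

REGISTRY = {t.name: t for t in [
    Tool("lookup_order", "Read the status of an order", lookup_order),
    Tool("refund_order", "Refund an order", refund_order, needs_confirmation=True),
]}

def call_tool(name: str, args: dict, confirm: Callable[[str], bool]) -> str:
    tool = REGISTRY.get(name)
    if tool is None:
        return f"Unknown tool: {name}"
    if tool.needs_confirmation and not confirm(f"Allow {name} with {args}?"):
        return "Cancelled by user"
    return tool.run(args)

# The confirm callback stands in for a real approval step in the user interface.
print(call_tool("lookup_order", {"order_id": "A1"}, confirm=lambda question: False))
print(call_tool("refund_order", {"order_id": "A1"}, confirm=lambda question: False))
```

Treating human confirmation as a property of the tool, rather than something the model decides, keeps that check outside the model's control.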

The second idea is MCP, the Model Context Protocol, an open standard for connecting models to external tools and data sources. Before the course, I didn’t know much about it. Now I see how it simplifies the connection between systems and gives agents a predictable way to access capabilities. Because every team integrates tools through the same protocol, MCP helps avoid isolated, one-off solutions and encourages a more coherent structure when designing agents that need to communicate with different tools and services, which is very valuable for long-term projects.
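As a rough illustration of what "a predictable way to access capabilities" means, the sketch below builds the two kinds of request a client typically sends. It assumes, based on the public MCP specification, that messages are JSON-RPC 2.0 and that tools are discovered with `tools/list` and invoked with `tools/call`. The tool name `search_docs` and its arguments are hypothetical, and a real client would use an MCP SDK and a transport rather than building dictionaries by hand.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched to them

def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message

# Ask the server which tools it offers.
list_request = jsonrpc_request("tools/list")

# Call one tool by name with structured arguments (hypothetical tool and arguments).
call_request = jsonrpc_request(
    "tools/call",
    {"name": "search_docs", "arguments": {"query": "reset password"}},
)

print(json.dumps(call_request, indent=2))
```

The point for a product manager is that every tool, whoever built it, is listed and called in the same shape.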

Sessions and Memory

A session contains the current interaction and any temporary information the agent needs at that moment. Memory, by contrast, stores information across interactions.

As product managers, we have to decide which information the agent should remember, which information it should ignore and why. If the agent stores too much, the system can become slower or confused. If it stores too little, it can lose the thread of the conversation and force the user to repeat themselves, which negatively affects the user experience.

Memory also raises questions about privacy and trust. Anything the agent remembers must have a clear purpose and be aligned with user expectations and safety requirements. I realised that memory cannot be an afterthought. It must be included in the design from the start.
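The difference between a session and memory, and the idea that every remembered item needs a purpose, can be sketched as follows. The `MEMORY_POLICY` table, the class names and the example keys are all hypothetical and chosen only to illustrate the decision.

```python
from dataclasses import dataclass, field

# Hypothetical policy: only these keys may be stored long term, each with a stated purpose.
MEMORY_POLICY = {
    "preferred_language": "Answer in the user's language",
    "product_version": "Point to the right documentation",
}

@dataclass
class Session:
    """Short-lived state for one conversation."""
    messages: list[str] = field(default_factory=list)

@dataclass
class Memory:
    """Long-lived facts that survive across sessions."""
    facts: dict[str, str] = field(default_factory=dict)

    def remember(self, key: str, value: str) -> bool:
        if key not in MEMORY_POLICY:   # no stated purpose, so do not store it
            return False
        self.facts[key] = value
        return True

memory = Memory()
first = Session()
first.messages.append("I'm on version 2.3 and my card number is 1234")
memory.remember("product_version", "2.3")   # stored: it has a purpose in the policy
memory.remember("card_number", "1234")      # rejected: not in the policy

second = Session()                           # a new conversation starts empty
print(second.messages, memory.facts)
```

Writing the policy as data makes the product decision reviewable: anyone can read the table and ask whether each entry is justified.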

Quality, Evaluation and Traces

Traditional software is tested against fixed expectations, but agents behave differently from one situation to another. This means we need new ways of assessing quality. Instead of checking a single expected output, we need scenarios that reflect real user intentions and judge the quality of the agent’s behaviour across them.

One concept I found extremely useful is the idea of using another model as a judge. This allows teams to evaluate large numbers of scenarios quickly and understand whether a new version improves or worsens the agent.
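A small sketch of how such a comparison might be organised is shown below. The judge here is a deliberately simple stand-in (a keyword check), since a real judge would be a model prompted with a rubric. The scenarios, `stub_judge` and both agent versions are hypothetical.

```python
from statistics import mean
from typing import Callable

Agent = Callable[[str], str]

# Hypothetical scenarios reflecting user intentions, each with a point the answer must cover.
SCENARIOS = [
    {"user": "How do I reset my password?", "must_mention": "reset link"},
    {"user": "Can I export my data?", "must_mention": "csv"},
]

def stub_judge(scenario: dict, answer: str) -> float:
    """Stand-in for a judge model: 1.0 if the key point is present, else 0.0."""
    return 1.0 if scenario["must_mention"] in answer.lower() else 0.0

def evaluate(agent: Agent, judge: Callable[[dict, str], float] = stub_judge) -> float:
    """Average judge score across all scenarios."""
    return mean(judge(s, agent(s["user"])) for s in SCENARIOS)

def agent_v1(question: str) -> str:
    return "Please contact support."

def agent_v2(question: str) -> str:
    if "password" in question.lower():
        return "Use the reset link on the sign-in page."
    return "Please contact support."

print(evaluate(agent_v1), evaluate(agent_v2))
```

Because both versions run against the same scenarios and the same judge, the two scores are directly comparable, which is what makes the approach useful for deciding whether a change is an improvement.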

Along with this, I discovered traces. A trace shows the reasoning steps the agent followed: what it attempted, which tool it used, what response it received and why it decided to move in a certain direction. This is incredibly valuable because it gives visibility into how the agent thinks. Instead of guessing why something went wrong, traces help you see the exact moment where the chain of reasoning started to drift.
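As an illustration, a trace can be thought of as an ordered list of steps. The structure below is a hypothetical sketch, not the format of any specific tracing tool, and the `ok` flag is assumed to come from a reviewer or an automated check.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TraceStep:
    thought: str                 # why the agent chose this step
    tool: Optional[str]          # tool used, if any
    tool_input: Optional[str]
    observation: str             # what came back
    ok: bool                     # assumed to be set by a reviewer or an automated check

trace = [
    TraceStep("Find the relevant page", "search_docs", "reset password", "2 results", True),
    TraceStep("Open the first result", "fetch_page", "/billing/invoices", "Invoice help page", False),
    TraceStep("Answer from the page", None, None, "Answered about invoices", False),
]

def first_drift(steps: list[TraceStep]) -> Optional[int]:
    """Return the index of the first step marked as not ok, or None if all are fine."""
    for index, step in enumerate(steps):
        if not step.ok:
            return index
    return None

index = first_drift(trace)
if index is not None:
    step = trace[index]
    print(f"Drift started at step {index}: {step.tool}({step.tool_input!r})")
```

In this made-up example the final answer is wrong, but the trace shows that the problem began one step earlier, when the agent opened an unrelated page.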

I now see evaluation, monitoring and observation as core parts of the product rather than optional tasks.

From Prototype to Production

Another important idea from the course is the difference between building a prototype and preparing an agent for real users. A quick prototype is easy. A production-ready agent requires much more attention.

Before deployment, the agent needs automated evaluation to ensure that recent changes have not introduced new problems. It also requires a proper staging environment, careful release processes and systems to observe performance and behaviour in real time.

Security is essential as well. Agents can make decisions and call external systems, so we must ensure they never take actions outside their intended scope. This means defining clear rules for what the agent can do, which data it can access, when it must involve a human and how to protect the system from unexpected input.
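One way to express such rules is as an explicit policy that is checked before every action. The sketch below is only an assumption about how this could look. The tool names, the input-length limit and the three outcomes are hypothetical, and a production system would need far more than this.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    allowed_tools: frozenset      # what the agent can do
    needs_human: frozenset        # when it must involve a person
    max_input_chars: int = 500    # crude limit on unexpected input

POLICY = Policy(
    allowed_tools=frozenset({"search_docs", "create_ticket"}),
    needs_human=frozenset({"create_ticket"}),
)

def check_action(tool: str, tool_input: str, policy: Policy = POLICY) -> str:
    """Return 'deny', 'escalate' or 'allow' for a proposed action."""
    if tool not in policy.allowed_tools:
        return "deny"             # outside the intended scope
    if len(tool_input) > policy.max_input_chars:
        return "deny"             # suspiciously large input
    if tool in policy.needs_human:
        return "escalate"         # a person must approve first
    return "allow"

print(check_action("search_docs", "reset password"))
print(check_action("create_ticket", "User cannot log in"))
print(check_action("delete_account", "user 42"))
```

The check runs in ordinary code outside the model, so the rules hold even when the model's reasoning goes wrong.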

Releasing an agent is only the beginning of its lifecycle. It will continue learning, changing and requiring supervision.

How I’m Applying This: My First Agent Project

While taking the course, I started working on my first agent. I chose a documentation assistant because I worked as a technical writer in the past, and I know how much users rely on documentation when trying to understand a product. Even with good interfaces, people still need guidance. Unfortunately, documentation is often static and cannot adapt to the user’s context.

An agent can change this. My goal is to create an assistant that reads the documentation, understands the user’s context and provides personalised guidance. I plan to use RAG to retrieve accurate information from the documentation, sessions to maintain the conversation and memory to avoid repeating explanations and understand long-term intentions.
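The retrieval step of RAG can be sketched very simply. The example below ranks documentation chunks by word overlap with the question, which is a stand-in for the embedding-based search a real system would more likely use. The chunks and function names are hypothetical.

```python
import re

# Hypothetical documentation chunks.
CHUNKS = [
    "To reset your password, open Settings and choose Security.",
    "Invoices can be downloaded from the Billing page.",
    "Use the Export button to download your data as CSV.",
]

def tokens(text: str) -> set[str]:
    """Lowercase words and numbers in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question: str, k: int = 1) -> list[str]:
    """Rank chunks by how many words they share with the question."""
    question_words = tokens(question)
    ranked = sorted(CHUNKS, key=lambda chunk: len(question_words & tokens(chunk)), reverse=True)
    return ranked[:k]

def build_prompt(question: str) -> str:
    """Place the retrieved text in front of the question before it goes to the model."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this documentation:\n{context}\n\nQuestion: {question}"

print(build_prompt("How do I reset my password?"))
```

The model only sees what retrieval returns, so the quality of the answer depends heavily on how the documentation is split and ranked.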

This project is still at an early stage, but thanks to the course I now know what to consider from the beginning: clear goals, evaluation methods, memory decisions, safety rules, traces, monitoring and long-term governance.

Final Thoughts

This course has been an encouraging and eye-opening experience. Designing agents requires a new mindset that combines product strategy, safety, evaluation and continuous improvement. Besides defining what the user wants, we also need to understand what the agent needs to work well.

I’m still learning, and I expect to learn much more as I build my first agent. I hope this overview helps others who are exploring this field, and I will keep sharing my progress along the way.