
Maintaining bug zero with our Red Team agent

This is the first post in a multi-part series on what we’re learning from building agents into our own production workflows.
We hit bug zero in August 2025. We made hitting bug zero a priority because bugs erode trust, and as a financial operating system we're in the business of trust. Our internal process for maintaining bug zero is Red Team, a rotation where one engineer is responsible for triaging and fixing or delegating issues as they arise.
The bottleneck for Red Team is more often context than code. Many bugs are easy to fix once they're reproduced and understood. The hard part is turning a screenshot, customer report, Sentry alert, or vague Linear task into a clear, reproducible diagnosis.
We were already using agents manually for that kind of work: searching across tools, inspecting code, reading logs, and pulling together the context needed to understand an issue. So this was the natural first place to apply an agent: taking messy inputs and turning them into something Red Team could act on faster.
We approached this problem the way we approach any problem: by aggressively descoping and shipping something quickly so we could iterate in production.
Gerard, our Red Team agent
Gerard is a Centaur agent whose sole purpose (today) is to make Red Team more efficient. It finds Linear issues with specific labels, gathers context from our tools, and posts the results back in Linear for a person to review.
It has access to the same tools we do: CloudWatch, Sentry, Grafana, Amplitude, Notion, a sanitized production DB replica, and more. And it lives in the same places we do: Linear, Slack, and Discord. The system is intentionally simple, which lets us iterate quickly.
Right now Gerard is focused on two types of issues: bugs and investigations. Bugs get shaped directly into the issue body in Linear, since the issue itself becomes the source of truth. For investigations, Gerard adds comments to the issue, since the goal is to start a conversation and Linear's comments feature is designed for just that.
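To make that flow concrete, here is a minimal sketch of the pick-up, gather, and post-back loop described above. It is an illustration under stated assumptions, not Gerard's actual implementation: the label names, the `Issue` class, and the `gather_context` and `process` helpers are all hypothetical, and the context sources are plain callables standing in for tools like Sentry or CloudWatch rather than real API clients.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical label names; the post does not say which labels Gerard watches.
BUG_LABEL = "red-team-bug"
INVESTIGATION_LABEL = "red-team-investigation"


@dataclass
class Issue:
    """In-memory stand-in for a Linear issue (not the Linear API)."""
    title: str
    labels: set[str]
    body: str = ""
    comments: list[str] = field(default_factory=list)


# A context source takes an issue and returns a text snippet (logs, traces, ...).
ContextSource = Callable[[Issue], str]


def gather_context(issue: Issue, sources: dict[str, ContextSource]) -> str:
    """Query every source and join the results into one report."""
    sections = [f"[{name}] {fetch(issue)}" for name, fetch in sources.items()]
    return "\n".join(sections)


def process(issue: Issue, sources: dict[str, ContextSource]) -> bool:
    """Shape bugs into the body, comment on investigations, skip the rest."""
    if BUG_LABEL in issue.labels:
        # For bugs, the issue itself is the source of truth.
        issue.body = gather_context(issue, sources)
    elif INVESTIGATION_LABEL in issue.labels:
        # For investigations, the goal is a conversation, so add a comment.
        issue.comments.append(gather_context(issue, sources))
    else:
        return False  # not labeled for the agent, leave untouched
    return True
```

Keeping the sources as interchangeable callables mirrors the "intentionally simple" design: adding a tool means adding one entry to the dictionary, and a person still reviews whatever lands in the issue.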
An ideal Linear task
To iterate toward a goal, we first need to define it. Since the goal here is not to solve the issue end-to-end but to provide context for a person, we needed to understand what an ideal Linear task looks like for that person.
We observed where Gerard struggled early on and used those missteps to craft a doc that gives Gerard and people a shared target: a clear title, repro steps, root cause, owners, priorities, success criteria, and enough context to act.
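One way to picture that shared target is as a checklist that either a person or an agent can run against a draft. The sketch below is a hypothetical illustration: the field names come from the list above, but the `ShapedTask` class, its field types, and the `missing_fields` helper are assumptions, not the contents of our actual doc.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ShapedTask:
    """Fields named in the post; the types and defaults are assumptions."""
    title: str = ""
    repro_steps: list[str] = field(default_factory=list)
    root_cause: str = ""
    owners: list[str] = field(default_factory=list)
    priority: str = ""
    success_criteria: list[str] = field(default_factory=list)
    context: str = ""


def missing_fields(task: ShapedTask) -> list[str]:
    """Return the names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(task) if not getattr(task, f.name)]


# Illustrative draft with made-up content: only the title and repro steps are set.
draft = ShapedTask(
    title="Example: export fails",
    repro_steps=["Open the export dialog", "Choose CSV"],
)
print(missing_fields(draft))
```

A check like this does not judge quality, only completeness. Whether the root cause is right or the success criteria are useful is still a call for the reviewer.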
All intelligence requires feedback, and Gerard is no different. The eval feedback loop is simple: Gerard asks for feedback when it posts, and people provide it when they review the issue. So far, that feedback shows bug shaping is already useful, while early results on investigations show they need access to more tools.
Agent in the loop
Our goal right now is not to replace people with agents on Red Team. It feels likely we'll eventually get there, but sequencing is strategy, so we're starting by figuring out where new technology can remove friction.
Today that means context gathering and shaping, so people stay focused on what people do best: judgment, empathy, and connection with those we serve.
Over time, more of the loop will likely compress, with Gerard taking on more responsibility. But it's early, and Gerard is still earning that right.