What I Learned Building 15 AI Agents for a Storytelling Company

Fifteen agents in, here is what actually held up — and the three architectural mistakes I had to unlearn before any of it became reliable enough to run a business on.

Rovonn Russell | 5 min read | Category: Systems

After shipping 15 production AI agents at Impact Loop, the single most important lesson is this: the agent should be the dumbest part of the system, not the smartest. Push every decision you can into deterministic scripts, and let the LLM only do the parts that genuinely require judgment.

The math nobody talks about

If a single LLM call is right 90% of the time and your workflow chains five of them together, every call has to succeed for the run to succeed. Assuming the calls fail independently, your end-to-end success rate is 0.9⁵ ≈ 59%. The fix isn't a smarter model. The fix is fewer LLM decisions per workflow.
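A minimal sketch of that arithmetic, assuming each LLM call succeeds or fails independently of the others. The function name chain_success_rate is made up for illustration.

```python
def chain_success_rate(per_step_accuracy: float, steps: int) -> float:
    """End-to-end success rate when every step in the chain must succeed.

    Assumes the steps fail independently, so the probabilities multiply.
    """
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # Compare a workflow with five LLM decisions to one with a single decision.
    for steps in (1, 2, 5):
        print(f"{steps} LLM step(s): {chain_success_rate(0.9, steps):.0%}")
```

Cutting the chain from five LLM decisions to one moves the same 90% model from roughly 59% back to 90% end to end, which is the whole argument for fewer decisions rather than a better model.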

The three mistakes I had to unlearn

Mistake 1: One big mega-prompt that did everything. My first agent tried to scrape leads, classify them, enrich them, draft an email, and send it — all inside one monster system prompt. I rebuilt it as five tiny scripts and the success rate jumped past 95%.
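One way the split into small steps might look, as a sketch rather than the actual Impact Loop code. Every name here (Lead, scrape_leads, classify, enrich, draft_email, send) and the sample data are hypothetical, and the rules inside each step are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Lead:
    email: str
    company: str
    segment: str = ""
    notes: dict = field(default_factory=dict)
    draft: str = ""


def scrape_leads() -> list:
    # Placeholder for a real scraper; returns fixed sample data.
    return [Lead("ana@example.com", "Example Co"), Lead("bo@gmail.com", "")]


def classify(lead: Lead) -> Lead:
    # Deterministic rule instead of an LLM judgment call.
    free_mail = ("gmail.com", "yahoo.com", "outlook.com")
    domain = lead.email.split("@")[-1]
    lead.segment = "personal" if domain in free_mail else "business"
    return lead


def enrich(lead: Lead) -> Lead:
    lead.notes["domain"] = lead.email.split("@")[-1]
    return lead


def draft_email(lead: Lead, write=None) -> Lead:
    # The one step where an LLM call might belong. `write` is a hypothetical
    # callable; the default is a template so the sketch runs offline.
    write = write or (lambda l: f"Hi from us to {l.company or l.notes['domain']}")
    lead.draft = write(lead)
    return lead


def send(lead: Lead) -> bool:
    # Stand-in for a real mail API: only "sends" business leads with a draft.
    return lead.segment == "business" and bool(lead.draft)


def run_pipeline() -> list:
    results = []
    for lead in scrape_leads():
        for step in (classify, enrich, draft_email):
            lead = step(lead)
        results.append((lead.email, send(lead)))
    return results
```

Each step can be tested and fixed on its own, and only draft_email would ever need a model behind it.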

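Mistake 2: Letting the agent format its own output. Asking the model to 'please respond in this format' leaves the formatting to chance, and any deviation breaks whatever parses the reply downstream. Every modern agent should be calling tools with structured JSON instead.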
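A rough sketch of what checking a structured tool call could look like, using only the standard library. The tool name send_email, the "tool"/"arguments" envelope, and the argument list are assumptions for illustration, not any particular vendor's tool-calling format.

```python
import json

# Hypothetical tool definition: argument names and the types they must have.
TOOL_NAME = "send_email"
REQUIRED_ARGS = {"to": str, "subject": str, "body": str}


def parse_tool_call(raw: str) -> dict:
    """Parse and validate one tool call, or raise ValueError."""
    call = json.loads(raw)  # malformed JSON raises JSONDecodeError, a ValueError
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    if call.get("tool") != TOOL_NAME:
        raise ValueError(f"unknown tool: {call.get('tool')!r}")
    args = call.get("arguments")
    if not isinstance(args, dict):
        raise ValueError("'arguments' must be a JSON object")
    for key, expected_type in REQUIRED_ARGS.items():
        if not isinstance(args.get(key), expected_type):
            raise ValueError(f"argument {key!r} missing or not {expected_type.__name__}")
    return args
```

The point of the check is that a bad reply fails loudly at the boundary instead of sending a half-formed email further down the workflow.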

Mistake 3: No self-annealing loop. The breakthrough was treating every failure as a chance to update the instructions file the agent reads, so the same failure doesn't come back on the next run.
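A minimal sketch of such a loop, assuming the instructions live in a plain text file and each lesson is stored as one bullet line. The helper names record_lesson and run_with_annealing, and the describe_failure callback, are hypothetical.

```python
from pathlib import Path


def record_lesson(instructions_path: Path, lesson: str) -> bool:
    """Append a lesson to the instructions file unless it is already there.

    Returns True if the file was changed.
    """
    existing = ""
    if instructions_path.exists():
        existing = instructions_path.read_text(encoding="utf-8")
    line = f"- {lesson.strip()}"
    if line in existing.splitlines():
        return False  # lesson already recorded, keep the file small
    with instructions_path.open("a", encoding="utf-8") as f:
        if existing and not existing.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
    return True


def run_with_annealing(step, instructions_path: Path, describe_failure):
    """Run a step; on failure, write the lesson down, then re-raise."""
    try:
        return step()
    except Exception as exc:
        record_lesson(instructions_path, describe_failure(exc))
        raise
```

In this sketch describe_failure turns an exception into a one-line instruction. Whether a person or a model writes that line is a design choice the sketch leaves open.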

What an actually-reliable agent looks like

A reliable agent has two parts. A tiny instructions file tells the LLM what to do. A scripts/ folder full of deterministic Python handles the how. The LLM picks which script to run, and the scripts do the work. That separation is the entire game.
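The separation can be sketched as a small dispatcher. To keep the example runnable, plain functions stand in for files in a scripts/ folder, and choose_script is a keyword stub standing in for the LLM's choice. All names are hypothetical.

```python
from typing import Callable, Dict


def dedupe_leads(payload: dict) -> dict:
    # Deterministic work: same input, same output, every time.
    return {"emails": sorted(set(payload["emails"]))}


def count_leads(payload: dict) -> dict:
    return {"count": len(payload["emails"])}


# Allow-list of scripts the agent may run; stands in for a scripts/ folder.
SCRIPTS: Dict[str, Callable[[dict], dict]] = {
    "dedupe_leads": dedupe_leads,
    "count_leads": count_leads,
}


def choose_script(task: str) -> str:
    # Stub for the single LLM decision: map a task description to a script name.
    return "dedupe_leads" if "duplicate" in task.lower() else "count_leads"


def dispatch(choice: str, payload: dict) -> dict:
    """Run the chosen script, rejecting anything outside the allow-list."""
    if choice not in SCRIPTS:
        raise ValueError(f"script not allowed: {choice!r}")
    return SCRIPTS[choice](payload)


if __name__ == "__main__":
    task = "Remove duplicate leads"
    print(dispatch(choose_script(task), {"emails": ["a@x.com", "a@x.com", "b@x.com"]}))
```

The model's only output is a script name, so the worst it can do is pick the wrong script or an unknown one, and the unknown case is rejected before anything runs.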

Frequently Asked Questions

Should you build AI agents with one big prompt or many small steps?

Many small deterministic steps. A single agent making 5 decisions in a row at 90% accuracy each succeeds only about 59% of the time.

What is the biggest mistake people make when building their first AI agent?

Trying to make the agent do everything in one long context. The agents that actually work are the ones where the LLM is doing only the parts a script can't do.
