The three components, no jargon
An AI agent is three things working together: a language model that can think, a set of tools it can call, and some memory of what's happened so far. That's it. Strip away the framework marketing and you're left with a loop — the model reads context, picks a tool, runs it, reads the result, decides what to do next.
Everything else — LangGraph, AutoGen, CrewAI, OpenAI's Agent SDK, Anthropic's tool use API — is plumbing. The same plumbing solves the same problem: how to wire those three things together reliably.
For founders, this matters because the agent isn't the moat. The model is borrowed (OpenAI's, Anthropic's, Google's). The framework is open source. What's yours is: the tools you give the agent access to, the data it remembers, and the guardrails that keep it from doing dumb things.
In other words, what you actually pay a studio to build isn't "an agent." It's an integration layer between your data, your tools, and someone else's model.
When you need an agent (and when you don't)
You need an agent when:
- The task requires more than one step, and the steps depend on each other's results
- Each step can fail in interesting ways and needs reasoning to recover
- The output isn't deterministic — it's "good enough" or "best fit," not "exactly this"
You don't need an agent when:
- The workflow is the same every time → that's a Zapier or n8n automation
- The task has one input and one output → that's a single LLM call
- The reasoning is trivial → use rules
A common mistake: building "an agent" for what's really a scheduled script. If you can write the steps in advance as a flowchart, you're describing an automation, not an agent.
The five patterns we keep building
- Lead qualifier. Incoming form → enrich company data → score → route to Slack or CRM. The agent decides what to ask if data is missing.
- Internal search assistant. Question in → searches your knowledge base, Notion, Slack history, Drive → synthesizes an answer with citations.
- Customer support triage. Email or chat in → classify intent → draft response or escalate. Memory of past customer interactions improves accuracy week over week.
- Outbound research. Given a list of companies, the agent visits each site, extracts target signals, builds a short brief per company.
- Document processor. PDF or contract in → extract structured data, flag clauses, generate summary. Often the simplest commercial agent — and the highest ROI.
What it actually costs to run
Per-interaction cost (model + tools) for the patterns above:
| Pattern | Avg cost / run | Typical volume / month |
|---|---|---|
| Lead qualifier | $0.04–$0.12 | 1k–10k |
| Internal search | $0.08–$0.20 | hundreds–thousands |
| Support triage | $0.06–$0.15 | 10k+ |
| Outbound research | $0.30–$1.20 | 500–5k |
| Document processor | $0.15–$0.80 | task-dependent |
Token prices as of Q2 2026. We default to Anthropic Claude Sonnet 4.6 — cheaper alternatives for high-volume, more capable models for high-stakes.
The cost that surprises clients isn't the model — it's the integration work. Wiring an agent into HubSpot, Slack, and your data warehouse takes engineering hours. Budget 80% engineering, 20% prompt and model work for the first build.
How to start without committing to a platform
Don't pick a framework first. Pick a use case where you can measure the result.
- Find a workflow your team does every week, manually, that takes 30+ minutes
- Time it for two weeks. Record what makes each instance different.
- Build the dumbest version that works — one LLM call, one tool — see if it gets you 60% of the way there
- Add a second tool. Add memory. Now you have an agent.
This bottom-up approach saves you from buying a framework before you know what you need. We've seen teams spend $20k on Crew, Lang, or whatever before realizing their actual need was a 50-line Python script.
The first agent takes 4–6 weeks end to end. The second one takes 30% of the time. The third, 15%. That's where compounding starts.
Once you've shipped one agent end to end, the patterns become reusable: the data plumbing, the eval harness, the deployment template. New use cases turn into days, not months.