
AI · Strategy

What an AI agent really is and when it's worth it

The word "agent" has become marketing smoke. If you want to know whether you need one, first you need to understand what it is and, more importantly, what it isn't.

Published on May 05, 2026 · 11 min read · By Adán Mejías

Over the last twelve months, "AI agent" has become the phrase vendors slap at the end of any proposal to bump the price by 30%. There are agents in marketing decks, agents in RFPs, agents on LinkedIn, agents wherever there used to be a chatbot or a script. And, frankly, most of them are not agents in any strict sense.

As with any fad, separating the signal from the noise starts with a useful definition and some criteria for deciding when it's worth it. That's what I'm aiming for here.

A practical definition of an agent

An AI agent is a system that, given a goal, decides on its own what steps to take, in what order, and with what tools, and adjusts the plan based on what it observes. Four key elements: a goal, autonomous decision-making, use of external tools, and an observation-action loop.

To make it concrete: a chatbot answers questions. A copilot helps while you stay at the wheel. An agent receives "find the 20 most promising leads in my pipeline and prepare a personalized email for each", opens the CRM, reads the history, searches LinkedIn, writes the emails and leaves them in your drafts folder. You didn't tell it the steps. It figured them out.
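To make that loop tangible, here's a minimal sketch of the observation-action cycle in Python. Everything in it is a stand-in: `llm_decide` fakes the model call with canned branching so the file actually runs, and the tools are one-line stubs. A real agent would let the LLM choose the next action from the goal plus everything it has observed so far.

```python
# Minimal sketch of an agent's observation-action loop. All names here
# (llm_decide, TOOLS, run_agent) are illustrative stubs, not a real API.

TOOLS = {
    "read_crm": lambda args: f"CRM data for {args}",              # stub integration
    "search_linkedin": lambda args: f"profiles matching {args}",  # stub
    "write_draft": lambda args: f"drafts saved for {args}",       # stub
}

def llm_decide(goal: str, history: list) -> tuple[str, str]:
    """Stand-in for the model call. A real agent asks the LLM to pick the
    next action given the goal and what it has observed so far; this stub
    just branches on what's still missing so the sketch runs end to end."""
    done = {action for action, _, _ in history}
    for step, args in [("read_crm", "the pipeline"),
                       ("search_linkedin", "the top 20 leads"),
                       ("write_draft", "each lead")]:
        if step not in done:
            return step, args
    return "done", "20 personalized drafts waiting in your drafts folder"

def run_agent(goal: str, max_steps: int = 20) -> str:
    history: list = []
    for _ in range(max_steps):                 # hard cap: never loop forever
        action, args = llm_decide(goal, history)
        if action == "done":
            return args
        observation = TOOLS[action](args)      # act...
        history.append((action, args, observation))  # ...observe, repeat
    raise RuntimeError("step limit reached without finishing")

print(run_agent("prepare emails for my 20 most promising leads"))
```

The point of the sketch is its shape: the plan isn't written down anywhere. It emerges from the loop, one decision at a time.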

What separates a real agent from a script in costume

A script with AI sticks a language model into a fixed step. An agent decides the sequence. If your "agent" always runs the same steps in the same order, it's not an agent: it's a workflow with an LLM inside. That's not bad per se, but paying agent prices for it is bad business.
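For contrast, here's what the script in costume looks like, again as a hedged sketch: the model fills exactly one slot, and the sequence of steps is frozen at design time. `summarize_with_llm` is a hypothetical stand-in for a real model call.

```python
# A workflow with an LLM inside: the step order never varies; one step
# just happens to call a model. summarize_with_llm is a hypothetical stub.

def summarize_with_llm(ticket: str) -> str:
    return f"summary: {ticket[:40]}"        # a real model call would go here

def nightly_report(tickets: list[str]) -> str:
    summaries = [summarize_with_llm(t) for t in tickets]   # the LLM's one slot
    return "Daily ticket report\n" + "\n".join(summaries)  # same steps, always

print(nightly_report(["printer on fire", "password reset loop"]))
```

Useful, cheap, predictable. Just don't pay agent prices for it.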

The mental test: if you remove the word "agent" and describe what it does, does the price still make sense? If not, you're being sold a label.

The spectrum: chatbot, copilot, agent

It's a continuum, not a binary. It's worth knowing where you sit on it:

  • Chatbot: answers within a closed domain using rules or AI.
  • Copilot: works alongside you on a task — you drive, it suggests.
  • Agent: executes tasks with some autonomy, decides steps, uses tools.
  • Multi-agent: several agents coordinating with each other. This is where the risk explodes.

Choosing between them isn't about power: it's about control. The further right you move on the spectrum, the more capacity the system has to act, and the more capacity it has to mess up before you catch it.

When an agent makes sense

An agent is worth it when at least three of these five conditions hold:

1. The task has variable steps

If the steps are always the same, a traditional workflow with AI in some steps is cheaper and more reliable. Agents shine when the decision about what to do next depends on context.

2. The cost of being wrong is low or reversible

An agent that organizes your inbox can make mistakes you fix in 30 seconds. An agent that approves vendor payments can make mistakes that cost real money or reputation. The capability of the tool matters less than the cost of failure.

3. There's a human in the loop at critical points

Serious agents include checkpoints where a human validates before continuing. Not "human at the end just in case". Human at the points where the decision changes course.
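A minimal sketch of what such a checkpoint can look like in code. The action names and the `PendingApproval` mechanism are assumptions for illustration; the pattern that matters is that the run parks itself at course-changing actions instead of barrelling through.

```python
# Human-in-the-loop gate: critical actions park the run until a human
# signs off. The action names and this mechanism are illustrative only.

class PendingApproval(Exception):
    """Raised when the agent hits a decision a human must review first."""
    def __init__(self, action: str, args: dict):
        super().__init__(f"needs approval: {action}({args})")
        self.action, self.args = action, args

CRITICAL = {"send_email", "approve_payment", "close_account"}

def execute(action: str, args: dict, approved: bool = False) -> str:
    if action in CRITICAL and not approved:
        raise PendingApproval(action, args)    # stop here, notify a reviewer
    return f"executed {action} with {args}"    # stub for the real side effect

print(execute("draft_email", {"to": "lead@example.com"}))  # runs freely
# execute("send_email", ...) would raise until someone passes approved=True
```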

4. The volume justifies the complexity

An agent costs more to design, more to maintain and more to monitor than a script. If you're going to use it five times a month, it doesn't pay off. If it'll process hundreds of cases a day, it does.

5. The vendor or your team understands its behavior under load

An agent that works perfectly in the demo and starts drifting on edge cases is the failure mode I've seen most often. If no one in the room can explain what it does when it gets stuck, don't put it in production.

When an agent is NOT the right choice

Knowing when to walk away matters just as much.

When you want total predictability

Regulated processes, financial flows under audit, legal decisions: variance is the killer here. An agent can take different paths for similar cases. That's wonderful for creativity and poison for compliance. In my time in banking at ING, this was the red line: regulated work isn't delegated to systems that decide on their own.

When the knowledge base is bad

If your wiki, CRM or data warehouse is dirty, an agent amplifies the chaos at higher speed. Before adding an agent, fix the source.

When the team isn't ready to supervise

An unsupervised agent is an intern running loose in production. If you don't have the capacity for continuous review, you're better off with a copilot that only acts when you decide.

Cases where I see clear return

After several rollouts, these are the niches where I've seen agents pay for themselves without argument.

Pre-call research for sales

An agent that, before each call, gathers public context on the account, reviews the CRM history and prepares a one-page brief. It saves 20 to 40 minutes per call. With a team of 15 salespeople, that adds up fast by month-end.

Ticket triage and intelligent routing

In support, an agent that classifies incoming tickets, finds similar past tickets and proposes a first response. It decides whether to escalate, route or auto-close with a template. The human still reviews, but their time goes to the non-trivial cases.
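As a sketch, the triage decision can be as small as this. `classify`, `find_similar` and `draft_reply` are hypothetical stubs for a model call and a search over past tickets, and the confidence thresholds are made up; you'd tune them against your own data.

```python
# Ticket triage sketch. classify / find_similar / draft_reply are
# hypothetical stubs; the thresholds are illustrative, not recommendations.

from dataclasses import dataclass

def classify(ticket: str) -> tuple[str, float]:            # stub model call
    return ("password_reset", 0.93) if "password" in ticket else ("billing", 0.65)

def find_similar(ticket: str, k: int = 3) -> list[str]:    # stub ticket search
    return [f"past ticket resembling: {ticket[:30]}"] * k

def draft_reply(ticket: str, precedents: list[str]) -> str:  # stub model call
    return f"Hi! Based on {len(precedents)} similar cases, try this: ..."

@dataclass
class Triage:
    route: str    # "auto_close" | "queue:<team>" | "escalate"
    draft: str    # proposed first response, always shown to a human

def triage(ticket: str) -> Triage:
    label, confidence = classify(ticket)
    draft = draft_reply(ticket, find_similar(ticket))
    if label == "password_reset" and confidence > 0.9:
        return Triage("auto_close", draft)       # templated; human can still veto
    if confidence > 0.7:
        return Triage(f"queue:{label}", draft)   # routed with a suggested reply
    return Triage("escalate", draft)             # non-trivial: human decides

print(triage("I'm stuck in a password reset loop"))
```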

Low-risk reconciliations

Cross-checking invoices with orders, identifying discrepancies, proposing adjustments. The agent prepares, the human signs off. The leverage here is enormous when volume is high.

Continuous competitive research

An agent that follows 20 competitors, reads their blogs and press releases, detects pricing changes or launches, and delivers a weekly summary. It doesn't replace product or marketing, but it saves them hours of scrolling.

The real cost of an agent in production

The license price is the tip of the iceberg. What an agent really costs:

  • Designing and modeling the behavior (weeks, not days).
  • Connecting to internal systems with proper permissions and auditing.
  • Specific monitoring: latency, cost per execution, failure rate (sketched after this list).
  • A dedicated person at least half-time for the first three months.
  • A degradation plan for when the model API changes or fails.
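Of those line items, the monitoring one is the cheapest to start on. Here's a sketch of per-execution telemetry, assuming your agent returns a token count alongside its result; the field names and the price per thousand tokens are made up. Failure rate then falls out of the `ok` field downstream.

```python
# Per-execution telemetry: the three numbers worth watching from day one.
# Field names and the token price are assumptions, not a standard.

import json
import time

def run_with_telemetry(run_agent, goal: str, usd_per_1k_tokens: float = 0.01):
    record = {"goal": goal, "ok": False, "tokens": 0}
    start = time.monotonic()
    try:
        result, tokens = run_agent(goal)       # assumes (result, token_count)
        record.update(ok=True, tokens=tokens)
        return result
    finally:                                   # logged even when the run fails
        record["latency_s"] = round(time.monotonic() - start, 2)
        record["cost_usd"] = round(record["tokens"] / 1000 * usd_per_1k_tokens, 4)
        print(json.dumps(record))              # ship to your metrics store instead

# Usage with a fake agent; failure rate = share of records with ok == False.
run_with_telemetry(lambda g: (f"done: {g}", 1200), "triage today's inbox")
```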

When you add it all up, a serious agent in a mid-sized company costs five to ten times what a well-implemented copilot would cost the same team. If the use case justifies that investment, go ahead. If not, drop down to a copilot and sleep well.

The mistake I see most often

The mistake I see most often is jumping straight to multi-agent because it sounds ambitious. Teams that don't have a single stable agent set up three agents talking to each other, then spend six months debugging infinite loops, cross-hallucinations and costs that spike for no clear reason.

The rule I apply: nobody should set up a multi-agent system before having had at least one individual agent in production for six months with stable metrics. The combinatorial complexity of several agents interacting is brutal and very few people understand what's going on inside when something breaks.

An agent, well chosen and well governed, can be the piece that changes the economics of an entire process. But an agent adopted because it's fashionable is the most expensive way to not solve a problem a copilot would have solved in a month. The question isn't "do I want an agent?". The question is "what decision do I need to automate, and what happens if it gets it wrong?". Starting there filters out 80% of the useless conversations.

Found this useful?

Book a free 15-min assessment. I'll send you a personalized guide afterwards.

Book my assessment