Pain Signals for Agentic Systems, sharp pain, dull ache, and the operational limp

Written by Tony Wood | Jan 10, 2026 9:53:33 AM

I am still in this series about how human emotions can relate to agentics. I have been playing with dreaming, surprise, shame, curiosity and distrust, and I keep coming back to the same thing. If agentic systems are going to make work easier day to day, then we need metaphors and interfaces that people can feel in their bones, not more dashboards and error codes.

So in this one I want to introduce the idea of pain.

I am not making a medical claim here. I am borrowing a biological pattern and turning it into an engineering pattern. Pain is one of the clearest examples we have of a signalling system that drives action.

Source: National Library of Medicine, MedlinePlus
"Pain is a signal in your nervous system that something may be wrong. It is an unpleasant feeling, such as a prick, tingle, sting, burn, or ache."

That is the whole point of this post. Pain is a signal. It routes attention. It prioritises. It changes behaviour.

The metaphor, sharp pain vs dull ache

In the body, a sharp pain basically says: stop what you are doing now, because it is hurting. A dull ache is different. It is persistent. It changes how you move. You protect the area, sometimes without thinking about it.

In agentic systems, I think we can map that cleanly:

Sharp pain maps to an "incident now" signal.
Dull ache maps to an "ongoing issue" signal that triggers protective behaviour, what I call the limp.

This is useful because it creates a shared language between people and systems. Humans already understand what sharp pain and dull ache mean. We do not have to train everyone to speak in severity codes to get the message across.

Sharp pain, how to route incidents to an incident management agentic

Sharp pain in agentics means something is broken, not working, unsafe, or outside bounds.

The key thing is not detection. The key thing is what happens next. In a good team, sharp pain triggers incident management. In an agentic system, sharp pain should trigger an incident management agentic, a dedicated capability that can triage, contain, communicate, and coordinate.

Source: Andrew Stribblehill, Google SRE Book
"Effective incident management is key to limiting the disruption caused by an incident and restoring normal business operations as quickly as possible."

So the sharp pain pattern looks like this:

Detect failure or boundary breach
Emit a sharp pain signal with enough context
Route to the incident management agentic
Contain impact (stop the workflow, isolate a dependency, degrade safely)
Communicate to humans in plain language
Record what happened so we can learn

You are not trying to replace humans. You are trying to ensure the system reacts predictably and quickly, and keeps people in the loop with words they can act on.
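The contain, communicate, and record steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `SharpPain`, `IncidentLog`, and `handle_sharp_pain` are hypothetical names, and the containment step only describes the action rather than performing it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SharpPain:
    """Hypothetical sharp pain signal; field names are illustrative."""
    area: str
    summary: str
    context: dict = field(default_factory=dict)


@dataclass
class IncidentLog:
    """Stand-in for wherever incident records are kept (assumption)."""
    entries: list = field(default_factory=list)


def handle_sharp_pain(signal: SharpPain, log: IncidentLog) -> str:
    """Contain, communicate, and record, in that order."""
    # Contain: this sketch only names the action. A real system would
    # stop the workflow, isolate the dependency, or degrade safely here.
    containment = f"paused workflows touching {signal.area}"

    # Communicate: plain language a person can act on.
    message = (
        f"Sharp pain in {signal.area}: {signal.summary}. "
        f"I have {containment}. Please investigate."
    )

    # Record: keep enough detail to learn from later.
    log.entries.append({
        "at": datetime.now(timezone.utc).isoformat(),
        "area": signal.area,
        "summary": signal.summary,
        "containment": containment,
        "context": signal.context,
    })
    return message


if __name__ == "__main__":
    log = IncidentLog()
    pain = SharpPain("billing-export", "upstream API failed three times in a row")
    print(handle_sharp_pain(pain, log))
```

The ordering is the design choice worth noticing: containment comes before communication, so the message to the human can say what has already been done.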

Dull ache, persistent risk and the operational limp

Then there is the dull ache. This is the stuff that does not page you at 3am, but slowly drains time, trust, and attention.

In business systems it looks like:

A workflow that works most of the time but fails in edge cases
A dependency that is flaky, not dead
A routine that needs manual intervention every week
A subsystem that the team already distrusts

In the body, a dull ache gives you a limp. You still move, but you protect the area. In an agentic system, the limp is deliberate protective behaviour that stays in place while the ache persists.

Examples of limp behaviours in an agentic system (design hypotheses, not fixed rules):

Reduce automation on a risky path
Add extra verification steps
Slow down the rate of change around the subsystem
Increase sampling and monitoring signals
Route approvals to humans more often
Avoid the shaky route until confidence improves
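One way to make the limp concrete is to treat it as a temporary, more protective operating policy for a subsystem. The sketch below assumes a hypothetical `Policy` object; the field names and the specific multipliers are made up for illustration and would need tuning in any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical operating policy for one subsystem."""
    auto_approve: bool = True
    verification_steps: int = 1
    sample_rate: float = 0.05       # fraction of runs inspected
    max_changes_per_day: int = 10


def apply_limp(normal: Policy) -> Policy:
    """Return a more protective policy. The numbers are illustrative."""
    return Policy(
        auto_approve=False,                                 # route approvals to humans
        verification_steps=normal.verification_steps + 1,   # add an extra check
        sample_rate=min(1.0, normal.sample_rate * 4),       # watch more closely
        max_changes_per_day=max(1, normal.max_changes_per_day // 2),  # slow change
    )


def current_policy(normal: Policy, ache_active: bool) -> Policy:
    """The limp applies only while the ache is active."""
    return apply_limp(normal) if ache_active else normal


if __name__ == "__main__":
    print(current_policy(Policy(), ache_active=True))
    print(current_policy(Policy(), ache_active=False))
```

Deriving the limp from the normal policy, rather than storing it separately, means the system returns to normal behaviour as soon as the ache clears.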

This is where observability becomes the nervous system. You want signals that let you interrogate what is happening from the outside.

Source: OpenTelemetry Authors, OpenTelemetry
"Observability lets you understand a system from the outside by letting you ask questions about that system without knowing its inner workings."

The pattern, a pain signal as an internal event with routing metadata

Here is the actionable bit. If you want to teach your agentic system to feel pain, you need two layers:

1) Human language so people can understand quickly
2) Machine-readable metadata so agents can route and respond consistently

I think the simplest operational definition is:

A pain signal is an internal event emitted by agents, described in human terms, backed by structured routing metadata.

Pain signal vocabulary (human layer)

Keep the surface language human and consistent. For example:

Sharp pain: urgent, stop and investigate
Dull ache: persistent, protect and improve
Optional descriptors for operators: annoying, worrying, horrible, surprising

The point is not theatrics. The point is shared context at speed.

Routing metadata (machine layer)

Under the hood, attach fields that let other agents act without guessing:

pain_type: sharp, dull
area: subsystem, workflow, capability
persistence: new, recurring, continuous
confidence: low, medium, high (or a numeric score if you already use one)
ownership: which agent or team owns response
suggested_next_action: stop, investigate, reduce load, add checks, request human confirmation
blast_radius_hint: what might be impacted (if known)

You can store this as state, publish it as an event, or both. The important part is that it is consistent and routable.
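As a rough sketch, the fields above could be captured in a small Python dataclass. The field names come from the list; the types, the enum values as code, and the `to_json` helper are assumptions for illustration.

```python
import json
from dataclasses import dataclass, asdict
from enum import Enum
from typing import Optional


class PainType(Enum):
    SHARP = "sharp"
    DULL = "dull"


class Persistence(Enum):
    NEW = "new"
    RECURRING = "recurring"
    CONTINUOUS = "continuous"


@dataclass
class PainSignal:
    pain_type: PainType
    area: str                        # subsystem, workflow, or capability
    persistence: Persistence
    confidence: str                  # "low", "medium", or "high"
    ownership: str                   # agent or team that owns the response
    suggested_next_action: str
    blast_radius_hint: Optional[str] = None   # left empty when not known

    def to_json(self) -> str:
        """Serialise for storing as state or publishing as an event."""
        raw = asdict(self)
        # Enums are not JSON serialisable, so swap them for their values.
        plain = {k: (v.value if isinstance(v, Enum) else v) for k, v in raw.items()}
        return json.dumps(plain, sort_keys=True)


if __name__ == "__main__":
    signal = PainSignal(
        pain_type=PainType.DULL,
        area="invoice-sync",
        persistence=Persistence.RECURRING,
        confidence="medium",
        ownership="payments-team",
        suggested_next_action="add checks",
    )
    print(signal.to_json())
```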

A checklist you can run this week

This is the small experiment I would run with a team. No big rewrite required.

1) Define five pain signals

Pick five common failure modes or risks, for example:

External dependency timeout
Data validation failure
Suspicious authentication pattern
Repeated retries above threshold
Known flaky integration

2) Decide sharp vs dull for each one

Be strict:

If it needs immediate containment, it is sharp
If it needs a limp and a ticket, it is dull

3) Write the human language message

Make it sound like a teammate, not a log line. Include:

What happened
Where it happened
What the system has already done
What it means in plain language
What you need from the human, if anything
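A small template function is one way to keep those messages consistent across agents. This is a sketch only: `render_message` and its parameter names are hypothetical, and the wording is an example of the teammate tone rather than a required format.

```python
def render_message(what: str, where: str, already_done: str,
                   meaning: str, ask: str = "") -> str:
    """Build a plain-language pain message from the parts listed above."""
    parts = [
        f"{what} in {where}.",
        f"I have already {already_done}.",
        f"In plain terms: {meaning}.",
    ]
    # Only ask for something when there is something to ask for.
    if ask:
        parts.append(f"What I need from you: {ask}.")
    else:
        parts.append("No action needed from you right now.")
    return " ".join(parts)


if __name__ == "__main__":
    print(render_message(
        what="Payment export timed out",
        where="the nightly billing workflow",
        already_done="paused the export and kept the queue",
        meaning="invoices will be late but nothing is lost",
        ask="confirm whether to retry now",
    ))
```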

4) Attach the routing metadata

Use a consistent schema across agents. If you do this, you will reduce arguments later.

5) Route to the right responder

Sharp pain routes to the incident management agentic (and your on-call workflow)
Dull ache routes to an improvement agentic, and triggers the limp behaviour
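The routing rule above amounts to a lookup from pain type to responder. Here is a minimal sketch, assuming signals arrive as plain dictionaries with `pain_type` and `area` keys; the two responder functions are placeholders that only describe what a real responder would do.

```python
from typing import Callable, Dict, List

Signal = Dict[str, str]


def incident_responder(signal: Signal) -> List[str]:
    """Placeholder for the incident management agentic."""
    return [f"incident opened for {signal['area']}", "on-call notified"]


def improvement_responder(signal: Signal) -> List[str]:
    """Placeholder for the improvement agentic."""
    return [f"ticket raised for {signal['area']}", "limp enabled"]


ROUTES: Dict[str, Callable[[Signal], List[str]]] = {
    "sharp": incident_responder,
    "dull": improvement_responder,
}


def route(signal: Signal) -> List[str]:
    """Send a signal to its responder and return the actions taken."""
    responder = ROUTES.get(signal.get("pain_type", ""))
    if responder is None:
        # Assumption: an unroutable signal should fail loudly, not vanish.
        raise ValueError(f"unknown pain_type: {signal.get('pain_type')!r}")
    return responder(signal)


if __name__ == "__main__":
    print(route({"pain_type": "sharp", "area": "auth"}))
    print(route({"pain_type": "dull", "area": "crm-sync"}))
```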

6) Test the limp

Pick one dull ache and deliberately enforce protection for a week. Then evaluate:

Did it reduce failures?
Did it increase cost or latency?
Did it restore trust, or expose deeper issues?
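The first two questions can be answered with simple before and after arithmetic. The sketch below assumes you can count runs, failures, and total latency for a normal week and for the limp week; `WeekStats` and `compare` are hypothetical names, and the figures in the usage example are invented.

```python
from dataclasses import dataclass


@dataclass
class WeekStats:
    """One week of counts for a single workflow (illustrative fields)."""
    runs: int
    failures: int
    total_latency_s: float

    @property
    def failure_rate(self) -> float:
        return self.failures / self.runs if self.runs else 0.0

    @property
    def mean_latency_s(self) -> float:
        return self.total_latency_s / self.runs if self.runs else 0.0


def compare(before: WeekStats, during_limp: WeekStats) -> dict:
    """Negative failure change is good; positive latency change is the cost."""
    return {
        "failure_rate_change": during_limp.failure_rate - before.failure_rate,
        "mean_latency_change_s": during_limp.mean_latency_s - before.mean_latency_s,
    }


if __name__ == "__main__":
    # Invented numbers, purely to show the shape of the comparison.
    before = WeekStats(runs=200, failures=10, total_latency_s=400.0)
    limp_week = WeekStats(runs=200, failures=4, total_latency_s=500.0)
    print(compare(before, limp_week))
```

The third question, whether trust was restored, does not reduce to a number in this sketch. That one is better answered by asking the team.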

7) Review and learn

After a sharp pain incident, decide what becomes a dull ache, and what gets removed entirely. After a dull ache fix, decide what trust is restored and what monitoring stays.

What I am still exploring (open questions, not claims)

A few things I want to go deeper on next:

What established research exists on pain-like signalling in autonomous systems and resilience engineering (I suspect it is out there, I have not collated it yet).
How to map dull ache cleanly onto existing service management habits, for example, SLOs (service level objectives) and operational debt, without renaming everything and confusing the team.
Whether there is a clean open specification we should align to for event shaped pain signals between agents.

If we do use eventing for this, having a shared way to describe event data matters.

Source: CloudEvents Authors, CloudEvents
"CloudEvents is a specification for describing event data in a common way."
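To give a feel for what that could look like, here is a sketch that wraps a pain signal in a CloudEvents-style envelope using a plain dictionary, with no SDK. The attribute names `specversion`, `id`, `source`, and `type` are the required context attributes in CloudEvents 1.0; the `com.example.pain.*` type naming, the source path, and the payload shape are assumptions made for illustration, not an agreed convention.

```python
import json
import uuid
from datetime import datetime, timezone


def to_cloudevent(pain: dict, source: str) -> dict:
    """Wrap a pain signal in a CloudEvents 1.0 style envelope."""
    return {
        # Required context attributes.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": f"com.example.pain.{pain['pain_type']}",  # hypothetical naming
        # Optional context attributes.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        # The routing metadata travels as the event payload.
        "data": pain,
    }


if __name__ == "__main__":
    event = to_cloudevent(
        {"pain_type": "dull", "area": "crm-sync", "persistence": "recurring"},
        source="/agents/crm-sync",  # hypothetical source identifier
    )
    print(json.dumps(event, indent=2))
```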

Where I have landed (for now)

Pain is a warning system and a coordination system. When we map it into agentic systems, it becomes a practical design pattern:

Sharp pain is an incident level signal, stop and investigate, route to incident management.
Dull ache is persistent risk, keep working but protect the area, adopt a limp until the underlying issue is fixed.
Human language on top, with structured metadata underneath, keeps the system relatable without losing operational clarity.

If you try this, start small. Define five signals, route them, test the incident agentic, test the limp. Then tell me what broke, what surprised you, and what words your team naturally used. That is where the good patterns come from.

#Educate #Agentics #Observability #IncidentManagement #SRE #HumanCentredDesign #AIOps #SystemsThinking