
Shepherds of Agentic Sheep

Tony Wood

The Leadership Playbook for Scaling AI Without Losing Control

I keep seeing the same pattern.

A leader starts with one helpful AI agent doing a narrow task.

Then a few more appear.

Then suddenly there is a small digital workforce producing drafts, decisions, customer messages, refunds, tickets, summaries, follow-ups, and process changes.

That is when the real question arrives:

Who is actually responsible when an agent gets it wrong?

This post is about that moment.



The metaphor, and why it matters

Managing agentic systems looks a lot like managing people.

  • You set expectations.

  • You give boundaries.

  • You review work.

  • You coach.

  • You decide what happens when things go well and what happens when they do not.

At the beginning, you can keep a close eye on everything.

That works with one agent.
It even works with three.

But the moment you want agents across sales, finance, HR, service, and operations, you hit the same ceiling you always hit with humans.

Your attention does not scale.
Your focus does not scale.

So you need layers.

You need shepherds.


What “agentic sheep” means in plain English

When I say agentic sheep, I mean AI agents that can:

  • Take a goal
  • Plan steps
  • Use tools such as email, CRM, ticketing, spreadsheets, or finance systems
  • Execute actions
  • Report what they did

This is powerful.

It is also operationally dangerous if nobody is watching the right things.

Not watching everything.

Watching the right things.


The leadership bottleneck is focus, not technology

Most agent programmes fail in a very boring way.

Not because the model is bad.

Because leaders become the human router for every edge case.

They become the escalation layer for vague instructions.
They become the approval gate for every risky output.

The workflow problem turns into a leadership bandwidth problem.

Modern work is already full of interruptions.
If your operating model assumes leaders can simply “keep an eye on it”, you are already in trouble.

This is why scaling agentic systems is an operating model challenge first, not a tooling challenge.


Span of control comes back, but louder

Span of control breaks when one manager has too many direct reports to lead well.

Agents recreate the same dynamic, only faster.

Agents produce more work, more quickly, and often with more confidence than is warranted. That creates more output, more exceptions, and more review pressure.

Flattening the organisation without shepherding means leaders drown in escalations.

The answer is not more dashboards.

The answer is a structured oversight layer.

This is where shepherds come in.


What a shepherd actually is

A shepherd is a supervising layer that:

  • Reviews what a set of agents did
  • Checks outputs against agreed quality and risk rules
  • Escalates only what matters
  • Feeds learning back so the system improves over time

A shepherd can be:

  • A human with a structured review cadence
  • An AI agent designed to supervise other agents
  • A hybrid of the two, which is where most organisations will land

This is not about replacing leaders.

It is about protecting them.
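
To make the shepherd's job concrete, here is a minimal sketch of that review loop in Python. It is an illustration under assumptions, not a reference implementation: the names `AgentOutput`, `Shepherd`, and `large_refund`, the `amount` field, and the threshold of 100 are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class AgentOutput:
    agent: str           # which agent produced the work
    summary: str         # what the agent reports it did
    amount: float = 0.0  # money involved, if any (hypothetical field)


# A rule returns a reason to escalate, or None if the output is fine.
Rule = Callable[[AgentOutput], Optional[str]]


@dataclass
class Shepherd:
    rules: list[Rule]
    lessons: list[str] = field(default_factory=list)  # feedback for later tuning

    def review(self, outputs: list[AgentOutput]):
        """Split outputs into those that pass and those a human must see."""
        passed, escalated = [], []
        for out in outputs:
            results = (rule(out) for rule in self.rules)
            reasons = [reason for reason in results if reason is not None]
            if reasons:
                escalated.append((out, "; ".join(reasons)))
                # Record the first reason so the system can learn from it.
                self.lessons.append(f"{out.agent}: {reasons[0]}")
            else:
                passed.append(out)
        return passed, escalated


def large_refund(out: AgentOutput) -> Optional[str]:
    # Hypothetical threshold; a real one comes from your own policy.
    return "refund above 100" if out.amount > 100 else None
```

The shepherd does not re-do the work. It applies agreed rules, passes what is fine, and hands a human only the flagged cases, each with a reason attached.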


Two decisions leaders must make up front

1. Where humans should spend attention

If everything needs attention, you have built nothing.

Decide where humans add disproportionate value:

  • Risky decisions
  • Ambiguous customer situations
  • Money movement
  • Legal commitments
  • Identity and access changes
  • Sensitive data use
  • External communication under your brand

Everything else should default to agent execution with shepherded review.

2. What “bad” looks like

Most teams define success.

Very few define failure.

Agents are excellent at being confidently wrong.

So define what bad looks like. Write it down. Make it operational. Make it testable.
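
One way to make "bad" testable is to write each failure as a named check that runs against an agent's output. The sketch below assumes outputs arrive as plain dictionaries. The field names, the phrases, and the `REFUND_LIMIT` value are hypothetical placeholders for your own policy.

```python
REFUND_LIMIT = 100  # hypothetical threshold, set by your own policy

# Each entry names one kind of "bad" and gives a check that detects it.
FAILURE_CHECKS = {
    "missing_source": lambda o: not o.get("sources"),
    "over_refund_limit": lambda o: o.get("refund_amount", 0) > REFUND_LIMIT,
    "contains_commitment": lambda o: any(
        phrase in o.get("text", "").lower()
        for phrase in ("we guarantee", "we promise")
    ),
}


def failures(output: dict) -> list[str]:
    """Return the names of every failure definition this output triggers."""
    return [name for name, check in FAILURE_CHECKS.items() if check(output)]
```

An empty list means the output triggered none of the written-down definitions. It does not mean the output is good, only that it is not bad in any way you have defined so far.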


Governance is not optional

This line should be printed and pinned wherever your agent programme is being designed:

“The organizations making the most progress are treating AI agents as part of the workforce. They define roles, boundaries, escalation paths, and consequences. They invest as much in governance and monitoring as they do in model capability.”
Natarajan Elayappan

That is the point of shepherding.

Without governance you do not get scale.

You get chaos at speed.


Use RACI because it forces clarity

RACI is not fashionable. It is effective.

  • Responsible: who does the work
  • Accountable: who owns the outcome
  • Consulted: who must be asked
  • Informed: who must be told

If you do not define this for an agentic workflow, your organisation will invent it under pressure.

That is when mistakes happen.

Example: agentic refund workflow

  • Responsible: Refund Agent
  • Accountable: Head of Customer Operations
  • Consulted: Finance Controller
  • Informed: Support Team Lead

Add the shepherd:

  • Responsible: Refund Agent (does)
  • Responsible: Shepherd Agent (checks)
  • Accountable: Head of Customer Operations

If your shepherd cannot explain the RACI, it is not a shepherd.

It is a second agent guessing.
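
If the RACI lives as data rather than in a slide, both humans and shepherd agents can read it and check it. Here is a minimal sketch using the refund example. The `Raci` class and its checks are hypothetical, and the rule that the Accountable owner must be a human role is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Raci:
    workflow: str
    responsible: tuple[str, ...]
    accountable: str
    consulted: tuple[str, ...] = ()
    informed: tuple[str, ...] = ()
    agents: frozenset[str] = frozenset()  # which of the named roles are AI agents

    def problems(self) -> list[str]:
        """List gaps in the RACI; an empty list means it passes these checks."""
        issues = []
        if not self.responsible:
            issues.append("No Responsible party")
        if not self.accountable.strip():
            issues.append("No Accountable owner")
        # Assumption of this sketch: accountability stays with a human role.
        if self.accountable in self.agents:
            issues.append("Accountable owner must be a human role, not an agent")
        return issues


refund_raci = Raci(
    workflow="Agentic refund workflow",
    responsible=("Refund Agent", "Shepherd Agent"),
    accountable="Head of Customer Operations",
    consulted=("Finance Controller",),
    informed=("Support Team Lead",),
    agents=frozenset({"Refund Agent", "Shepherd Agent"}),
)
```

Calling `refund_raci.problems()` on the example above returns an empty list. Move the Accountable role onto an agent and the check complains.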


Definition of Ready, Done, and “Good Looks Like”

This is where most teams struggle.

They tell an agent to “handle invoices” or “manage enquiries” and are surprised when results vary.

You need three definitions.

Definition of Ready

A task is ready when:

  • The goal is explicit
  • Required data is present and permitted
  • Tools are available
  • Policy constraints are clear
  • Escalation triggers are defined
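
The checklist above can be run as a gate before an agent starts work. This is a minimal sketch. The `Task` fields are hypothetical, and each one stands in for however your own systems record that condition.

```python
from dataclasses import dataclass


@dataclass
class Task:
    goal: str = ""
    data_present: bool = False
    data_permitted: bool = False
    tools_available: bool = False
    policy_constraints: tuple[str, ...] = ()
    escalation_triggers: tuple[str, ...] = ()


def missing_for_ready(task: Task) -> list[str]:
    """Return the Definition of Ready items the task does not yet meet."""
    checks = [
        ("goal is explicit", bool(task.goal.strip())),
        ("required data is present and permitted",
         task.data_present and task.data_permitted),
        ("tools are available", task.tools_available),
        ("policy constraints are clear", bool(task.policy_constraints)),
        ("escalation triggers are defined", bool(task.escalation_triggers)),
    ]
    return [name for name, ok in checks if not ok]
```

A task starts only when `missing_for_ready` returns an empty list. Anything else goes back to the person who raised it, with the gaps named.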

Definition of Done

A task is done when:

  • Outputs are complete
  • Sources are recorded
  • Actions are logged
  • The right people are informed
  • A rollback path exists

Good looks like

Good looks like:

  • Correct and policy-compliant
  • Appropriate for the audience
  • Minimal risk exposure
  • Measurable improvement

This is not paperwork.

This is how scale becomes safe.


Guardrails let you move faster without hurting people

Unchecked autonomy creates incidents.
Incidents create shutdowns.

Guardrails let autonomy exist without damage.

“The real breakthrough lies in finding a balance between harnessing this power and implementing robust safety measures and governance.”
Merve Ayyüce KIZRAK

Guardrails are not rules for the model.

They are rules for the organisation.

They define:

  • Tool access
  • Data visibility
  • Human approval thresholds
  • Logging requirements
  • Review expectations
  • Stop conditions
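
Three of these guardrails, tool access, human approval thresholds, and stop conditions, can be sketched as a single check that runs before an agent acts. The names `Guardrails` and `Decision`, the tool names, and the threshold are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human"
    BLOCK = "block"


@dataclass(frozen=True)
class Guardrails:
    allowed_tools: frozenset[str]  # tool access
    approval_threshold: float      # amounts above this need a human
    stopped: bool = False          # stop condition: a simple kill switch

    def check(self, tool: str, amount: float = 0.0) -> Decision:
        """Decide whether an action may proceed, before it is executed."""
        if self.stopped or tool not in self.allowed_tools:
            return Decision.BLOCK
        if amount > self.approval_threshold:
            return Decision.NEEDS_HUMAN
        return Decision.ALLOW


# Hypothetical guardrails for a refund agent.
refund_guardrails = Guardrails(
    allowed_tools=frozenset({"ticketing", "refunds"}),
    approval_threshold=100.0,
)
```

The point of the design is that the organisation owns this object, not the model. The agent can ask for anything. The guardrails decide what actually happens.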

The scaling pattern leaders can actually run

Stage 1: One agent, one human shepherd

  • Heavy human review
  • Fast learning of failure modes
  • Logging and metrics established

Stage 2: Many agents, one human shepherd with a shepherd agent

  • First-pass review by AI
  • Human reviews only flagged cases
  • Escalation paths formalised

Stage 3: Many agents, many shepherds, humans on exceptions only

  • Shepherds supervise flocks
  • Humans approve policy and handle high-impact cases
  • Incidents are handled as routine, not as crises

As Ian Walker puts it:

“Human-AI teaming transforms span of control from a fixed rule into a strategic variable.”

Only if you design for it.


Twelve practical shepherding moves this quarter

Each of these is small, testable, and reversible.

  1. Name an owner for every agentic workflow
  2. Write a one-page RACI for each workflow
  3. Define explicit stop conditions
  4. Set human-in-the-loop triggers for high-impact actions
  5. Apply least-privilege access
  6. Minimise data by default
  7. Build a simple evaluation harness
  8. Log decisions, tool calls, and versions
  9. Apply spend, rate, and blast-radius limits
  10. Create a short incident playbook
  11. Run blameless postmortems
  12. Do a dignity and fairness check on outputs
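
Move 9 is one of the easiest to put into practice. Here is a minimal sketch of a spend and action cap per agent. The `Budget` class and its limits are hypothetical, and a production version would also reset counters per period and persist them.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    max_spend: float    # total money the agent may move in the period
    max_actions: int    # cap on actions, which limits the blast radius
    spent: float = 0.0
    actions: int = 0

    def try_spend(self, amount: float) -> bool:
        """Record the action if it fits both limits; otherwise refuse it."""
        if amount < 0:
            raise ValueError("amount must not be negative")
        over_actions = self.actions + 1 > self.max_actions
        over_spend = self.spent + amount > self.max_spend
        if over_actions or over_spend:
            return False  # caller should stop and escalate
        self.spent += amount
        self.actions += 1
        return True
```

A refused action is not an error. It is the signal to pause the agent and hand the case to a shepherd.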

The quiet truth

Good leadership is knowing what not to do.

When shepherding works, leaders stop being doers and become designers of decision-making.

They protect customers, colleagues, and the organisation.

That shows up as practical choices, grounded in basic decency:

  • Do no harm
  • Protect privacy
  • Be transparent
  • Treat people with dignity
  • Be accountable
  • Avoid exploitation
  • Act with restraint

These are not abstract values.

They are operating decisions.


A One-Page Shepherd Contract

Purpose
A lightweight control document for any agentic workflow.
Defines ownership, quality, escalation, and trust boundaries before scale.


Workflow Overview

Workflow Name:
Business Outcome:
Primary Risk Area: (e.g. financial, legal, reputational, operational)


Accountability (RACI)

Role: Name / Function
Responsible (R):
Accountable (A):
Consulted (C):
Informed (I):

Definition of Ready

A task may only start when all of the following are true.

  • Goal and success criteria are explicit
  • Required input data is available and permitted
  • Tools and permissions are correctly scoped
  • Policy and guardrails are known
  • Escalation conditions are defined

Definition of Done

A task is complete only when all conditions below are met.

  • Output is complete and fit for purpose
  • Sources and assumptions are recorded
  • Actions taken are logged
  • Required parties are notified
  • Rollback path is known and available

What “Good” Looks Like

  • Policy-compliant and factually correct
  • Appropriate tone and audience fit
  • Minimal risk exposure
  • Measurable improvement (time saved, errors reduced, escalations avoided)

What “Bad” Looks Like (Must Escalate)

  • Financial impact above threshold
  • Legal or contractual ambiguity
  • Sensitive data exposure
  • Customer distress or harm
  • Identity, access, or permission changes
  • Any output that sounds confident but rests on uncertain or missing evidence

Human-in-the-Loop Triggers

  • Money movement
  • External communication under brand
  • Contractual commitments
  • Identity or access changes
  • High uncertainty or low confidence signals

Logging & Audit Requirements

  • Inputs received
  • Decisions made
  • Tools used
  • Actions executed
  • Model and prompt version
  • Timestamp and actor (agent or human)
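
These fields map naturally onto one structured record per action. The sketch below writes each record as a single JSON line. The `AuditRecord` class, its field names, and the example values are hypothetical, and where the lines are stored is left to you.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditRecord:
    actor: str           # name of the agent or person
    actor_type: str      # "agent" or "human"
    inputs: dict         # inputs received
    decision: str        # decision made
    tools_used: list     # tools used
    actions: list        # actions executed
    model_version: str
    prompt_version: str
    timestamp: str       # ISO 8601, UTC


def now_utc() -> str:
    """Current time as an ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


def audit_line(record: AuditRecord) -> str:
    """Serialise one record as a single JSON line, with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


example = AuditRecord(
    actor="Refund Agent",
    actor_type="agent",
    inputs={"ticket": "T-1"},  # hypothetical ticket reference
    decision="refund approved",
    tools_used=["refunds"],
    actions=["issued refund"],
    model_version="model-v1",    # placeholder version labels
    prompt_version="prompt-v3",
    timestamp=now_utc(),
)
```

One line per action keeps the log easy to search and easy to replay when an incident review asks what the agent knew and which version of it acted.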

Rollback & Incident Plan

  • How the action can be reversed
  • Who is notified
  • What is paused or shut off
  • Where the incident is reviewed

Approval
Accountable Owner Sign-off:
Date:

 


Closing thought

If you want to scale agents, do not start by asking how many tasks they can do.

Start by asking how many decisions you can safely supervise.

Then build shepherds so you do not have to supervise everything.

You will move faster, with fewer surprises, and with more trust from your teams.

Have fun with your shepherds of agentic sheep.

 

