Draft-only AI agents: how we build agents that can read but never send

The email-triage agent we shipped on Sunday can read every thread in our inbox and can’t send a single message. That’s by design. This isn’t a feature we forgot to wire up. The agent never had the ability in the first place, and we enforced that at a layer the agent can’t argue with.

Most “guardrails” in AI agent products live in the prompt. The system message says “you are an inbox assistant; do not send emails without approval.” That sentence is asking the model to police itself. It’s a request, not a fence. Bad model day, novel input, a clever prompt injection in a thread the agent is reading, and the request can fail. When the agent has the tool to send, the question isn’t whether it will misfire. It’s when.

So we don’t put the fence in the prompt. We put it one layer down, at the tools the agent is allowed to call at all.

Where the safe boundary actually lives

An agent can only do what it has tools for. That’s not a clever observation; it’s mechanical. The agent loop picks an action, the runtime executes the action, and the action is one of a finite set the system has registered as callable. If a tool isn’t in the set, the agent can’t invoke it. The model can want to. The prompt can plead. The runtime won’t dispatch what doesn’t exist in its allowlist.

This is the layer where the fence belongs. Decide which tools the agent gets at registration time. Make sure the destructive or outbound ones aren’t in the list. The agent now operates in a sandbox enforced by the platform, not by the model’s good behavior.
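To make the mechanism concrete, here is a minimal sketch of a runtime that dispatches only registered tools. It is an illustration under assumptions, not our production runtime: ToolRuntime, ToolNotAllowed, build_triage_runtime, and the stub tool bodies are all hypothetical names invented for this example.

```python
from typing import Any, Callable, Dict


class ToolNotAllowed(Exception):
    """Raised when the model asks for a tool that was never registered."""


class ToolRuntime:
    """Hypothetical agent runtime: it can only dispatch what it has registered."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def dispatch(self, name: str, **kwargs: Any) -> Any:
        # The fence lives here, not in the prompt: an unregistered tool
        # cannot be called no matter what the model outputs.
        if name not in self._tools:
            raise ToolNotAllowed(name)
        return self._tools[name](**kwargs)


def build_triage_runtime() -> ToolRuntime:
    """Register read and draft tools only (stub bodies for illustration)."""
    runtime = ToolRuntime()
    runtime.register("get_thread", lambda thread_id: {"id": thread_id, "messages": []})
    runtime.register(
        "create_draft",
        lambda thread_id, body: {"draft_for": thread_id, "body": body},
    )
    # send_message is deliberately never registered.
    return runtime
```

With this shape, a prompt-injected request for send_message fails at dispatch with ToolNotAllowed, regardless of how persuasive the injected text was.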

The implication: an agent that can’t send isn’t an agent that has been told not to send. It’s an agent whose vocabulary doesn’t include “send.” The distinction matters because a missing capability is unconditional, while an instruction degrades under pressure.

What this looks like in our build

Our email-triage agent runs on the Gmail MCP. The MCP exposes a set of tools: things like search_threads, get_thread, list_labels, create_draft, send_message, modify_labels, archive_thread. When we registered the agent as a scheduled task, we approved exactly four of those: search_threads, get_thread, list_labels, create_draft.

We did not approve send_message. We did not approve any label or archive mutation. The agent is read-only on the inbox state, and write-only into the Drafts folder, which is a separate, gated mailbox surface the human reviews and ships from.
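A sketch of how that approval set could be written down and sanity-checked. The tool names are the ones listed in this post; the real MCP’s tool list may differ. The OUTBOUND_OR_MUTATING classification and the validate_approvals helper are hypothetical, added here only to show the check.

```python
from typing import FrozenSet

# Tools the MCP exposes, as named in this post (illustrative, may be incomplete).
EXPOSED_TOOLS: FrozenSet[str] = frozenset({
    "search_threads", "get_thread", "list_labels", "create_draft",
    "send_message", "modify_labels", "archive_thread",
})

# The four tools approved at task creation.
APPROVED_TOOLS: FrozenSet[str] = frozenset({
    "search_threads", "get_thread", "list_labels", "create_draft",
})

# Assumption: our own classification of tools that send or mutate inbox state.
OUTBOUND_OR_MUTATING: FrozenSet[str] = frozenset({
    "send_message", "modify_labels", "archive_thread",
})


def validate_approvals(
    exposed: FrozenSet[str],
    approved: FrozenSet[str],
    forbidden: FrozenSet[str],
) -> FrozenSet[str]:
    """Return the withheld tools, or raise if the approval set is unsafe."""
    unknown = approved - exposed
    if unknown:
        raise ValueError(f"approved tools not exposed by the server: {sorted(unknown)}")
    leaked = approved & forbidden
    if leaked:
        raise ValueError(f"outbound or mutating tools approved: {sorted(leaked)}")
    return exposed - approved


WITHHELD_TOOLS = validate_approvals(EXPOSED_TOOLS, APPROVED_TOOLS, OUTBOUND_OR_MUTATING)
```

Running a check like this in CI or at task creation would turn an accidental approval of send_message into a hard failure instead of a silent capability grant.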

The pre-approval happens once, at task creation, in a regular session. We click “Run now” so that the platform prompts “approve this tool?” the first time each tool is needed. We approve the three read tools and create_draft. The send permission is never requested, so it is never granted, and the agent never sees that capability.

If we changed our minds tomorrow and decided we wanted the agent to send under some narrow condition, we’d have to go through approval again. That friction is the feature. Approving an outbound capability is a deliberate act. It should feel deliberate.

Why we extended this beyond email

This isn’t an email pattern. It’s an operating posture, and we apply it to every agent we run.

Our SOW pipeline drafts contracts when a prospect’s deal_stage flips to won. The drafter generates the populated SOW, renders the PDF, and creates a DocuSign envelope. The envelope is created with status created and is never sent by the agent. The actual click that puts the SOW in front of the client is a human one.

Our closeout-runbook drafter assembles the post-engagement package when a contract end date approaches. It writes a runbook skeleton, drafts a closeout note, and saves both to local files. It does not send the email. It does not push the runbook to a client repo. Those steps are review-gated.

Our content-engine drafter (the one writing this post) produces blog and LinkedIn drafts under a shared theme each weekday morning. The drafts land at status: review. The downstream distribution automation will only publish when the status flips to scheduled and a future publish_date is on the draft. That status flip is a human approval. We’ve codified the constraint as a hard rule in our operating doctrine: drafts only, never auto-sent, with one narrow carve-out for public broadcast content that’s been explicitly pre-approved and scheduled.
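The publish gate described above can be sketched as a single predicate. This is an illustration, not our distribution code: the Draft dataclass and is_eligible_for_distribution are hypothetical names, and “future” is read here as strictly after today, which is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Draft:
    title: str
    status: str  # "review" when the agent writes it; a human flips it to "scheduled"
    publish_date: Optional[date] = None


def is_eligible_for_distribution(draft: Draft, today: date) -> bool:
    """True only when a human has scheduled the draft for a future date."""
    return (
        draft.status == "scheduled"
        and draft.publish_date is not None
        and draft.publish_date > today  # assumption: "future" means strictly after today
    )
```

The agent only ever writes status "review", so nothing it produces can satisfy this check until a human changes the status.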

Across all of these, the pattern is identical. The agent does the work that takes humans time and judgment. The agent does not perform the act that travels outside our machine. The boundary between “drafted” and “sent” is the boundary between agent and operator.

What you actually get from draft-only

The first thing to be honest about: draft-only agents are slower than fully autonomous ones. There is a review step. The operator still has to look at the draft and click the button. A vendor selling “fully automated inbox” can promise zero minutes of operator time. We can’t.

We can, however, promise an inbox that gets through most of the read-and-respond cycle without you, with the last slice being the part where judgment lives. A first-pass triage, classification, and draft, written in your voice, sitting in your Drafts folder by 7am, turns the morning email scan from “open every thread cold” into “review and send.” That’s a real shift. It just isn’t the marketing-copy version where the agent does the whole job.

The second thing to be honest about: this only works if the drafts are good enough to mostly ship. If the agent writes drafts you rewrite from scratch every time, you’ve added work, not removed it. The voice calibration, the templates the agent drafts against, the per-recipient context the agent looks up, all of that has to be tight enough that the operator’s role is review, not rewriting.

When those two pieces are in place, draft-only agents are some of the best AI software we ship. They get the operator out of the unloved bottom of the work, the initial triage, the first-pass draft, the finding-the-right-context step, while keeping the judgment work where it belongs.

When draft-only is not the right pattern

There’s work where draft-only doesn’t fit. Internal state-machine moves are fine to automate without review: if a Stripe payment arrives, flip the deal stage. If a calendar event ends, append a “delivered” marker. These are bookkeeping inside our own systems. They don’t travel outside the machine.

Mutations that exit the machine but only touch the agent’s own scratch space are fine too: writing a digest log, updating an internal dashboard, posting to an audit channel only the team can see. The cost of a misfire is low.

The line we hold is at “would a wrong move from this agent embarrass us with a real person or a real counterparty?” Anything past that line is draft-only by default. If we want to relax the rule for a specific case (a noisy internal notification we’d never sweat misfiring), we relax it deliberately and we write down why.
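One way to encode that default, as a rough sketch: every action that reaches a real person is draft-only unless there is a written exception. Mode, Action, execution_mode, and the example action names are hypothetical, chosen only to illustrate the rule.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Mode(Enum):
    AUTO = "auto"
    DRAFT_ONLY = "draft_only"


@dataclass(frozen=True)
class Action:
    name: str
    reaches_real_person: bool  # would a misfire be seen by a person or counterparty?


def execution_mode(action: Action, exceptions: Dict[str, str]) -> Mode:
    """Draft-only by default past the line; relaxed only with a written reason."""
    if not action.reaches_real_person:
        return Mode.AUTO  # bookkeeping inside our own systems
    reason = exceptions.get(action.name, "").strip()
    # An exception counts only if someone wrote down why.
    return Mode.AUTO if reason else Mode.DRAFT_ONLY


# Hypothetical exception registry: action name -> documented reason.
EXCEPTIONS: Dict[str, str] = {
    "post_internal_digest_ping": "Noisy team-only notification; a misfire costs nothing.",
}
```

Keeping the reason as a required string makes the relaxation a reviewable artifact: an empty or missing reason falls back to draft-only.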

The take

The right default for an AI agent that touches anything outside the local machine is draft-only. Not because every agent is dangerous, but because the cost of one wrong send dwarfs the cost of a click. We’d rather review thirty drafts a week than send one bad email.

When you’re buying an AI agent, the diagnostic is direct. Ask the vendor what tools the agent has access to in production. Ask whether the agent technically can send, post, charge, or fire without a human. If the answer is yes, ask what stops it. If the answer is “the prompt tells it not to,” walk away.

If you’re building one, draw the same line. Decide at the tool layer what the agent is allowed to reach. Make the destructive things unreachable. The agent gets dumber along one axis and dramatically safer along another. That’s the trade we’ll take every time.
