Customer-Evidence-Driven Coding: VoC in Specs & PRs (2026)

Trace every spec and PR back to verbatim customer signals. Learn how evidence-driven coding and ZeroShot bring voice of customer into AI coding workflows.

AI coding agents have solved the speed problem. They have not solved the correctness problem. In 2026, the question is no longer "can we build it fast?" but "are we building what the evidence says customers actually need?" This guide introduces customer-evidence-driven coding: the practice of tracing every spec and pull request back to verbatim customer signals. It also explains how BuildBetter closes the gap between where customer evidence lives (calls, tickets, surveys) and where engineers and agents actually work (specs and PRs), through its customer-feedback platform and its coding context layer, ZeroShot.

The Speed Trap: Why Faster Agents Build the Wrong Thing Faster

Speed without customer evidence simply multiplies the cost of building the wrong thing. AI coding agents — Cursor, Claude Code, Codex, Devin — have collapsed time-to-code. Controlled studies put productivity gains as high as 55% on individual tasks. But none of that addresses whether the task was the right one to begin with.

Consider a familiar scenario: a staff engineer ships a feature in two days that flawlessly implements a spec. The agent did everything right. The code is clean, tested, and merged. The only problem is that the spec itself misread what customers were asking for. Two days of fast, confident execution produced two days of wasted work — plus the rework cost, which research consistently shows rises by orders of magnitude the later a requirements defect is caught.

This isn't a fringe risk. For over a decade, product research has found that roughly two-thirds of shipped features deliver little to no measurable value. Speed amplifies that waste. When the agent works from a bad prompt, more autonomy just ships the wrong thing faster and more confidently.

The missing layer in 2026 isn't more agent autonomy. It's customer evidence flowing into the specs and PRs that agents work from.

The bottleneck has moved upstream — from execution to requirement correctness. That's the gap evidence-driven coding closes.

What Is Customer-Evidence-Driven Coding?

Customer-evidence-driven coding is the practice of connecting real customer signals — sales calls, support tickets, product feedback, churn interviews — directly to the specs, plans, and pull requests that engineers and AI agents work from. Every requirement and acceptance criterion should trace back to a verbatim customer signal, not a stakeholder's memory of one.

The status quo is what we'll call requirement laundering. A customer says something specific on a call. Customer Success summarizes it. A PM paraphrases the summary into a ticket. An engineer (or an agent) turns the ticket into a prompt. By the time code gets written, the customer's actual language — their intent, their constraints, their words — has disappeared through three layers of paraphrase.

Evidence-driven vs. data-driven

These are not the same thing. Data-driven development relies on quantitative metrics and analytics — usage counts, funnel drop-off, dashboards. Evidence-driven development preserves the qualitative, verbatim voice of the customer. It keeps the customer's own words and intent intact so that decisions are made against real demand rather than abstracted numbers.

It's a discipline first, a tool second

Senior engineers should treat the spec as a chain of custody for customer intent: if an acceptance criterion can't cite a verbatim signal, it's an assumption, not a requirement. That mindset is the discipline. But at scale, the discipline is nearly impossible to enforce by hand, because no engineer can manually search call transcripts for every pull request. That's why evidence-driven coding needs a system to surface the right signals automatically.
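
To make the "assumption, not a requirement" rule concrete, here is a minimal Python sketch. The `Signal` and `AcceptanceCriterion` shapes and the `split_by_evidence` helper are hypothetical names invented for illustration; they are not part of BuildBetter or ZeroShot, and a real setup would pull signals from wherever your team stores them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signal:
    """A verbatim customer quote and where it came from (hypothetical shape)."""
    source: str  # e.g. "call", "ticket", "survey"
    quote: str   # the customer's own words, unedited


@dataclass
class AcceptanceCriterion:
    text: str
    signals: list[Signal] = field(default_factory=list)


def split_by_evidence(criteria):
    """Return (requirements, assumptions).

    A criterion counts as a requirement only if it cites at least one
    signal with a non-empty quote; everything else is an assumption.
    """
    requirements, assumptions = [], []
    for criterion in criteria:
        cited = any(s.quote.strip() for s in criterion.signals)
        (requirements if cited else assumptions).append(criterion)
    return requirements, assumptions


criteria = [
    AcceptanceCriterion(
        "Export handles datasets over one million rows",
        [Signal("call", "We need CSV exports for more than a million rows.")],
    ),
    AcceptanceCriterion("Export also supports XML"),  # nobody asked for this
]
requirements, assumptions = split_by_evidence(criteria)
for criterion in assumptions:
    print(f"ASSUMPTION (no customer citation): {criterion.text}")
```

A check like this could run as a spec lint, but the point is the rule it encodes: an uncited criterion is flagged rather than silently accepted.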

Where Customer Evidence Lives — and Why It Never Reaches the PR

Customer evidence is abundant — it just lives nowhere near the code. Map the typical sources and the problem becomes obvious:

Call recordings — sales and success conversations full of specific feature requests and objections.
Support tickets — Zendesk, Intercom, and similar tools logging the exact problems customers hit.
Feature-request boards — explicit asks, often with vote counts and context.
NPS and survey verbatims — open-text feedback in customers' own words.
Churn interviews — the highest-signal evidence of all: why customers left.

All of this sits inside CS and PM tools. Engineers and AI agents, meanwhile, live in the IDE and the PR. There is no pipe between them. The handoff is a wiki link, a Slack message, or a one-line ticket whose original detail has already been laundered away.

Worse, most AI coding agents have zero awareness of customer context. They optimize whatever prompt they're given — including a bad one. The agent can't tell the difference between a requirement grounded in five customer calls and a requirement someone invented in a planning meeting.

This is where BuildBetter comes in. BuildBetter is the customer-feedback platform that captures and structures these signals — every call, ticket, survey, and Slack thread — and applies severity, business impact, and your taxonomy to each one. It turns the scattered, unstructured voice of the customer into structured signals. The remaining question is the bridge problem: how do those structured signals actually get into the spec and the code review?

The Workflow: Pulling Customer Evidence Into Specs and PRs

The bridge between customer evidence and code is ZeroShot — the BuildBetter CLI (bb) at tryzeroshot.com. ZeroShot is a coding context layer that pulls structured signals from BuildBetter into the exact stages where requirements get defined and verified. Here's the workflow.

Step 1 — Spec stage: `/bb-specify`

Instead of starting a spec from a vague ticket, you run /bb-specify. The spec is seeded with customer evidence pulled directly from BuildBetter signals — with citations back to the source call or ticket. Each acceptance criterion can be tied to a verbatim signal, so the spec begins grounded in real demand rather than assumption.

Step 2 — Plan stage: `/bb-plan`

/bb-plan ties implementation decisions to the evidence that justifies them. When the team faces a trade-off — scope, edge cases, performance targets — that trade-off is made against documented customer demand, not a guess about what "most users probably want."

Step 3 — PR review: `/bb-review`

/bb-review surfaces the relevant customer evidence inline during code review. Reviewers see the diff next to the customer signals it's supposed to address. They can verify, in seconds, that the implementation actually solves what customers asked for — instead of approving code against a stale ticket.

Step 4 — Traceability

The result is an auditable thread: customer language → spec → plan → PR. Every requirement carries its chain of custody. After the fact, anyone can trace a shipped feature back to the exact conversations that justified it.
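
One way to picture that chain of custody is as a set of "derived from" links that can be walked backwards. The sketch below is illustrative only: the `Link` record, the `trace_to_signals` function, and the IDs are all hypothetical and do not describe how ZeroShot stores or exposes this data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One hop in the chain: `child` was derived from `parent` (hypothetical record)."""
    child: str
    parent: str


def trace_to_signals(artifact_id, links):
    """Walk parent links from an artifact back to the items that have no parent.

    Those root items are the original customer signals.
    """
    parents = {}
    for link in links:
        parents.setdefault(link.child, []).append(link.parent)

    roots, stack, seen = [], [artifact_id], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in parents:
            stack.extend(parents[node])
        elif node != artifact_id:
            roots.append(node)
    return sorted(roots)


links = [
    Link("pr-1", "plan-1"),
    Link("plan-1", "spec-1"),
    Link("spec-1", "call-a"),
    Link("spec-1", "ticket-b"),
]
print(trace_to_signals("pr-1", links))  # ['call-a', 'ticket-b']
```

An artifact that traces back to nothing returns an empty list, which is exactly the case the audit is meant to expose.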

Before and after

Picture a ticket that reads simply: "improve export." With /bb-specify, that becomes a spec citing five customer calls requesting CSV exports of datasets over one million rows — with the verbatim quotes attached. The plan justifies streaming the export rather than buffering it, against that documented constraint. And the PR is reviewed against exactly that evidence, not against the four-word ticket.
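
For readers who want to see what "streaming rather than buffering" means for that million-row constraint, here is a small sketch using only Python's standard library. It is one plausible implementation of the trade-off, not output from /bb-plan, and the `stream_csv` name is an assumption made for this example.

```python
import csv
import io
from itertools import chain


def stream_csv(header, rows):
    """Yield CSV text one record at a time instead of building the whole file in memory."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for record in chain([header], rows):
        writer.writerow(record)
        yield buffer.getvalue()
        # Reset the scratch buffer so memory use stays flat regardless of row count.
        buffer.seek(0)
        buffer.truncate(0)


# `rows` can be any iterable, including a lazy database cursor or generator.
rows = ((i, f"customer-{i}") for i in range(3))
chunks = list(stream_csv(["id", "name"], rows))
```

Because `rows` is consumed lazily, the same function works whether the dataset has three rows or several million; a buffered version would hold the entire export in memory before sending the first byte.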

A clarification on products: ZeroShot (the BuildBetter CLI, bb, at tryzeroshot.com) is the coding context layer. BuildBetter.ai is the separate customer-feedback product whose signals ZeroShot pulls in. The two are complementary — one captures and structures the evidence, the other delivers it into your specs and PRs.

ZeroShot as the Context Layer — Not Another Agent

ZeroShot does not replace your AI coding agent — it sits underneath whatever agents your team already uses. Cursor, Claude Code, Codex, Copilot, Gemini CLI, Windsurf, Amazon Q — keep them all. ZeroShot is the memory and skills layer that makes them work together with your whole team and your customer evidence.

The right mental model is "context layer," not "another agent." Adding autonomy to an agent working from a bad prompt only ships the wrong thing faster. ZeroShot fixes the input, not the engine.

Three layers nobody else combines

Cross-agent session memory — every coding session is saved, indexed, and shareable across teammates and across agents.
Team-conventional skills — your actual playbook, encoded and portable.
Customer evidence from BuildBetter — verbatim signals pulled into specs, plans, and reviews.

Cross-teammate session resume

bb agent-sessions resume picks up any teammate's session, in any agent. A teammate working in Claude Code can hand off to someone on Cursor without re-explaining the context. As teams stop being single-agent shops, this kind of context handoff is the new collaboration friction — and ZeroShot eliminates it.

Skills encode your team's conventions

/bb-specify, /bb-plan, and /bb-review carry your actual playbook into every PR. They're built on the open AGENTS.md standard via BB-Skills, which means the discipline travels with the work instead of living in a wiki nobody reads. Encoding conventions as skills scales discipline without policing.

BB-Skills is open source (github.com/buildbetter-app/BB-Skills) and privacy-first — no data leaves your repo without consent, and there's no vendor lock-in. ZeroShot is already used by Brex, Rappi, PostHog, AppFolio, Clay, Lufthansa, Procore, and Macmillan.

How the Customer-Evidence Layer Compares

ZeroShot is the only layer that combines cross-agent memory, encoded team skills, and customer evidence. The table below is deliberately fair: code-generation agents excel at writing code, and several context tools provide memory — but none integrate a customer-evidence layer.

Tool	Primary role	Cross-agent support	Team-shared session memory	Encoded team conventions/skills	Customer-evidence integration
ZeroShot (BuildBetter CLI)	Context layer	Yes	Yes	Yes (BB-Skills, AGENTS.md)	Yes — verbatim signals from BuildBetter
Cursor	Code generation	No	No	Limited (rules files)	No
Claude Code	Code generation	No	No	Limited (CLAUDE.md)	No
Devin	Autonomous agent	No	No	Limited	No
Cody	Code assistant + context	No	No	No	No
Augment	Context/memory	Partial	Partial	No	No
ContextPool	Context/memory	Partial	Partial	No	No
Graphiti	Memory graph	Partial	Partial	No	No

The honest takeaway: agents like Cursor, Claude Code, and Devin are excellent at generating code. Context tools like ContextPool, Graphiti, Recallium, and Augment provide memory and context. But the customer-evidence column is unique to ZeroShot — it's the only layer that brings real signals from BuildBetter into the coding workflow. And it's complementary: you keep your agent and add the evidence, memory, and skills layer on top.

Adopting Evidence-Driven Coding on Your Team

Start with one high-stakes feature where the cost of building the wrong thing is highest. Billing, data export, and onboarding are ideal first candidates — areas where a misread requirement is expensive to revert and visible to customers.

A practical rollout

Wire BuildBetter signals into your spec template. Use /bb-specify so every new spec starts seeded with cited customer evidence. Require at least one customer citation per acceptance criterion — if it can't cite a signal, it's an assumption.
Add /bb-review to the PR checklist. Make it standard that reviewers see the relevant customer evidence next to the diff before approving.
Encode your conventions as BB-Skills. Capture your team's real spec and review playbook so the discipline scales across every PR without anyone playing policeman.
Use cross-agent memory. Adopt bb agent-sessions resume so context survives handoffs between teammates and agents.

What to measure

Track two things. First, the share of features shipped with versus without evidence linkage — this tells you whether the discipline is actually taking hold. Second, monitor rework and revert rates. Rework is a leading indicator of poor feature-level fit; when requirements don't trace to real demand, revert rates climb. Falling rework on evidence-linked features is the clearest signal that customer-evidence-driven coding is working.
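
Both measures are simple ratios. As a rough sketch, assuming you can tag each shipped feature with two booleans (the `ShippedFeature` record and its field names are hypothetical), they could be computed like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShippedFeature:
    name: str
    evidence_linked: bool  # spec cites at least one customer signal
    reworked: bool         # reverted or substantially reworked after shipping


def evidence_linkage_share(features):
    """Fraction of shipped features whose spec is linked to customer evidence."""
    if not features:
        return 0.0
    return sum(f.evidence_linked for f in features) / len(features)


def rework_rate(features, evidence_linked):
    """Rework rate within the linked or unlinked group; None if that group is empty."""
    group = [f for f in features if f.evidence_linked == evidence_linked]
    if not group:
        return None
    return sum(f.reworked for f in group) / len(group)


features = [
    ShippedFeature("csv-export", evidence_linked=True, reworked=False),
    ShippedFeature("billing-retry", evidence_linked=True, reworked=True),
    ShippedFeature("onboarding-tour", evidence_linked=True, reworked=False),
    ShippedFeature("dark-mode", evidence_linked=False, reworked=True),
]
share = evidence_linkage_share(features)
linked_rework = rework_rate(features, evidence_linked=True)
unlinked_rework = rework_rate(features, evidence_linked=False)
```

Comparing `linked_rework` against `unlinked_rework` over time is the comparison the paragraph above describes. The sample data here is made up, and with only a handful of features the numbers will be noisy.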

Frequently Asked Questions

What is evidence-based (customer-evidence-driven) coding?

It's a practice of tracing every spec and PR requirement back to verbatim customer signals — sales calls, support tickets, feature requests, churn interviews — rather than paraphrased assumptions or a stakeholder's memory. Each acceptance criterion should cite a real customer source.

How does ZeroShot bring voice of customer into code?

The BuildBetter CLI (bb) pulls structured customer signals from BuildBetter.ai into three skills — /bb-specify, /bb-plan, and /bb-review — so specs are seeded with cited evidence, implementation trade-offs are justified against real demand, and PR reviews surface the relevant customer evidence inline next to the diff.

Is ZeroShot an AI coding agent?

No. ZeroShot is a context layer that sits underneath your existing agents (Cursor, Claude Code, Codex, Copilot, Gemini CLI, Windsurf, Amazon Q, and more). It adds cross-agent session memory, team-shared skills, and customer evidence — it doesn't generate code or replace your agent.

How is this different from ContextPool, Graphiti, Recallium, or Augment?

Those tools provide context or memory for agents, but none integrate a customer-evidence layer from sources like BuildBetter.ai. ZeroShot is unique in combining cross-agent memory, encoded team skills, and customer evidence in one layer.

What's the relationship between ZeroShot and BuildBetter.ai?

ZeroShot is the coding CLI (bb) at tryzeroshot.com — the context layer for your IDE and PRs. BuildBetter.ai is a separate customer-feedback product that captures and structures customer signals. ZeroShot consumes those signals; the two are complementary, not the same product.

Does it lock us into one agent or vendor?

No. ZeroShot's skills are built on BB-Skills, which is open source on GitHub and follows the AGENTS.md standard, and ZeroShot works across agents and teammates. No data leaves your repo without consent, and there's no vendor lock-in.

Stop Shipping the Wrong Thing Faster

AI agents made code cheap. That makes building the right thing the only competitive advantage left. Customer-evidence-driven coding — powered by BuildBetter's customer-feedback platform and the ZeroShot context layer — ensures every spec, plan, and PR traces back to what customers actually said. Connect the evidence to the code, and rework stops being a tax you pay after launch.

