Author @sairahul1 dissects the workflow shift from "vibe coding" to a "software factory": splitting a single AI conversation into seven specialized agents (Researcher, Story Writer, Spec Writer, Backend Builder, Frontend Builder, Test Verifier, Implementation Validator), each with a single responsibility, a clean context, and strict boundaries.

I thought I was coding with AI. Turns out, I was just typing faster.

This article is about that difference, and about the system that changes everything: the "7 Agent System."

Save this article. It will save you several months.

The problem no one talks about

The cycle that looks productive but isn't:

→ Ask Claude to make a feature → It produces code → Something breaks → Paste error message back → It patches → Another part breaks → Ask again

Day 1: This feels like magic.

Day 30: You spend more time supervising AI than writing code yourself.

The same logic appears in three different places. Claude forgets the conventions you set two weeks ago. New features break old ones. Testing is either missing or superficial.

One day you wake up and realize: It’s not AI failing, it’s your workflow failing.

The core issue is structural.

When you type "Help me make this feature" in Claude Code, you’re actually asking an AI conversation to play multiple roles simultaneously:

→ Product Analyst → Architect → Backend Engineer → Frontend Engineer → Tester → Code Reviewer

All at once. In one chaotic conversation.

Wrong assumptions in the plan turn into wrong database models. Wrong models become wrong APIs. Wrong APIs lead to wrong UI.

By the time you notice, errors have spread everywhere.

This is what’s called vibe coding (coding by feel).

It hits a hard ceiling.

Turning point: from Vibe Coding to Software Factory

The real key to changing everything:

A true engineering team doesn’t work in a single large conversation.

Different people have different tasks:

→ Clarify user problems → Think about architecture → Write APIs → Build UI → Consider edge cases → Review

When you shrink all these into one AI conversation, errors quietly accumulate.

The fix is to break work into specialized agents.

Each agent gets:

→ A focused task → Its own clean context window → Only the tools it truly needs → Strict rules about what it "must not touch"

Result: A software factory.

One developer + seven focused agents = a coordinated team.
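The four things each agent gets can be written down as data. The sketch below is one hypothetical way to do it; the `AgentSpec` shape, its field names, and the agent identifiers are assumptions for illustration, not a Claude Code API.

```typescript
// Hypothetical shape for one agent definition. The field names are
// illustrative, not a Claude Code API.
type Tool = "Read" | "Grep" | "Glob" | "Edit" | "Write" | "Bash";

interface AgentSpec {
  name: string;
  task: string; // the single focused job
  tools: Tool[]; // only the tools it truly needs
  mustNotTouch: string[]; // strict boundaries, stated up front
}

const researcher: AgentSpec = {
  name: "codebase-researcher",
  task: "Explain the current state of the codebase before anything is written",
  tools: ["Read", "Grep", "Glob"],
  mustNotTouch: ["any file edit", "any state-changing command"],
};

const backendBuilder: AgentSpec = {
  name: "backend-builder",
  task: "Implement the backend half of an approved brief",
  tools: ["Read", "Edit", "Write", "Bash"],
  mustNotTouch: ["React components", "pages", "client-side hooks"],
};

const MUTATING_TOOLS: Tool[] = ["Edit", "Write", "Bash"];

// An agent is read-only when it holds none of the mutating tools.
function isReadOnly(agent: AgentSpec): boolean {
  return agent.tools.every((tool) => MUTATING_TOOLS.indexOf(tool) === -1);
}

console.log(isReadOnly(researcher)); // true
console.log(isReadOnly(backendBuilder)); // false
```

Writing the boundary as a tool list makes "read-only" something you can check, not something you hope the prompt enforces.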

Below are the seven agents that make this work.

Seven Agents

Agent 1: Codebase Researcher

What’s the biggest mistake developers make when using AI?

Treating "getting code" as the first step.

AI takes your prompt, guesses, fills in gaps, and starts generating. Poor design sneaks in at this moment.

The Researcher corrects this.

Its only job: Review the codebase and explain the current state — before a single line is written.

What it does:

Mark relevant files and their roles
Record existing patterns to follow
Find similar functionalities already built
Flag risks (time zones, multi-tenancy, retry logic)
List which tests need updating

What it cannot do:

Edit files (read-only)
Execute commands that change state
Make assumptions — it should ask questions instead

Tools: Read, Grep, Glob, nothing more.

Rule: Always explore before starting work.

Researcher always runs first.

Agent 2: Story Writer

Most feature failures aren’t because the code is wrong.

It’s because the problem was never clearly defined.

The Story Writer turns rough ideas into a real user story — before any technical decisions are made.

Input:

Your rough feature description
Researcher’s investigation results

Output:

A user story: "As [role], I want [action], so that [result]."
Acceptance criteria: Testable statements — happy path, failure paths, business rules.
Edge cases: Boundaries, retries, multi-tenancy considerations.
Out of scope: What’s explicitly "not going to be done."
Unanswered questions: Things it genuinely doesn’t know — no guesses.

What it cannot do:

Invent business rules
Write any code or technical design
Proceed when the requirements are genuinely unclear

Tools: Read, nothing more.

Rule: You must read and approve the story before moving on.

This is human review point 1, and it is the critical one: everything downstream depends on the story being right.
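The Story Writer's five outputs map onto a small record, and "no guesses" becomes a gate: a story with open questions is not ready for approval. This is a minimal sketch under assumptions. The `UserStory` shape, the `approvalBlockers` helper, and the sample story text are all hypothetical.

```typescript
// Hypothetical story shape mirroring the Story Writer's five outputs.
interface UserStory {
  story: string; // "As [role], I want [action], so that [result]."
  acceptanceCriteria: string[];
  edgeCases: string[];
  outOfScope: string[];
  openQuestions: string[];
}

// Assumed gate: returns the reasons a story is not ready for approval.
// An empty array means it can go to human review.
function approvalBlockers(s: UserStory): string[] {
  const blockers: string[] = [];
  if (!/^As .+, I want .+, so that .+/.test(s.story)) {
    blockers.push("story is not in 'As..., I want..., so that...' form");
  }
  if (s.acceptanceCriteria.length === 0) {
    blockers.push("no acceptance criteria");
  }
  if (s.openQuestions.length > 0) {
    blockers.push(`${s.openQuestions.length} open question(s) need an answer`);
  }
  return blockers;
}

// Illustrative draft; the business details are made up for the example.
const draft: UserStory = {
  story:
    "As a billing admin, I want overdue invoices to trigger reminders, so that customers pay sooner.",
  acceptanceCriteria: ["An invoice past its due date gets one reminder email"],
  edgeCases: ["Invoice paid while the reminder job is queued"],
  outOfScope: ["SMS reminders"],
  openQuestions: ["How many days overdue before the first reminder?"],
};

// One blocker is reported: the open question.
console.log(approvalBlockers(draft));
```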

Agent 3: Spec Writer

Once the story is approved, the Spec Writer turns it into a technical brief.

This brief is the blueprint all build agents follow.

Input:

Approved user story
Researcher’s investigation results
Your project’s CLAUDE.md rules

Output:

Data model changes (fields, types, migrations)
Background/process flows
API changes (endpoints, request/response formats)
Frontend changes (components, pages, hooks)
Tests needed (success, failure, boundary)
Risks and unresolved issues
Files to be changed

What it cannot do:

Edit files
Invent new infrastructure without calling it out explicitly
Skip tenant isolation or timezone considerations
Leave questions unanswered

Tools: Read, Grep, Glob, nothing more.

Rule: This brief is human review point 2.

You read, approve, then files are ready to be touched.

If you see "store ID in memory" — that’s a red flag.

Catch it now. Don’t wait for 10 files to be changed.
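Before you read a brief, a mechanical check can tell you whether it is complete. The sketch below assumes the brief arrives as a record keyed by the seven outputs listed above; the key names and both helpers are hypothetical, and the text scan for tenant and timezone coverage is deliberately crude. It does not replace your review.

```typescript
// The seven outputs the Spec Writer is expected to produce, as hypothetical keys.
const REQUIRED_SECTIONS = [
  "dataModel",
  "backgroundFlows",
  "apiChanges",
  "frontendChanges",
  "tests",
  "risks",
  "filesToChange",
] as const;

type Section = (typeof REQUIRED_SECTIONS)[number];
type TechnicalBrief = Partial<Record<Section, string>>;

// Returns the sections that are missing or blank.
function missingSections(brief: TechnicalBrief): Section[] {
  return REQUIRED_SECTIONS.filter((key) => {
    const text = brief[key];
    return text === undefined || text.trim() === "";
  });
}

// Crude keyword scan for the two considerations the Spec Writer may not skip.
function skippedConsiderations(brief: TechnicalBrief): string[] {
  const all = REQUIRED_SECTIONS.map((key) => brief[key] || "")
    .join("\n")
    .toLowerCase();
  const skipped: string[] = [];
  if (all.indexOf("tenant") === -1) skipped.push("tenant isolation");
  if (all.indexOf("timezone") === -1 && all.indexOf("time zone") === -1) {
    skipped.push("timezone");
  }
  return skipped;
}

// Illustrative partial brief; field and route names are made up.
const brief: TechnicalBrief = {
  dataModel: "Add reminderSentAt (DateTime) to Invoice, scoped by tenantId",
  apiChanges: "POST /api/invoices/:id/remind",
  tests: "",
};

console.log(missingSections(brief)); // five sections still missing or blank
console.log(skippedConsiderations(brief)); // timezone is never mentioned
```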

Agent 4: Backend Builder

Now it’s time to build.

The Backend Builder implements the "backend half" of features — responsible only for backend.

Input:

Approved technical brief
Researcher’s investigation results
Your project’s CLAUDE.md

It builds:

API routes
Services and business logic
Database access and migrations
Background jobs
Its own unit tests

It cannot do:

Touch React components, pages, or client-side hooks (Agent 5’s job)
Invent dependencies without instructions
Modify files outside scope
Stop without running typecheck, lint, and tests

After completion, it returns a summary: files added or changed, reused helpers or patterns, any CLAUDE.md rules that could be improved.

Tools: Read, Edit, Write, Bash — only within backend folders.

Key point: separation of concerns.

Backend Builder can never accidentally break frontend.
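"Only within backend folders" is a path check. Here is a minimal sketch of one, assuming repo-relative paths with forward slashes. The folder names in `BACKEND_ROOTS` are hypothetical; substitute your own layout.

```typescript
// Resolves "." and ".." segments without touching the filesystem.
// Returns null when the path escapes the repo root.
function normalize(path: string): string | null {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (out.length === 0) return null;
      out.pop();
    } else {
      out.push(part);
    }
  }
  return out.join("/");
}

// True when the path is one of the allowed roots or sits underneath one.
function isInScope(path: string, allowedRoots: string[]): boolean {
  const clean = normalize(path);
  if (clean === null) return false;
  return allowedRoots.some(
    (root) => clean === root || clean.indexOf(root + "/") === 0
  );
}

// Hypothetical folder layout for the Backend Builder.
const BACKEND_ROOTS = ["src/server", "prisma"];

console.log(isInScope("src/server/services/reminders.ts", BACKEND_ROOTS)); // true
console.log(isInScope("src/components/ReminderButton.tsx", BACKEND_ROOTS)); // false
console.log(isInScope("src/server/../components/Button.tsx", BACKEND_ROOTS)); // false
```

The third call is the reason for normalizing first: a path can start inside the backend folder and still end up outside it.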

Agent 5: Frontend Builder

Frontend Builder implements the UI part — only responsible for UI.

It first reads the backend agent’s summary.

This is crucial.

It uses the API as per the backend’s output. It does not invent new endpoints.

If the API shape is wrong for the UI, it reports the mismatch rather than patching around it.

Input:

Approved technical brief
Researcher’s investigation results
Backend agent’s API summary

It builds:

React components and pages
Client-side hooks and state
Loading and error states
Its own component and unit tests

It cannot do:

Touch services, API routes, workers, or migrations (Agent 4’s job)
Invent endpoints or response formats
Add dependencies without instructions
Stop without running typecheck, lint, and tests

Tools: Read, Edit, Write, Bash — only within frontend folders.

Two builders. Two clean contexts. Zero chance one breaks the other.
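"Report the mismatch" can be made concrete by comparing what the UI needs against what the backend summary says exists. This is a sketch under assumptions: the `EndpointSummary` and `UiNeed` shapes, the route paths, and the field names are all hypothetical.

```typescript
// Hypothetical shape of one entry in the Backend Builder's API summary.
interface EndpointSummary {
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string;
  responseFields: string[];
}

// What a UI component expects to call and read.
interface UiNeed {
  method: EndpointSummary["method"];
  path: string;
  fields: string[];
}

// Returns human-readable mismatches. The Frontend Builder reports these
// instead of inventing an endpoint or reshaping the response itself.
function findMismatches(needs: UiNeed[], api: EndpointSummary[]): string[] {
  const problems: string[] = [];
  for (const need of needs) {
    const match = api.filter(
      (e) => e.method === need.method && e.path === need.path
    )[0];
    if (match === undefined) {
      problems.push(`${need.method} ${need.path}: endpoint not in backend summary`);
      continue;
    }
    for (const field of need.fields) {
      if (match.responseFields.indexOf(field) === -1) {
        problems.push(`${need.method} ${need.path}: response lacks '${field}'`);
      }
    }
  }
  return problems;
}

// Illustrative data; routes and fields are made up.
const api: EndpointSummary[] = [
  { method: "GET", path: "/api/invoices/overdue", responseFields: ["id", "dueDate", "amount"] },
];
const needs: UiNeed[] = [
  { method: "GET", path: "/api/invoices/overdue", fields: ["id", "dueDate", "lastReminderAt"] },
  { method: "POST", path: "/api/invoices/:id/remind", fields: [] },
];

// Two mismatches: a missing response field and a missing endpoint.
console.log(findMismatches(needs, api));
```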

Agent 6: Test Verifier

Both builders write unit tests for their parts.

That’s not enough.

Test Verifier does one thing: Prove that this feature actually does what the user story says.

It writes "acceptance tests," not unit tests.

Acceptance tests test externally — like a real user experiencing it.

Input:

Approved user story (with all acceptance criteria)
Approved technical brief
Summaries from both builders

Output:

An acceptance test file covering each acceptance criterion
A report: which passed, which failed, which can’t be cleanly covered

What it cannot do:

Modify backend or frontend code
Invent workarounds for untestable criteria
Mark untested criteria as covered

If a test fails: the feature does not meet the story.

It reports "which criterion failed." It does not fix code.

Fixes go back to the correct builder.

Tools: Read, Edit, Write (test files only), Bash.

Rule: Until acceptance tests pass, you don’t have this feature.
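The Verifier's report and its rule fit in a few lines. The sketch below assumes each acceptance criterion ends up in exactly one of three states; the type names, `buildReport`, and the sample criteria are hypothetical.

```typescript
type Status = "passed" | "failed" | "uncovered";

interface CriterionResult {
  criterion: string;
  status: Status;
}

interface AcceptanceReport {
  passed: string[];
  failed: string[];
  uncovered: string[];
  featureDone: boolean;
}

function buildReport(results: CriterionResult[]): AcceptanceReport {
  const pick = (status: Status): string[] =>
    results.filter((r) => r.status === status).map((r) => r.criterion);
  const passed = pick("passed");
  const failed = pick("failed");
  const uncovered = pick("uncovered");
  return {
    passed,
    failed,
    uncovered,
    // Done only when every criterion has a passing test.
    // An uncovered criterion never counts as covered.
    featureDone:
      results.length > 0 && failed.length === 0 && uncovered.length === 0,
  };
}

// Illustrative results; the criteria text is made up.
const report = buildReport([
  { criterion: "Overdue invoice gets one reminder", status: "passed" },
  { criterion: "Paid invoice gets no reminder", status: "passed" },
  { criterion: "Admin cannot remind another tenant's invoice", status: "failed" },
]);

console.log(report.featureDone); // false
console.log(report.failed); // the tenant criterion goes back to a builder
```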

Agent 7: Implementation Validator

This agent finds what everyone missed.

It compares current implementation against approved story and brief, reporting gaps.

It never fixes anything. It only tells the truth.

Each run checks:

Unimplemented acceptance criteria
Uncovered failure paths
Security issues: missing permissions, tenant leaks, keys in logs, raw errors leaking
Files changed outside scope
Patterns inconsistent with CLAUDE.md or existing code
Helpers that should have been reused but were duplicated
Timezone or multi-tenancy considerations that the brief quietly skipped

Output is always grouped by severity:

Critical — must fix before merge
Important — should fix before merge
Minor — opinion, reviewer’s discretion

Each finding includes file path and line number.

If no issues: it simply says "No issues." It does not invent problems to seem thorough.

Tools: Read, Grep, Glob, nothing more.

This agent is what makes the entire factory trustworthy.

Self-assessment scores are worthless. An auditor that looks only at what is on disk, and ignores how it came to be written, is honest.
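The Validator's output format is simple enough to sketch: findings grouped by severity, each with a file path and line number, and a plain "No issues." when there is nothing to report. The `Finding` shape, the layout of the text, and the sample finding are assumptions for illustration.

```typescript
type Severity = "Critical" | "Important" | "Minor";

interface Finding {
  severity: Severity;
  file: string;
  line: number;
  message: string;
}

const ORDER: Severity[] = ["Critical", "Important", "Minor"];

// Groups findings by severity, most severe first.
function formatFindings(findings: Finding[]): string {
  if (findings.length === 0) return "No issues.";
  const sections: string[] = [];
  for (const severity of ORDER) {
    const group = findings.filter((f) => f.severity === severity);
    if (group.length === 0) continue;
    const lines = group.map((f) => `  ${f.file}:${f.line} ${f.message}`);
    sections.push([severity, ...lines].join("\n"));
  }
  return sections.join("\n\n");
}

// Illustrative findings; paths and line numbers are made up.
const findings: Finding[] = [
  { severity: "Minor", file: "src/server/routes/remind.ts", line: 12, message: "helper duplicated" },
  { severity: "Critical", file: "src/server/services/reminders.ts", line: 42, message: "no tenant ownership check" },
];

console.log(formatFindings(findings));
console.log(formatFindings([])); // No issues.
```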

How the entire chain runs

Complete process — one prompt starts it all:

You open Claude Code, input:

"Help me implement the 'overdue invoice reminder' feature."

You type nothing more, and this happens:

Step 1: Researcher scans your invoice, payment, email code. Returns relevant files, patterns, risks.

Step 2: Story Writer produces user story and acceptance criteria.

⏸ Pause: You review and approve the story.

Step 3: Spec Writer turns approved story into a technical brief.

⏸ Pause: You review and approve the brief. (Here, catch the "store ID in memory" error.)

Step 4: Backend Builder implements service, API routes, BullMQ jobs, unit tests. Returns: file changes, reused patterns, all tests green.

Step 5: Frontend Builder reads backend API summary, creates admin UI blocks and reminder buttons, writes component tests. All green.

Step 6: Test Verifier writes acceptance tests for all eight criteria. Reports: 7 pass, 1 fail (manual check for tenant ownership).

Step 7: Validator catches it. Reports with Critical severity, file path, line number.

→ Back to Backend Builder. Fixes it. All 8 acceptance tests green. Validator runs again. Clean.

⏸ Pause: You review and open PR.

Three human review points. Everything else runs itself.
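The chain is a loop over seven steps with three pause points. The sketch below is a simplification under stated assumptions: `run` is a stub standing in for however an agent actually gets invoked, each step receives only the previous step's output, and the loop back to a builder after a failed check is left out. The agent identifiers are hypothetical.

```typescript
interface Step {
  agent: string;
  pauseAfter: boolean; // true where a human must approve before the chain continues
}

// Step order and pause points from the walkthrough above.
const CHAIN: Step[] = [
  { agent: "researcher", pauseAfter: false },
  { agent: "story-writer", pauseAfter: true },
  { agent: "spec-writer", pauseAfter: true },
  { agent: "backend-builder", pauseAfter: false },
  { agent: "frontend-builder", pauseAfter: false },
  { agent: "test-verifier", pauseAfter: false },
  { agent: "validator", pauseAfter: true },
];

type RunAgent = (agent: string, input: string) => string;
type Approve = (agent: string, output: string) => boolean;

interface ChainResult {
  completed: string[];
  stoppedAt: string | null;
}

function runChain(feature: string, run: RunAgent, approve: Approve): ChainResult {
  const completed: string[] = [];
  let input = feature;
  for (const step of CHAIN) {
    const output = run(step.agent, input);
    if (step.pauseAfter && !approve(step.agent, output)) {
      // A rejected review stops the chain before any later agent runs.
      return { completed, stoppedAt: step.agent };
    }
    completed.push(step.agent);
    input = output;
  }
  return { completed, stoppedAt: null };
}

// Stub agent runner for the example: just records who handled what.
const stubRun: RunAgent = (agent, input) => `${agent}(${input})`;

// Reject the brief: nothing after the Spec Writer runs.
const result = runChain(
  "overdue invoice reminder",
  stubRun,
  (agent) => agent !== "spec-writer"
);
console.log(result.completed); // researcher and story-writer only
console.log(result.stoppedAt); // spec-writer
```

Rejecting the brief costs you two agent runs. Rejecting the same mistake at PR time costs you all seven.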

Basics: Before agents can operate, you need this

CLAUDE.md: memory that survives every conversation

Every time you open Claude Code, it starts from "zero memory."

CLAUDE.md fixes this.

It’s a Markdown file at the repo root, auto-loaded at each conversation start.

It’s the home of "permanent project facts":

Your tech stack (Next.js App Router, Node.js, Prisma, BullMQ, Resend)
Your commands (npm run dev, npm test, npx prisma migrate dev)
Architecture rules ("Business logic in services. API routes thin.")
Things not to do ("No cron — use BullMQ. Don’t log raw payment payloads.")
Deep documentation pointers (docs/billing.md, docs/architecture.md)

Keep it within 100–300 lines.

Every time AI makes a surprising mistake, ask: "If CLAUDE.md had a rule, could this have been avoided?"

Add the rule.

Weeks later, your CLAUDE.md becomes a record of "all assumptions AI ever got wrong" — your conversations will improve noticeably.

Context Drift — The silent killer

Most Claude Code conversations don’t fail dramatically.

They drift.

A wrong assumption enters the context. The model keeps stacking on top.

You want Claude to do "subscription management." It designs: User → Subscription.

Later, you remember: subscriptions belong to "company," not "user."

If you just say "No, subscriptions belong to company" — Claude patches it.

Now you have both user.subscriptionId and company.subscriptionId floating around.

Rules:

Typos? Inline fix.
Wrong architecture assumptions? Drop the entire conversation, start over, embed correct assumptions in the first prompt.

A clean conversation with correct mental models always beats a patched one.
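The subscription example shows what drift leaves behind. In the sketch below a schema is reduced to model names and field names, which is an assumption made to keep the example small. `ownersOf` and both sample schemas are hypothetical.

```typescript
// Model name -> field names. A deliberately tiny stand-in for a real schema.
type Schema = Record<string, string[]>;

// Lists every model that carries the given field.
function ownersOf(field: string, schema: Schema): string[] {
  return Object.keys(schema).filter(
    (model) => (schema[model] || []).indexOf(field) !== -1
  );
}

// After "No, subscriptions belong to company" in the same conversation:
// the new field is added, the old one is never removed.
const patched: Schema = {
  User: ["id", "companyId", "subscriptionId"],
  Company: ["id", "subscriptionId"],
};

// After starting over with the correct assumption in the first prompt.
const restarted: Schema = {
  User: ["id", "companyId"],
  Company: ["id", "subscriptionId"],
};

console.log(ownersOf("subscriptionId", patched)); // User and Company: two sources of truth
console.log(ownersOf("subscriptionId", restarted)); // Company only
```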

Result: what truly changes

Before factory:

Vibe coding cycle: prompt → generate → error → patch → repeat
Context filled with noise
Wrong assumptions turn into broken features
One engineer can only do one thing at a time
Features wait for the right person’s availability

After factory:

Structured chain: Research → Story → Brief → Build → Verify → Confirm
Each agent has a clean context, only what it needs
Wrong assumptions caught at "brief approval" — not after 10 files
One engineer can deliver a complete vertical slice: backend, frontend, tests, verification
The team's best knowledge lives in agents, not locked in one person's head

True transformation:

A payments expert creates a payments-integration agent. From that moment, every engineer can deliver billing features. No waiting, no handoff.

Frontend lead’s component patterns live in frontend-builder. DevOps CI checks live in hooks. QA’s edge cases live in test-verifier rules.

Expert knowledge shared as agents. Not stuck on "who’s available."

Build your own this weekend

8-step setup checklist:

Install Claude Code → code.claude.com
Create folder structure:
- .claude/agents/
- .claude/skills/feature-factory/
- .claude/skills/build-with-tests/
- .claude/hooks/
Write your CLAUDE.md (100–300 lines: tech stack, commands, architecture rules, do-not-do list)
Use Claude Code’s /agents command to create 7 agents. Describe each agent’s role. Claude writes files. Review and commit.
Build feature-factory orchestrator skill. Ask Claude to generate it — it will read your 7 agent files and connect the chain.
Build build-with-tests skill. Describe how your team builds: align patterns, write code and tests together, run typecheck at the end.
Add a pre-commit hook. Block commits of .env, .key, .pem, secrets.json. 5 minutes, avoid disasters.
Run a real feature through the full chain. Pick a small one. Observe where it stalls. Add rules. Factory adjusts itself.
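The core of the pre-commit check in the list above is a filename filter. This sketch covers only that filter, and it reads ".key" and ".pem" as file extensions, which is an assumption. Wiring it into a hook is left out: the hook would feed it the staged paths (for example from `git diff --cached --name-only`) and exit non-zero when the result is not empty.

```typescript
// Names the checklist says to block: .env, .key, .pem, secrets.json.
function isBlocked(path: string): boolean {
  const name = path.split("/").pop() || "";
  return (
    name === ".env" ||
    name === "secrets.json" ||
    /\.(key|pem)$/.test(name)
  );
}

// Returns the staged paths that must not be committed.
function blockedFiles(staged: string[]): string[] {
  return staged.filter(isBlocked);
}

// Illustrative staged file list.
const staged = [
  "src/app.ts",
  ".env",
  "certs/server.pem",
  "config/secrets.json",
  "docs/keys.md",
];

// Three of the five paths are blocked.
console.log(blockedFiles(staged));
```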

Total time: 2–3 hours.

Run several features. After 3–4, the factory knows your codebase.

You’ll spend less time supervising, more time deciding "what’s next."

Seven Agents — Quick Reference

Researcher — scans code before anything is built (read-only)
Story Writer — turns ideas into user stories and acceptance criteria (read-only)
Spec Writer — turns stories into technical briefs (read-only)
Backend Builder — builds APIs, services, jobs, unit tests (backend folder only)
Frontend Builder — builds components, pages, hooks, UI tests (frontend folder only)
Test Verifier — writes acceptance tests for stories (test files only)
Validator — compares implementation against story and brief, reports gaps (read-only)

3 human review points:

→ Approve story → Approve brief → Approve PR

Everything else runs itself.

Most Claude Code developers are still in vibe coding. Prompt → generate → patch → pray.

That’s not wrong. But it hits a ceiling.

The factory doesn’t kick you out of the process. It kicks you out of the "parts where your judgment isn’t needed."

You stay in the parts where "your judgment truly matters":

Is this the right problem? Is this the right design? Is it safe to deploy?

All the middle steps are handled by agents.

That’s the difference between "using AI as a faster keyboard" and "using AI as a coordinated team."

Original author: @sairahul1
