Why Compound Engineering Is My New Default for Claude Code

I compared four AI coding frameworks a week ago: GSD, SpecKit, BMAD, and Compound Engineering. I gave GSD the most time because it felt closest to how I actually work: structured enough to keep context fresh, light enough not to slow me down.

Then I installed the Compound Engineering plugin and ran it for a week on the same codebase. By day three, I stopped reaching for GSD entirely.

The difference isn’t in the command set. It’s in what happens after you ship a feature.

GSD Felt Fast Until It Didn’t

GSD is genuinely good at what it does. The /gsd:discuss-phase, /gsd:plan-phase, /gsd:execute-phase loop is clean. Context engineering keeps the AI coherent across sessions. Multi-agent execution via worktrees is powerful. I covered all of this in the previous comparison.

But here’s what I noticed after five days of real use: every new milestone starts from roughly the same place. GSD does a great job of grinding through one milestone with high quality. Then you hit /gsd:complete-milestone, archive, and start the next one. The new milestone inherits your code but not much else.

The patterns I figured out on Monday didn’t make Tuesday’s work easier in any structural way. My code improved, sure. But the environment around the code stayed static. Same planning overhead. Same verification steps. Same agent context.

The problem isn’t GSD’s execution speed. It’s that the system doesn’t compound.

What Compound Engineering Does Differently

The Compound Engineering plugin from Every wraps a four-step loop into Claude Code:

Plan → Work → Review → Compound

The first three steps look familiar. Plan your feature, let agents build it, review the output. Every framework does some version of this.

The fourth step is what changes everything.

/ce:compound asks a specific question after each piece of work: “What did we just learn that should make future work easier?” Then it codifies that learning. Updated documentation. New test patterns. Refined agent skills. Solved-problem records that agents consult during future planning.
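To make the idea of solved-problem records concrete, here is a minimal sketch of the concept in Python. It is not how the plugin stores or retrieves learnings; the names `SolvedProblem`, `LearningStore`, `compound`, and `relevant_to` are hypothetical, the two records are invented for illustration, and the tag match is a deliberately naive stand-in for whatever lookup an agent would really do.

```python
from dataclasses import dataclass, field


@dataclass
class SolvedProblem:
    """One hypothetical solved-problem record captured after a piece of work."""
    title: str
    solution: str
    tags: set[str] = field(default_factory=set)


class LearningStore:
    """Toy in-memory store (assumption: a real setup would persist notes somewhere)."""

    def __init__(self) -> None:
        self._records: list[SolvedProblem] = []

    def compound(self, record: SolvedProblem) -> None:
        # The compound step: keep what was just learned.
        self._records.append(record)

    def relevant_to(self, task_description: str) -> list[SolvedProblem]:
        # The planning step: surface records whose tags appear in the new task.
        words = set(task_description.lower().split())
        return [r for r in self._records if r.tags & words]


store = LearningStore()
# Invented example records, for illustration only.
store.compound(SolvedProblem(
    title="Pagination cursor drifted on concurrent inserts",
    solution="Order by (created_at, id) and use keyset pagination.",
    tags={"pagination", "cursor"},
))
store.compound(SolvedProblem(
    title="Flaky auth test",
    solution="Freeze the clock when asserting token expiry.",
    tags={"auth", "token"},
))

matches = store.relevant_to("Add pagination to the invoices endpoint")
print([m.title for m in matches])
```

The point of the sketch is the shape of the loop: every finished task can add a record, and every new plan starts by asking which old records apply.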

Here’s what a typical session looks like:

| Step | Command | What Happens |
| --- | --- | --- |
| Brainstorm | `/ce:brainstorm` | Explore requirements, surface edge cases |
| Plan | `/ce:plan` | Structured implementation plan with dependencies |
| Work | `/ce:work` | Agents execute with parallel task management |
| Review | `/ce:review` | Multi-persona code review (security, performance, correctness) |
| Compound | `/ce:compound` | Codify learnings, update patterns, document solutions |

That last row is where the compounding happens. After three loops through this cycle, my agents started producing better first drafts because they had access to patterns and solutions from previous rounds.

The Speed Difference Is Nonlinear

Day one with Compound Engineering felt slower than GSD. More commands. More moving parts. The /ce:brainstorm step alone surfaces more questions than GSD’s /gsd:discuss-phase. Setup overhead was real.

By day three, the curve flipped.

Planning drew on accumulated learnings from previous features. Review used multi-persona agents (security reviewer, performance reviewer, correctness reviewer) that caught issues I would otherwise have had to find by hand in GSD’s verify step. And the compounded knowledge meant agents produced code that already followed the patterns established in earlier rounds.

Here’s a rough sense of how my time allocation shifted:

Day 1 (Compound Engineering):
  Planning:    35%
  Execution:   30%
  Review:      25%
  Compounding: 10%

Day 5 (Compound Engineering):
  Planning:    15%
  Execution:   40%
  Review:      25%
  Compounding: 20%

Planning shrank because agents had better context. Execution grew because agents were more autonomous. Compounding grew because I saw the payoff and invested more.
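For anyone who wants the shift as numbers, the arithmetic on the rough figures above is simple. This is just a subtraction over the day 1 and day 5 percentages, expressed in percentage points:

```python
# Rough time-allocation figures from the two days above (percent of the day).
day1 = {"Planning": 35, "Execution": 30, "Review": 25, "Compounding": 10}
day5 = {"Planning": 15, "Execution": 40, "Review": 25, "Compounding": 20}

# Sanity check: each day should account for 100% of the time.
assert sum(day1.values()) == 100 and sum(day5.values()) == 100

# Change from day 1 to day 5, in percentage points.
shift = {phase: day5[phase] - day1[phase] for phase in day1}

for phase, delta in shift.items():
    print(f"{phase:<12}{delta:+d} pts")
```

Planning drops 20 points, execution and compounding each gain 10, and review stays where it was.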

With GSD, my time allocation stayed roughly flat across the week. Not bad. Just linear.

The Multi-Agent Review Is Underrated

One thing that surprised me: Compound Engineering’s review system is significantly more thorough than what I was doing with GSD.

When you run /ce:review, it doesn’t just do a single code review pass. It spawns specialized review personas:

Correctness reviewer: logic errors, edge cases, state management bugs
Security reviewer: auth, input validation, injection vectors
Performance reviewer: query patterns, loop complexity, caching
Testing reviewer: coverage gaps, weak assertions, brittle tests
Maintainability reviewer: premature abstraction, dead code, naming

Each persona reviews the diff independently and reports back. In GSD, I was relying on /gsd:verify-work, which checks against REQUIREMENTS.md. Useful, but narrower.

Why this matters in practice: The security reviewer caught an unvalidated input path in my API handler that I would have shipped. The performance reviewer flagged an N+1 query that only showed up under load. These aren’t hypothetical benefits. They’re bugs that would have hit production.
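For readers who have not run into an N+1 query before, here is a generic illustration using Python’s built-in `sqlite3` module. It is not the query from my codebase; the `users` and `orders` tables, their rows, and the `run` helper are made up for the example. The first version issues one query for the list plus one per row, and the second gets the same answer in a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented schema and data, for illustration only.
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users  VALUES (1, 'ana'), (2, 'ben'), (3, 'cy');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")

stats = {"queries": 0}


def run(sql, params=()):
    # Hypothetical helper: count every round trip to the database.
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


# N+1 pattern: one query for the users, then one more query per user.
users = run("SELECT id, name FROM users ORDER BY id")
n_plus_one = {
    name: run(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE user_id = ?", (uid,)
    )[0][0]
    for uid, name in users
}
n_plus_one_queries = stats["queries"]

# Single-query version: let the database do the grouping.
stats["queries"] = 0
joined = dict(run("""
    SELECT u.name, COALESCE(SUM(o.total), 0)
    FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
"""))
joined_queries = stats["queries"]

print(n_plus_one_queries, joined_queries)
```

With three users the difference is four queries versus one, which is why this kind of bug tends to stay invisible until the table grows.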

Where GSD Still Wins

This isn’t a “GSD is bad” post. GSD excels in specific scenarios:

Greenfield sprints with clear scope. If you know exactly what you’re building and need to get from zero to working code fast, GSD’s tighter loop gets you there with less overhead. The discuss/plan/execute/verify cycle is ruthlessly efficient when scope is locked.

One-off projects you won’t revisit. If you’re building a prototype, a weekend hack, or a tool you’ll use once, investing in compounding doesn’t pay off. GSD’s milestone-and-archive model is exactly right for disposable work.

Teams new to AI-assisted development. GSD’s smaller command surface and clearer mental model make it easier to adopt. /gsd:next literally tells you what to do next. That’s valuable when you’re still building intuition for how to work with agents.

When to Choose What

Choose Compound Engineering if you:

Work in the same codebase for weeks or months
Build features that share patterns and infrastructure
Want your agents to get better at your specific project over time
Value thorough, multi-angle code review
Are willing to invest 10-15% of your time in system improvement

Choose GSD if you:

Need to smash through a well-defined milestone fast
Are working on a greenfield or throwaway project
Prefer a smaller, more predictable command surface
Want the simplest possible AI coding workflow that still prevents context rot

Use both if you want GSD’s execution discipline to keep you focused on high-leverage compound engineering tasks. Pick 2-3 items per day from your compound backlog, execute them with GSD-style focus, then run the compound step to lock in the gains.

The Bottom Line

GSD is a high-throughput project runner. Compound Engineering is an engineering operating system. Both work. The question is whether you’re optimizing for today’s task or next month’s velocity.

After a week with each, my answer is clear. GSD shipped features. Compound Engineering shipped features and made the next features easier to ship. That compounding effect is the whole game when you’re living in the same codebase for months.

The irony is that Compound Engineering makes you feel slower at first. More steps, more questions, more overhead. But by day three, your system has absorbed enough patterns that agents produce better output with less guidance. That’s when it clicks: you’re not just coding faster. Your entire environment is coding faster.

Experimenting with AI coding frameworks? I’d love to hear what’s working for your team. Reach out on LinkedIn.