Build entire projects from a single requirements file. This boilerplate doesn't just provide pre-configured agents; it autonomously generates the skills, agents, hooks, and code your project needs using the Spec-Kit-Plus workflow.
📦 What's Included Out of the Box:
- ✅ Spec-Kit-Plus Pre-Installed - Complete .claude/ and .specify/ framework ready to use
- ✅ 69 Production-Ready Components - 15 skills, 13 agents, 29 commands, 9 rules, 7 scripts, 6 templates
- ✅ Clean Slate - No session data, logs, or artifacts - fresh start for every user
- ✅ Multi-User Collaboration - Intelligent TODO merging, contributor tracking, session recovery
- ✅ Session Isolation - .gitignore configured to exclude all session-specific files
- ✅ Quality Gates - Component utilization enforcement, workflow validation, A-F grading
- ✅ Complete Documentation - 2,343 lines of comprehensive guides
YOU WRITE:                       BOILERPLATE GENERATES:
requirements/my-app.md  ──▶      ✓ Skills for YOUR tech stack
                                 ✓ Agents for YOUR project needs
                                 ✓ Hooks for YOUR workflow
                                 ✓ Complete project with tests
                                 ✓ 80%+ code coverage
                                 ✓ Security-reviewed code
One command. Full project. No manual setup.
claude "/sp.autonomous requirements/my-app.md"
- How It Works
- Quick Start
- The Spec-Kit-Plus Workflow
- Writing Requirements
- Pre-Loaded Components
- Manual Mode (Optional)
- Understanding the Structure
- Customization
- Troubleshooting
- FAQ
Create a simple markdown file describing what you want to build:
# My E-Commerce API
## Features
- User authentication (JWT)
- Product catalog with search
- Shopping cart
- Order processing
## Technical
- Backend: Node.js + Express
- Database: PostgreSQL + Prisma
- Testing: Jest
claude "/sp.autonomous requirements/my-app.md"
Note: Spec-Kit-Plus is pre-installed in this boilerplate. The .claude/ and .specify/ directories contain all necessary templates, scripts, and configurations. No initialization required!
The autonomous workflow:
- Verifies Spec-Kit-Plus installation (pre-installed in this boilerplate)
- Creates feature branch automatically (if on main/master)
- Analyzes your requirements file
- Detects technologies (Node.js, Express, PostgreSQL, Prisma, Jest)
- Generates custom skills for your stack (if needed)
- Generates specialized agents for your project (if needed)
- Generates quality hooks for your workflow (if needed)
- Creates specification, plan, and task breakdown
- Implements each feature using TDD
- Reviews code for security and quality
- Delivers complete project with tests
Result: A production-ready project with 80%+ test coverage, security-reviewed code, proper documentation, and all work done on a feature branch ready for PR.
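The "analyzes your requirements file" and "detects technologies" steps can be pictured with a small parser. This is only a sketch under stated assumptions: it expects the simple `## Features` / `## Technical` layout of the requirements example above, and `parse_requirements`, `detect_technologies`, and `KNOWN_TECH` are hypothetical names, not part of Spec-Kit-Plus.

```python
import re

# Hypothetical keyword list; the real detector's vocabulary is not documented here.
KNOWN_TECH = ["Node.js", "Express", "PostgreSQL", "Prisma", "Jest", "React", "SQLite"]


def parse_requirements(text: str) -> dict:
    """Collect the bullet items found under each '## ' section heading."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


def detect_technologies(sections: dict) -> list:
    """Return the known technologies mentioned in the Technical section."""
    technical = " ".join(sections.get("Technical", []))
    return [
        tech for tech in KNOWN_TECH
        if re.search(re.escape(tech), technical, re.IGNORECASE)
    ]


EXAMPLE = """# My E-Commerce API
## Features
- User authentication (JWT)
- Product catalog with search
- Shopping cart
- Order processing
## Technical
- Backend: Node.js + Express
- Database: PostgreSQL + Prisma
- Testing: Jest
"""

sections = parse_requirements(EXAMPLE)
print(len(sections["Features"]))      # number of feature bullets
print(detect_technologies(sections))  # technologies found in the Technical section
```

A keyword list keeps the sketch easy to follow; the actual workflow may well rely on the model's own reading of the file rather than on fixed patterns.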
When you start a new autonomous build:
- Checks if you're on main or master branch
- Automatically creates feature/{project-name} branch
- Switches to the new branch for all work
- Sets up remote tracking if origin exists
- Preserves your main/master branch untouched
This ensures best practices: all development happens on feature branches, keeping your main branch clean.
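The branch decision described above can be sketched as two small functions. This is an illustration, not the boilerplate's actual script: the slug rule (lowercase, non-alphanumerics collapsed to hyphens) and the names `feature_branch_name` and `branch_to_use` are assumptions.

```python
import re

# Branches that the workflow leaves untouched, per the list above.
PROTECTED_BRANCHES = {"main", "master"}


def feature_branch_name(project_name: str) -> str:
    """Build a feature/{project-name} branch name from a free-form title."""
    # Assumed slug rule: lowercase, runs of other characters become one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", project_name.lower()).strip("-")
    return f"feature/{slug}"


def branch_to_use(current_branch: str, project_name: str) -> str:
    """Stay on the current branch unless it is main or master."""
    if current_branch in PROTECTED_BRANCHES:
        return feature_branch_name(project_name)
    return current_branch


print(branch_to_use("main", "My E-Commerce API"))
print(branch_to_use("feature/login", "My E-Commerce API"))
```

With the returned name in hand, creating and switching to the branch is a single `git switch -c <name>` when the name differs from the current branch.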
Note for new users: This boilerplate starts with a clean slate. Your first session will create the initial TODO state, which then persists across future sessions.
Problem Solved: TODOs are now saved across conversation sessions AND support multiple developers working on the same project!
When you start a new conversation (days/weeks later):
- Run: bash .claude/scripts/resume-work.sh
- See all saved TODOs from previous sessions (yours and your teammates')
- Ask Claude to restore them to your new session
Multi-User Support:
- Intelligent Merge: When multiple people work on the project, their TODOs are merged automatically
- Conflict Resolution: Status priority (completed > in_progress > pending)
- Contributor Tracking: See who created and contributed to each TODO
- History Snapshots: Every session saves a historical snapshot
How It Works:
- TODOs automatically save to .specify/todos.json when the session ends
- Merges with existing TODOs from other sessions
- Tracks contributors and maintains history in .specify/todo-history/
- No more lost context when starting fresh conversations
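The merge rule (status priority completed > in_progress > pending, plus contributor tracking) can be sketched as follows. The TODO shape used here (`id`, `content`, `status`, `contributors`) is an assumption for illustration; the real schema is the one in .specify/todos.example.json, and `merge_todos` is a hypothetical name rather than the function in sync-todos.py.

```python
# Higher number wins when two sessions disagree about a TODO's status.
STATUS_PRIORITY = {"pending": 0, "in_progress": 1, "completed": 2}


def _copy_todo(todo: dict) -> dict:
    """Copy a TODO so that merging never mutates the caller's data."""
    return {**todo, "contributors": list(todo.get("contributors", []))}


def merge_todos(existing: list, incoming: list, contributor: str) -> list:
    """Merge incoming TODOs into existing ones, keyed by 'id'."""
    merged = {todo["id"]: _copy_todo(todo) for todo in existing}
    for todo in incoming:
        current = merged.get(todo["id"])
        if current is None:
            current = _copy_todo(todo)
            merged[todo["id"]] = current
        elif STATUS_PRIORITY[todo["status"]] > STATUS_PRIORITY[current["status"]]:
            current["status"] = todo["status"]
        if contributor not in current["contributors"]:
            current["contributors"].append(contributor)
    return list(merged.values())


session_a = [
    {"id": 1, "content": "Set up auth", "status": "pending", "contributors": ["dev-a"]},
    {"id": 2, "content": "Design schema", "status": "in_progress", "contributors": ["dev-a"]},
]
session_b = [
    {"id": 1, "content": "Set up auth", "status": "completed"},
    {"id": 3, "content": "Add cart", "status": "pending"},
]

for todo in merge_todos(session_a, session_b, "dev-b"):
    print(todo["id"], todo["status"], todo["contributors"])
```

Because a status can only move up the priority order, merging the same session twice gives the same result, which is a convenient property for an auto-save hook.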
Quick Commands:
# Resume work and see saved TODOs (with collaboration info)
bash .claude/scripts/resume-work.sh
# See who contributed what
python3 .claude/scripts/sync-todos.py contributors
# View historical snapshots
python3 .claude/scripts/sync-todos.py history
# Check TODO status
python3 .claude/scripts/sync-todos.py status
# Manually save TODOs
python3 .claude/scripts/sync-todos.py save
Example Multi-User Flow:
Developer A → Creates 5 TODOs → Session ends → Auto-saves
Developer B → Resumes work → Sees A's TODOs → Adds 3 more → Marks 2 as completed → Auto-merges
Developer A → Returns → Sees combined work from both sessions
See .claude/docs/SESSION-RECOVERY.md for full documentation.
- Clone this boilerplate:
git clone https://github.com/your-username/claude-code-autonomous-agent-workflow.git
cd claude-code-autonomous-agent-workflow
- Your first session will be clean - no pre-existing TODOs or session state
  - .specify/todos.json will be created when you first use TODOs
  - Historical snapshots save to .specify/todo-history/ automatically
  - See .specify/todos.example.json for the data structure
  - Session Isolation: All session-specific files are excluded by .gitignore
  - You start fresh with no inherited session data
- Start working:
# Create your requirements file
cp requirements/example.md requirements/my-app.md
# Edit requirements/my-app.md
# Run autonomous build
claude "/sp.autonomous requirements/my-app.md"
# Claude Code CLI
claude --version
# Node.js 18+ (if building Node.js projects)
node --version
# Python 3.8+ (for TODO sync scripts)
python3 --version
# Git
git --version
# Clone the boilerplate
git clone https://github.com/your-username/autonomous-agent-boilerplate.git my-project
cd my-project
# Start Claude Code
claude
# Create requirements file (-p avoids an error if the directory already exists)
mkdir -p requirements
cat > requirements/my-app.md << 'EOF'
# My Todo App
## Features
- User registration and login
- Create, edit, delete todos
- Mark todos as complete
## Technical
- Frontend: React
- Backend: Express
- Database: SQLite
EOF
# Run autonomous mode
claude "/sp.autonomous requirements/my-app.md"
# Watch it build your entire project!
PREREQUISITE: Spec-Kit-Plus must be pre-installed (.claude/ and .specify/ directories with templates and scripts).
When you run /sp.autonomous, this workflow executes:
SPEC-KIT-PLUS WORKFLOW
(Assumes Pre-Installed Framework)
VERIFY → ANALYZE PROJECT → ANALYZE REQUIREMENTS → GAP ANALYSIS
    ↓
GENERATE → TEST → VERIFY
    ↓
CONSTITUTION (ONE)
    ↓
SIMPLE (1-3 features):          COMPLEX (4+ features):
SPEC → PLAN → TASKS →           For EACH feature:
IMPLEMENT → QA                  SPEC → PLAN → TASKS → IMPLEMENT →
                                UNIT TESTS → INTER-FEATURE TESTS
                                    ↓
                                INTEGRATION QA (All features)
    ↓
DELIVER
The workflow automatically detects project complexity:
| Mode | Feature Count | Workflow |
|---|---|---|
| SIMPLE | 1-3 features | Single spec/plan/tasks cycle |
| COMPLEX | 4+ features | Per-feature iteration with inter-feature testing |
COMPLEX Mode ensures:
- ONE constitution for the whole project
- Each feature gets its own spec β plan β tasks β implement cycle
- Unit tests run after each feature
- Inter-feature regression tests after feature 2+
- Full integration testing at the end
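The mode thresholds and the per-feature cycle can be written down as a short planner. This is a sketch of the rules stated above, not the workflow's own code; `workflow_mode` and `plan_steps` are hypothetical names, and the step labels are taken from the workflow diagram.

```python
FEATURE_CYCLE = ("SPEC", "PLAN", "TASKS", "IMPLEMENT", "UNIT TESTS")


def workflow_mode(feature_count: int) -> str:
    """SIMPLE for 1-3 features, COMPLEX for 4 or more."""
    if feature_count < 1:
        raise ValueError("at least one feature is required")
    return "SIMPLE" if feature_count <= 3 else "COMPLEX"


def plan_steps(features: list) -> list:
    """List the steps the workflow would run for the given features."""
    steps = ["CONSTITUTION"]  # one constitution for the whole project
    if workflow_mode(len(features)) == "SIMPLE":
        steps += ["SPEC", "PLAN", "TASKS", "IMPLEMENT", "QA"]
    else:
        for position, name in enumerate(features, start=1):
            steps += [f"{name}: {step}" for step in FEATURE_CYCLE]
            if position >= 2:  # regression check from the second feature on
                steps.append(f"{name}: INTER-FEATURE TESTS")
        steps.append("INTEGRATION QA")
    steps.append("DELIVER")
    return steps


for step in plan_steps(["User Auth", "Products", "Cart", "Orders"]):
    print(step)
```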
| Phase | What Happens | Output |
|---|---|---|
| 1. INIT | Create .specify/ and .claude/ directories, setup git | Project structure |
| 2. ANALYZE PROJECT | Inventory existing skills, agents, hooks | project-analysis.json |
| 3. ANALYZE REQUIREMENTS | Parse requirements file, detect technologies | requirements-analysis.json |
| 4. GAP ANALYSIS | Compare required vs existing, identify gaps | gap-analysis.json |
| 5. GENERATE | Create missing skills, agents, hooks | Custom infrastructure |
| 6. TEST | Validate components execute without errors | Functional test report |
| 6.5. QUALITY VALIDATION | Score components against quality criteria | Quality validation report |
| 7. CONSTITUTION | Define project rules and standards | .specify/constitution.md |
| 7.5. FEATURE BREAKDOWN | (COMPLEX only) Break project into features | Feature list with dependencies |
| 8-10. SPEC/PLAN/TASKS | Per-feature (COMPLEX) or whole project (SIMPLE) | .specify/spec.md, plan.md, tasks.md |
| 11. IMPLEMENT | Build using TDD (write tests first) | Source code + unit tests |
| 11.5. FEATURE QA | Verify feature's unit tests pass | Test report |
| 11.6. INTER-FEATURE TESTS | (COMPLEX, 2+ features) Run ALL unit tests | Regression check |
| 12. INTEGRATION QA | Full test suite: unit + integration + E2E | Complete quality report |
| 13. DELIVER | Commit, generate final report | Complete project |
The workflow enforces a comprehensive testing strategy:
TESTING PYRAMID
Phase 11   → Feature N unit tests (TDD - write first, then implement)
Phase 11.5 → Feature N unit tests (verify implementation passes)
Phase 11.6 → ALL unit tests (Feature 1 → N) - catch regressions
Phase 12   → ALL unit + integration + E2E tests
Coverage Requirements:
- 80% minimum for all code
- 100% required for financial, auth, and security code
- All features must pass inter-feature regression tests
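A coverage gate with two thresholds might look like the sketch below. How the workflow decides which files count as financial, auth, or security code is not documented here, so the path-keyword rule (`CRITICAL_KEYWORDS`) is purely an assumption, as are the function names and the sample report.

```python
# Assumption: critical code is recognised by keywords in its path.
CRITICAL_KEYWORDS = ("auth", "payment", "billing", "security")


def required_coverage(path: str) -> float:
    """Return 100.0 for critical files and 80.0 for everything else."""
    lowered = path.lower()
    if any(keyword in lowered for keyword in CRITICAL_KEYWORDS):
        return 100.0
    return 80.0


def coverage_failures(report: dict) -> list:
    """Return (path, actual, required) for every file below its threshold."""
    failures = []
    for path, actual in sorted(report.items()):
        needed = required_coverage(path)
        if actual < needed:
            failures.append((path, actual, needed))
    return failures


# Hypothetical per-file line coverage percentages.
report = {
    "src/auth/login.ts": 96.5,
    "src/cart/cart.ts": 84.0,
    "src/utils/format.ts": 71.2,
}

for path, actual, needed in coverage_failures(report):
    print(f"{path}: {actual}% is below the required {needed}%")
```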
your-project/
│
├── .specify/                          # Spec-Kit-Plus artifacts
│   ├── project-analysis.json          # Analysis of existing project
│   ├── requirements-analysis.json     # Parsed requirements
│   ├── gap-analysis.json              # Missing skills/agents/hooks
│   ├── constitution.md                # Project rules and standards
│   ├── spec.md                        # Detailed specification
│   ├── plan.md                        # Implementation plan
│   ├── data-model.md                  # Database schema
│   └── tasks.md                       # Task checklist [X] marked
│
├── .claude/
│   ├── skills/                        # GENERATED for your tech stack
│   │   ├── express-patterns/          # (if Express detected)
│   │   ├── prisma-patterns/           # (if Prisma detected)
│   │   └── react-patterns/            # (if React detected)
│   │
│   ├── agents/                        # GENERATED for your project
│   │   ├── api-builder.md             # (if API project)
│   │   └── frontend-builder.md        # (if frontend project)
│   │
│   ├── hooks/                         # GENERATED for your workflow
│   │   ├── pre-commit.sh
│   │   └── quality-gate.py
│   │
│   ├── logs/autonomous.log            # Build log
│   └── build-reports/                 # Final report
│
└── src/                               # YOUR PROJECT CODE
    ├── (generated source files)
    └── (generated test files)
The workflow is completely self-enforcing with zero human intervention required during execution.
AUTONOMOUS ENFORCEMENT ARCHITECTURE
START
  ↓
PHASE 0: PRE-CHECK (Always runs first)
  • Invoke workflow-validator skill
  • Check all phase artifacts
  • Detect current state (which phase completed)
  • Decision: FRESH START or RESUME
  • Fix any skipped phases (violations)
  ↓
PHASE N: Execute Phase
  • Run phase logic
  • Create phase artifact
  • Log progress
  ↓
AUTO-VALIDATE (Runs after EVERY phase)
  • Check artifact exists
  • Validate content integrity
  • PASS → Proceed to Phase N+1
  • FAIL → SELF-HEAL (max 3x)
      • Re-run phase
      • Check again
      • If still fail: STOP
  ↓
COMPLETE
| Feature | How It Works |
|---|---|
| Auto-Detection | Phase 0 checks all artifacts to know current state |
| Smart Resume | If interrupted, resumes from last completed phase |
| Self-Healing | Failed phases retry automatically (max 3 attempts) |
| Violation Detection | Skipped phases are detected and executed |
| Zero Intervention | No human input needed during execution |
| Quality Gate Teacher | Grades each phase A/B/C/D/F with APPROVED/REJECTED |
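The self-healing behaviour in the table can be sketched as a validate-and-retry loop. The sketch assumes that "max 3" means three re-runs after the initial attempt, which the source does not spell out; `run_with_self_heal` and the flaky demo phase are hypothetical.

```python
def run_with_self_heal(run_phase, validate, max_retries: int = 3):
    """Run a phase and re-run it while validation fails, up to max_retries times.

    Returns the validated artifact and the number of re-runs that were needed.
    Raises RuntimeError when the phase still fails after the last re-run.
    """
    artifact = run_phase()
    retries = 0
    while not validate(artifact):
        if retries == max_retries:
            raise RuntimeError(f"phase still failing after {max_retries} re-runs")
        retries += 1
        artifact = run_phase()
    return artifact, retries


# Demo: a phase that produces a valid artifact only on its third execution.
calls = {"count": 0}


def flaky_phase() -> str:
    calls["count"] += 1
    return "ok" if calls["count"] >= 3 else "broken"


artifact, retries = run_with_self_heal(flaky_phase, lambda result: result == "ok")
print(artifact, retries)
```

Raising on the final failure mirrors the "If still fail: STOP" branch, leaving the caller to report what went wrong.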
Every phase is validated by the Quality Gate Teacher before proceeding:
QUALITY GATE TEACHER
Phase N Output ──▶ TEACHER Evaluate ──▶ Grade + Decision
A (90-100%)     B/C (70-89%)     D/F (<70%)
APPROVED        APPROVED         REJECTED
Continue        Continue         Self-Heal
Grading Criteria (varies by phase):
- Constitution: Clarity, completeness, enforceability
- Spec: Feature coverage, acceptance criteria, technical accuracy
- Plan: Architecture quality, risk assessment, dependency mapping
- Tasks: Granularity, skill mapping, dependency order
- Implementation: Code quality, test coverage, security
- Testing: Pass rate, coverage percentage, regression status
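The three grade bands and their decisions map directly onto a small function. Only the band boundaries stated above are used; where B ends and C begins (or D and F) is not given, so the sketch returns the band as a whole. `gate_decision` is a hypothetical name.

```python
def gate_decision(score: float) -> tuple:
    """Map a 0-100 score to (grade band, decision, next action)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return ("A", "APPROVED", "Continue")
    if score >= 70:
        return ("B/C", "APPROVED", "Continue")
    return ("D/F", "REJECTED", "Self-Heal")


for score in (95, 85, 69.9):
    print(score, gate_decision(score))
```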
Generated skills, agents, and hooks are validated for production-readiness before use:
COMPONENT QUALITY VALIDATION
WHO validates? → component-quality-validator skill
HOW validated? → Automated quality criteria scoring
SKILL: ✓ Triggers, ✓ Workflow, ✓ Templates, ✓ Validation
AGENT: ✓ Model fit, ✓ Min tools, ✓ Clear ins, ✓ Fail plan
HOOK:  ✓ JSON valid, ✓ Bash valid, ✓ Targeted, ✓ No conflict
SCORE: 0-100 → GRADE: A/B/C/D/F → APPROVED or REJECTED
Rejected Components Regeneration:
- Attempt 1: Apply specific fixes from validation report
- Attempt 2: Simplify scope, focus on core functionality
- Attempt 3: Use template from similar working component
- Attempt 4+: Mark MANUAL_REQUIRED, continue with others
Each phase creates a specific artifact. The validator checks these to determine state:
| Phase | Artifact | Detection Command |
|---|---|---|
| 1. INIT | .specify/ directory | [ -d ".specify" ] |
| 2. ANALYZE PROJECT | project-analysis.json | [ -f ".specify/project-analysis.json" ] |
| 3. ANALYZE REQUIREMENTS | requirements-analysis.json | [ -f ".specify/requirements-analysis.json" ] |
| 4. GAP ANALYSIS | gap-analysis.json | [ -f ".specify/gap-analysis.json" ] |
| 5. GENERATE | New skills created | Skill count > baseline |
| 6. TEST | Validation logs | grep "validated" logs |
| 7. CONSTITUTION | constitution.md | [ -f ".specify/constitution.md" ] |
| 7.5 FEATURE BREAKDOWN | features.json | [ -f ".specify/features.json" ] (COMPLEX only) |
| 8. SPEC | spec.md or features/N/spec.md | Spec file exists |
| 9. PLAN | plan.md or features/N/plan.md | Plan file exists |
| 10. TASKS | tasks.md or features/N/tasks.md | Tasks file exists |
| 11. IMPLEMENT | Tasks marked [X] | grep -c "\[X\]" tasks.md |
| 11.5 FEATURE QA | Unit tests pass | Test exit code 0 |
| 11.6 INTER-FEATURE | All unit tests pass | Combined test exit code 0 |
| 12. INTEGRATION QA | Full test suite pass | Unit + Integration + E2E pass |
| 13. DELIVER | Git commit | Commit message contains "autonomous" |
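The file-based rows of this table translate into a few path checks. The sketch below covers only the phases whose artifact is a file or directory under .specify/ (phases detected through logs, test exit codes, or git history are left out), and `completed_phases` is a hypothetical helper, not the workflow-validator skill itself.

```python
import tempfile
from pathlib import Path

# (phase, path relative to .specify/, expected kind) for the file-based rows only.
FILE_BASED_PHASES = [
    ("1. INIT", ".", "dir"),
    ("2. ANALYZE PROJECT", "project-analysis.json", "file"),
    ("3. ANALYZE REQUIREMENTS", "requirements-analysis.json", "file"),
    ("4. GAP ANALYSIS", "gap-analysis.json", "file"),
    ("7. CONSTITUTION", "constitution.md", "file"),
]


def completed_phases(project_root) -> list:
    """Return the phases whose artifact exists under <project_root>/.specify."""
    specify = Path(project_root) / ".specify"
    done = []
    for phase, relative, kind in FILE_BASED_PHASES:
        target = specify / relative
        exists = target.is_dir() if kind == "dir" else target.is_file()
        if exists:
            done.append(phase)
    return done


# Demo in a throwaway directory: only INIT and ANALYZE PROJECT have artifacts.
with tempfile.TemporaryDirectory() as root:
    specify_dir = Path(root) / ".specify"
    specify_dir.mkdir()
    (specify_dir / "project-analysis.json").write_text("{}")
    demo_result = completed_phases(root)

print(demo_result)
```

A gap in the returned list (for example CONSTITUTION present without GAP ANALYSIS) is the kind of skipped phase that the table's detection commands are meant to expose.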
Check workflow state anytime:
# Quick status check - see which phase you're at
claude "/q-status"
# Full validation - check for violations
claude "/q-validate"
# Reset workflow - start fresh
claude "/q-reset"
Example /q-status output:
WORKFLOW STATUS REPORT
[✓] 1. INIT
[✓] 2. ANALYZE PROJECT
[✓] 3. ANALYZE REQUIREMENTS
[✓] 4. GAP ANALYSIS
[✓] 5. GENERATE ← CURRENT
[ ] 6. TEST
[ ] 7. CONSTITUTION
[ ] 8. SPEC
[ ] 9. PLAN
[ ] 10. TASKS
[ ] 11. IMPLEMENT
[ ] 12. QA
[ ] 13. DELIVER
Violations: NONE
Next: Generate missing skills (express-patterns, etc.)
For complex projects (4+ features), the workflow iterates feature by feature in dependency order:
COMPLEX PROJECT FEATURE ITERATION
CONSTITUTION (ONE for entire project - Phase 7)
  ↓
FEATURE BREAKDOWN (Phase 7.5)
  - Extract features from requirements
  - Map dependencies between features
  - Order features by dependency graph
  ↓
FOR EACH FEATURE (in dependency order):
  8. SPEC (feature-specific)
  9. PLAN (feature-specific)
  10. TASKS (feature-specific)
  11. IMPLEMENT (TDD: tests first, then code)
  11.5 FEATURE QA (unit tests for this feature)
  11.6 INTER-FEATURE TESTS (all unit tests, if feature 2+)
  ↓ (next feature)
12. INTEGRATION QA (Full test suite across all features)
  - Unit tests (all features)
  - Integration tests (feature interactions)
  - E2E tests (user journeys)
  ↓
13. DELIVER
Feature 1: User Auth → Spec → Plan → Tasks → Implement → Unit Tests ✓
Feature 2: Products → Spec → Plan → Tasks → Implement → Unit Tests → Inter-Feature Tests ✓
Feature 3: Cart → Spec → Plan → Tasks → Implement → Unit Tests → Inter-Feature Tests ✓
Feature 4: Orders → Spec → Plan → Tasks → Implement → Unit Tests → Inter-Feature Tests ✓
Feature 5: Payments → Spec → Plan → Tasks → Implement → Unit Tests → Inter-Feature Tests ✓
↓
INTEGRATION QA (All features together)
↓
DELIVER
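Ordering features by their dependency graph is a topological sort. The sketch below uses Kahn's algorithm on a plain dictionary; the dependency map is invented for illustration (the source does not say which feature depends on which), and `order_features` is a hypothetical name.

```python
from collections import deque


def order_features(dependencies: dict) -> list:
    """Order features so each one comes after everything it depends on.

    `dependencies` maps a feature name to the names it requires.
    Raises ValueError on a cycle or on a dependency that is not a known feature.
    """
    remaining = {name: set(deps) for name, deps in dependencies.items()}
    ready = deque(name for name, deps in remaining.items() if not deps)
    ordered = []
    while ready:
        name = ready.popleft()
        ordered.append(name)
        for other, deps in remaining.items():
            if name in deps:
                deps.remove(name)
                if not deps:  # last unmet dependency just got satisfied
                    ready.append(other)
    if len(ordered) != len(remaining):
        raise ValueError("cyclic or unknown feature dependency")
    return ordered


# Hypothetical dependency map for the five example features.
dependencies = {
    "User Auth": [],
    "Products": [],
    "Cart": ["User Auth", "Products"],
    "Orders": ["Cart"],
    "Payments": ["Orders"],
}

print(order_features(dependencies))
```

Features without unmet dependencies are taken in the order they were declared, so the result is deterministic for a given requirements file.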
# Project Name
## Features
- Feature 1
- Feature 2
## Technical
- Stack item 1
- Stack item 2
# E-Commerce Platform
## Overview
A full-featured e-commerce platform for small businesses.
## Features
### User Management
- User registration with email verification
- Login with JWT authentication
- Password reset flow
- User profile management
### Product Catalog
- Product CRUD operations
- Category management
- Search with filters
- Image upload
### Shopping Cart
- Add/remove items
- Quantity management
- Persistent cart (database)
### Orders
- Checkout flow
- Order history
- Order status tracking
## Technical
### Backend
- Runtime: Node.js 20
- Framework: Express
- Database: PostgreSQL
- ORM: Prisma
- Auth: JWT + bcrypt
### Frontend
- Framework: Next.js 14
- Styling: Tailwind CSS
- State: Zustand
### Testing
- Unit: Jest
- E2E: Playwright
### Deployment
- Docker
- Railway/Vercel
## Constraints
- Must be mobile-responsive
- Must support 1000 concurrent users
- Must have 80%+ test coverage
The boilerplate comes with pre-loaded components that work out of the box:
| Agent | Model | Purpose |
|---|---|---|
| architect | Opus | System design decisions |
| planner | Opus | Creates implementation plans |
| security-reviewer | Opus | OWASP Top 10 checks |
| tdd-guide | Opus | Test-driven development |
| code-reviewer | Sonnet | Quality & security review |
| build-error-resolver | Sonnet | Fix build errors |
| e2e-runner | Sonnet | Playwright E2E tests |
| refactor-cleaner | Sonnet | Remove dead code |
| doc-updater | Sonnet | Update documentation |
| test-runner | Sonnet | Run tests |
| git-ops | Haiku | Git commits, pushes, status |
| file-ops | Haiku | File listing, searching, moving |
| format-checker | Haiku | Prettier, ESLint, formatting |
DYNAMIC MODEL SELECTION
OPUS (Complex)              SONNET (Medium)          HAIKU (Light)
• Architecture              • Code writing           • Git operations
• Security analysis         • Code review            • File operations
• Multi-phase planning      • Test writing           • Formatting
• Constitution              • Build fixes            • Simple validations
Cost: $$$$$                 Cost: $$$$               Cost: $$$
~10 calls/project           ~100 calls/project       ~50 calls/project
| Skill | What It Contains |
|---|---|
| coding-standards | TypeScript/JS/React patterns, immutability, file organization |
| backend-patterns | API design, services, repository pattern |
| testing-patterns | Jest/Vitest/Playwright, TDD workflow, 80% coverage |
| api-patterns | REST/GraphQL design, input validation |
| database-patterns | Prisma/SQL/migrations, query optimization |
| claudeception | Session learning, skill extraction |
| mcp-code-execution-template | MCP integration, code execution pattern |
| skill-gap-analyzer | Detect missing skills, auto-generate needed components |
| workflow-validator | Quality gate teacher, component utilization enforcement |
| component-quality-validator | Validate generated components (A-F grading) |
| speckit-initialization | Spec-Kit-Plus setup and verification |
| workflow-state-manager | Workflow state tracking and persistence |
Plus 3 more technology-specific skills that may be generated based on your project requirements.
Core Autonomous Workflow:
| Command | What It Does |
|---|---|
| /sp.autonomous | Full autonomous build from requirements file |
| /sp.specify | Create feature specification |
| /sp.plan | Generate implementation plan |
| /sp.tasks | Break down into actionable tasks |
| /sp.implement | Execute implementation with TDD |
| /sp.adr | Create Architecture Decision Record |
| /sp.phr | Record Prompt History (automatic) |
| /sp.constitution | Create/update project constitution |
| /sp.clarify | Ask clarifying questions |
| /sp.checklist | Generate quality checklist |
| /sp.analyze | Analyze codebase structure |
| /sp.reverse-engineer | Generate docs from code |
| /sp.git.commit_pr | Commit changes and create PR |
| /sp.taskstoissues | Convert tasks to GitHub issues |
Workflow Validation:
| Command | What It Does |
|---|---|
| /q-status | Check workflow state - which phase you're at |
| /q-validate | Validate workflow order, detect violations, check component utilization |
| /q-reset | Reset workflow state for fresh start |
| /validate-workflow | Run full workflow validation with quality gates |
| /validate-components | Check component quality (A-F grading) |
Development Commands:
| Command | What It Does |
|---|---|
| /plan | Create implementation plan |
| /tdd | Test-driven development |
| /code-review | Security + quality review |
| /build-fix | Fix build errors incrementally |
| /e2e | Generate and run E2E tests |
| /refactor-clean | Remove dead code safely |
| /test-coverage | Analyze and improve coverage |
| /update-codemaps | Update architecture documentation |
| /update-docs | Sync documentation from source |
Plus the FPF (First Principles Framework) commands: /q0-init, /q1-hypothesize, /q2-verify, /q3-validate, /q4-audit, /q5-decide
Don't want full autonomous mode? Use individual commands:
# Start with a plan
> /plan I want to add user authentication
# Claude creates plan, WAITS for approval
> looks good, proceed
# Claude implements with TDD
# Then review
> /code-review
# Fix any issues
> /build-fix
# Commit
> commit these changes
> /tdd
# Claude:
# 1. Writes failing test (RED)
# 2. Implements code (GREEN)
# 3. Refactors (IMPROVE)
# 4. Verifies 80%+ coverage
autonomous-agent-boilerplate/
│
├── CLAUDE.md                          # Instructions for Claude (Golden Rules)
├── README.md                          # This documentation
├── .mcp.json                          # MCP server configuration (6 servers)
│
├── .claude/                           # Claude Code configuration
│   ├── settings.json                  # Permissions + environment variables
│   ├── hooks.json                     # Automation hooks (PreToolUse/PostToolUse/Stop)
│   ├── skill-rules.json               # Dynamic skill activation rules (18 patterns)
│   │
│   ├── agents/                        # 13 specialized agents
│   │   ├── architect.md               # (opus) System design
│   │   ├── planner.md                 # (opus) Implementation planning
│   │   ├── security-reviewer.md       # (opus) OWASP security analysis
│   │   ├── tdd-guide.md               # (opus) Test-driven development
│   │   ├── code-reviewer.md           # (sonnet) Code quality review
│   │   ├── build-error-resolver.md    # (sonnet) Fix build errors
│   │   ├── e2e-runner.md              # (sonnet) Playwright E2E tests
│   │   ├── refactor-cleaner.md        # (sonnet) Dead code removal
│   │   ├── doc-updater.md             # (sonnet) Documentation sync
│   │   ├── test-runner.md             # (sonnet) Test execution
│   │   ├── git-ops.md                 # (haiku) Git operations
│   │   ├── file-ops.md                # (haiku) File operations
│   │   └── format-checker.md          # (haiku) Prettier/ESLint
│   │
│   ├── commands/                      # 15 slash commands
│   │   ├── sp.autonomous.md           # Full autonomous build (57KB)
│   │   ├── q-status.md                # Workflow status
│   │   ├── q-validate.md              # Workflow validation
│   │   ├── q-reset.md                 # Reset workflow
│   │   ├── plan.md                    # Implementation planning
│   │   ├── tdd.md                     # Test-driven development
│   │   ├── code-review.md             # Code review
│   │   ├── build-fix.md               # Build error fixing
│   │   ├── e2e.md                     # E2E testing
│   │   ├── refactor-clean.md          # Dead code cleanup
│   │   ├── test-coverage.md           # Coverage analysis
│   │   ├── update-codemaps.md         # Architecture docs
│   │   ├── update-docs.md             # Documentation sync
│   │   ├── validate-workflow.md       # Workflow validation
│   │   └── validate-components.md     # Component quality check
│   │
│   ├── skills/                        # 10 reusable skills
│   │   ├── api-patterns/              # REST/GraphQL patterns
│   │   ├── backend-patterns/          # Backend architecture
│   │   ├── coding-standards/          # Code quality patterns
│   │   ├── database-patterns/         # Database/ORM patterns
│   │   ├── testing-patterns/          # Testing patterns
│   │   ├── claudeception/             # Session learning
│   │   ├── mcp-code-execution-template/   # MCP integration
│   │   ├── skill-gap-analyzer/        # Gap analysis
│   │   ├── workflow-validator/        # Quality gate + component utilization
│   │   └── component-quality-validator/   # Production-readiness check
│   │
│   ├── rules/                         # 8 governance rules
│   │   ├── agents.md                  # Agent orchestration
│   │   ├── coding-style.md            # Immutability, file organization
│   │   ├── git-workflow.md            # Conventional commits
│   │   ├── hooks.md                   # Hook system docs
│   │   ├── patterns.md                # API/service patterns
│   │   ├── performance.md             # Model selection
│   │   ├── security.md                # OWASP Top 10
│   │   └── testing.md                 # 80% coverage, TDD
│   │
│   ├── hooks/                         # Hook scripts
│   │   ├── skill-activator.sh         # Dynamic skill activation
│   │   ├── skill-enforcement-stop.sh  # End-of-session enforcement
│   │   └── claudeception-activator.sh # Session learning
│   │
│   └── logs/                          # Activity logs
│       ├── agent-usage.log            # Task tool invocations
│       ├── skill-invocations.log      # Skill() calls
│       ├── skill-activations.log      # Skill activator output
│       ├── skill-enforcement.log      # Enforcement decisions
│       ├── tool-usage.log             # Write/Edit operations
│       └── file-changes.log           # File modifications
│
├── .specify/                          # Spec-Kit-Plus workflow artifacts
│   ├── project-analysis.json          # Existing project analysis
│   ├── requirements-analysis.json     # Requirements parsing
│   ├── gap-analysis.json              # Missing components
│   ├── constitution.md                # Project rules
│   ├── spec.md                        # Specification
│   ├── plan.md                        # Implementation plan
│   ├── tasks.md                       # Task checklist
│   ├── templates/                     # Feature templates
│   └── validations/                   # Phase validation reports
│
└── requirements/                      # Your requirements files
    └── my-app.md                      # Example requirements
| Rule | Enforcement |
|---|---|
| Immutability | No direct mutation allowed |
| File Size | Max 800 lines per file |
| Test Coverage | Minimum 80% |
| Security | OWASP Top 10 checked |
| Code Quality | Auto-formatted, reviewed |
The boilerplate includes comprehensive logging to track what's happening during autonomous builds.
All logs are stored in .claude/logs/:
| Log File | What It Tracks | Example Entry |
|---|---|---|
| agent-usage.log | Task tool invocations (agent spawning) | [2024-01-21T10:30:00] Agent task invoked |
| skill-invocations.log | Skill() tool calls | [2024-01-21T10:31:00] Skill invoked: coding-standards |
| skill-activations.log | Skill activator hook output | Prompt: "build api" \| Matched: api-patterns |
| skill-enforcement.log | Enforcement decisions | MANDATORY skill not used: testing-patterns |
| tool-usage.log | Write/Edit tool calls | [2024-01-21T10:32:00] Tool: Edit \| File: src/api.ts |
| file-changes.log | File modifications | [2024-01-21T10:32:00] File modified: src/api.ts |
# View recent skill invocations
tail -20 .claude/logs/skill-invocations.log
# Watch agent usage in real-time
tail -f .claude/logs/agent-usage.log
# Check which skills were activated
grep "DETECTED MATCHES" .claude/logs/skill-activations.log
# See enforcement decisions
cat .claude/logs/skill-enforcement.log
# Count skill usage
wc -l .claude/logs/skill-invocations.log
The workflow-validator skill tracks whether custom components are being used:
# Check component utilization during a build
claude "/q-validate"
This shows:
- Skills Used: Which skills were invoked via Skill(name)
- Agents Used: Which agents were spawned via Task(subagent_type)
- Utilization %: Percentage of available components that were used
- Bypass Detection: Warning if general agent did work without using custom components
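The utilization percentage is a simple ratio of used to available components. The sketch below assumes the figure is rounded to a whole percent; the function names and the sample counts (10 skills with 4 used, 13 agents with 3 used) are illustrative only.

```python
def utilization(available: int, used: int) -> int:
    """Return the share of available components that were used, as a whole percent."""
    if available <= 0:
        return 0
    return round(100 * used / available)


def utilization_rows(counts: dict) -> list:
    """Build (category, available, used, percentage) rows for a report table."""
    return [
        (category, available, used, utilization(available, used))
        for category, (available, used) in counts.items()
    ]


# Illustrative counts: category -> (available, used).
counts = {"Skills": (10, 4), "Agents": (13, 3)}

for category, available, used, percent in utilization_rows(counts):
    print(f"| {category} | {available} | {used} | {percent}% |")
```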
After each phase, validation reports are generated in .specify/validations/:
# List validation reports
ls .specify/validations/
# View a specific phase report
cat .specify/validations/phase-11-report.md
Example report structure:
# Phase 11 Validation Report
## Summary
| Field | Value |
|-------|-------|
| Phase | 11: IMPLEMENT |
| Grade | B |
| Score | 85/100 |
| Status | APPROVED |
## Component Utilization
| Category | Available | Used | Percentage |
|----------|-----------|------|------------|
| Skills | 10 | 4 | 40% |
| Agents | 13 | 3 | 23% |
## Issues Found
- Missing skill invocation: testing-patterns
## Decision
✅ APPROVED - Proceeding to Phase 12
If a phase is reset due to component bypass:
# Check reset counter
cat .specify/validations/reset-counter.json
# View bypass log
cat .specify/validations/bypass-log.json
The following environment variables control workflow behavior:
| Variable | Value | Purpose |
|---|---|---|
| AUTONOMOUS_MODE | true | Enable full autonomous execution |
| MAX_SELF_HEAL_RETRIES | 3 | Max retry attempts per phase |
| SKILL_ENFORCEMENT | strict | Enforce skill usage (strict/warn/off) |
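A script that needs these settings could read them as shown in this sketch. It assumes the values in the table act as defaults when a variable is unset, which the source does not state; `load_workflow_settings` is a hypothetical helper, not something the boilerplate ships.

```python
import os

ALLOWED_ENFORCEMENT = {"strict", "warn", "off"}


def load_workflow_settings(env=None) -> dict:
    """Read the three workflow variables, falling back to the table's values."""
    env = os.environ if env is None else env
    enforcement = env.get("SKILL_ENFORCEMENT", "strict")
    if enforcement not in ALLOWED_ENFORCEMENT:
        raise ValueError(f"SKILL_ENFORCEMENT must be one of {sorted(ALLOWED_ENFORCEMENT)}")
    return {
        "autonomous_mode": env.get("AUTONOMOUS_MODE", "true").lower() == "true",
        "max_self_heal_retries": int(env.get("MAX_SELF_HEAL_RETRIES", "3")),
        "skill_enforcement": enforcement,
    }


# Passing a plain dict keeps the demo independent of the real environment.
print(load_workflow_settings({}))
print(load_workflow_settings({"SKILL_ENFORCEMENT": "warn", "MAX_SELF_HEAL_RETRIES": "5"}))
```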
Create .claude/skills/my-skill/SKILL.md:
---
name: my-skill
description: Description of what this skill does
allowed-tools: Read, Write, Edit, Bash
---
# My Custom Skill
## Patterns
...
Create .claude/agents/my-agent.md:
---
name: my-agent
description: What this agent does
tools: Read, Write, Edit, Bash
model: sonnet
---
Instructions for the agent...
Create .claude/commands/my-command.md:
---
description: What this command does
---
Instructions executed when /my-command is called...
ls .claude/commands/
# Should show plan.md, tdd.md, sp.autonomous.md, etc.
Fix: Ensure .claude/ folder is in your project.
> /build-fix
Self-heals up to 3 times, then asks for help.
Check .mcp.json has correct API keys:
{
"mcpServers": {
"github": {
"env": {
"GITHUB_PERSONAL_ACCESS_TOKEN": "your-token"
}
}
}
}
Normal template: Pre-made files you adapt to your project. This boilerplate: Generates custom infrastructure for YOUR specific requirements.
Minimal requirements work, but more detail = better results. The boilerplate extracts technologies, features, and constraints from your requirements file.
Yes! Copy .claude/, CLAUDE.md, and .mcp.json to your project. Then use /plan for new features.
It self-heals up to 3 times. If still failing, it stops and reports what went wrong. You can then use manual commands (/plan, /tdd, /build-fix) to continue.
Yes! Edit .claude/rules/coding-style.md to change patterns, file size limits, naming conventions, etc.
MIT License - use freely in personal and commercial projects.
- Inspired by everything-claude-code by @affaan-m
- Built for Claude Code by Anthropic
Write requirements. Run one command. Ship code.
claude "/sp.autonomous requirements/my-app.md"