feat: basecamp-project skill

m3tm3re
2026-04-24 20:00:33 +02:00
parent 0ad41acb03
commit 6e0e847299
211 changed files with 46029 additions and 2592 deletions

.pi/gsd/templates/DEBUG.md Normal file

@@ -0,0 +1,164 @@
# Debug Template
Template for `.planning/debug/[slug].md` - active debug session tracking.
---
## File Template
```markdown
---
status: gathering | investigating | fixing | verifying | awaiting_human_verify | resolved
trigger: "[verbatim user input]"
created: [ISO timestamp]
updated: [ISO timestamp]
---
## Current Focus
<!-- OVERWRITE on each update - always reflects NOW -->
hypothesis: [current theory being tested]
test: [how testing it]
expecting: [what result means if true/false]
next_action: [immediate next step]
## Symptoms
<!-- Written during gathering, then immutable -->
expected: [what should happen]
actual: [what actually happens]
errors: [error messages if any]
reproduction: [how to trigger]
started: [when it broke / always broken]
## Eliminated
<!-- APPEND only - prevents re-investigating after /new -->
- hypothesis: [theory that was wrong]
evidence: [what disproved it]
timestamp: [when eliminated]
## Evidence
<!-- APPEND only - facts discovered during investigation -->
- timestamp: [when found]
checked: [what was examined]
found: [what was observed]
implication: [what this means]
## Resolution
<!-- OVERWRITE as understanding evolves -->
root_cause: [empty until found]
fix: [empty until applied]
verification: [empty until verified]
files_changed: []
```
---
<section_rules>
**Frontmatter (status, trigger, timestamps):**
- `status`: OVERWRITE - reflects current phase
- `trigger`: IMMUTABLE - verbatim user input, never changes
- `created`: IMMUTABLE - set once
- `updated`: OVERWRITE - update on every change
**Current Focus:**
- OVERWRITE entirely on each update
- Always reflects what the agent is doing RIGHT NOW
- If the agent reads this after /new, it knows exactly where to resume
- Fields: hypothesis, test, expecting, next_action
**Symptoms:**
- Written during initial gathering phase
- IMMUTABLE after gathering complete
- Reference point for what we're trying to fix
- Fields: expected, actual, errors, reproduction, started
**Eliminated:**
- APPEND only - never remove entries
- Prevents re-investigating dead ends after context reset
- Each entry: hypothesis, evidence that disproved it, timestamp
- Critical for efficiency across /new boundaries
**Evidence:**
- APPEND only - never remove entries
- Facts discovered during investigation
- Each entry: timestamp, what checked, what found, implication
- Builds the case for root cause
**Resolution:**
- OVERWRITE as understanding evolves
- May update multiple times as fixes are tried
- Final state shows confirmed root cause and verified fix
- Fields: root_cause, fix, verification, files_changed
</section_rules>
<lifecycle>
**Creation:** Immediately when /gsd-debug is called
- Create file with trigger from user input
- Set status to "gathering"
- Current Focus: next_action = "gather symptoms"
- Symptoms: empty, to be filled
**During symptom gathering:**
- Update Symptoms section as user answers questions
- Update Current Focus with each question
- When complete: status → "investigating"
**During investigation:**
- OVERWRITE Current Focus with each hypothesis
- APPEND to Evidence with each finding
- APPEND to Eliminated when hypothesis disproved
- Update timestamp in frontmatter
**During fixing:**
- status → "fixing"
- Update Resolution.root_cause when confirmed
- Update Resolution.fix when applied
- Update Resolution.files_changed
**During verification:**
- status → "verifying"
- Update Resolution.verification with results
- If verification fails: status → "investigating", try again
**After self-verification passes:**
- status → "awaiting_human_verify"
- Request explicit user confirmation in a checkpoint
- Do NOT move file to resolved yet
**On resolution:**
- status → "resolved"
- Move file to .planning/debug/resolved/ (only after user confirms fix)
</lifecycle>
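The status progression above can be captured as a small transition table — a sketch only, with the map distilled from the lifecycle prose ("verifying" may loop back to "investigating" on failure); `can_transition` is a hypothetical helper, not part of the toolchain:

```python
# Allowed status transitions, distilled from the lifecycle section above.
TRANSITIONS = {
    "gathering": {"investigating"},
    "investigating": {"fixing"},
    "fixing": {"verifying"},
    "verifying": {"awaiting_human_verify", "investigating"},  # failure loops back
    "awaiting_human_verify": {"resolved"},
    "resolved": set(),
}

def can_transition(current: str, new: str) -> bool:
    """Check whether a status change is allowed by the lifecycle."""
    return new in TRANSITIONS.get(current, set())
```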
<resume_behavior>
When the agent reads this file after /new:
1. Parse frontmatter → know status
2. Read Current Focus → know exactly what was happening
3. Read Eliminated → know what NOT to retry
4. Read Evidence → know what's been learned
5. Continue from next_action
The file IS the debugging brain. The agent should be able to resume perfectly from any interruption point.
</resume_behavior>
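The five resume steps can be sketched as a tiny parser — a minimal illustration assuming plain `key: value` frontmatter lines; `parse_debug_state` is a hypothetical helper, not part of the toolchain:

```python
import re

def parse_debug_state(text: str) -> dict:
    """Extract frontmatter fields and the resume point from a debug file."""
    state = {}
    # Frontmatter sits between the first pair of '---' lines.
    m = re.search(r"^---\n(.*?)\n---", text, re.DOTALL)
    if m:
        for line in m.group(1).splitlines():
            key, sep, value = line.partition(":")
            if sep:
                state[key.strip()] = value.strip()
    # Current Focus is overwritten on each update; next_action says where to resume.
    focus = re.search(r"next_action:\s*(.+)", text)
    if focus:
        state["next_action"] = focus.group(1).strip()
    return state
```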
<size_constraint>
Keep debug files focused:
- Evidence entries: 1-2 lines each, just the facts
- Eliminated: brief - hypothesis + why it failed
- No narrative prose - structured data only
If evidence grows very large (10+ entries), consider whether you're going in circles. Check Eliminated to ensure you're not re-treading.
</size_constraint>

.pi/gsd/templates/UAT.md Normal file

@@ -0,0 +1,265 @@
# UAT Template
Template for `.planning/phases/XX-name/{phase_num}-UAT.md` - persistent UAT session tracking.
---
## File Template
```markdown
---
status: testing | partial | complete | diagnosed
phase: XX-name
source: [list of SUMMARY.md files tested]
started: [ISO timestamp]
updated: [ISO timestamp]
---
## Current Test
<!-- OVERWRITE each test - shows where we are -->
number: [N]
name: [test name]
expected: |
[what user should observe]
awaiting: user response
## Tests
### 1. [Test Name]
expected: [observable behavior - what user should see]
result: [pending]
### 2. [Test Name]
expected: [observable behavior]
result: pass
### 3. [Test Name]
expected: [observable behavior]
result: issue
reported: "[verbatim user response]"
severity: major
### 4. [Test Name]
expected: [observable behavior]
result: skipped
reason: [why skipped]
### 5. [Test Name]
expected: [observable behavior]
result: blocked
blocked_by: server | physical-device | release-build | third-party | prior-phase
reason: [why blocked]
...
## Summary
total: [N]
passed: [N]
issues: [N]
pending: [N]
skipped: [N]
blocked: [N]
## Gaps
<!-- YAML format for plan-phase --gaps consumption -->
- truth: "[expected behavior from test]"
status: failed
reason: "User reported: [verbatim response]"
severity: blocker | major | minor | cosmetic
test: [N]
root_cause: "" # Filled by diagnosis
artifacts: [] # Filled by diagnosis
missing: [] # Filled by diagnosis
debug_session: "" # Filled by diagnosis
```
---
<section_rules>
**Frontmatter:**
- `status`: OVERWRITE - "testing", "partial", "complete", or "diagnosed"
- `phase`: IMMUTABLE - set on creation
- `source`: IMMUTABLE - SUMMARY files being tested
- `started`: IMMUTABLE - set on creation
- `updated`: OVERWRITE - update on every change
**Current Test:**
- OVERWRITE entirely on each test transition
- Shows which test is active and what's awaited
- On completion: "[testing complete]"
**Tests:**
- Each test: OVERWRITE result field when user responds
- `result` values: [pending], pass, issue, skipped, blocked
- If issue: add `reported` (verbatim) and `severity` (inferred)
- If skipped: add `reason` if provided
- If blocked: add `blocked_by` (tag) and `reason` (if provided)
**Summary:**
- OVERWRITE counts after each response
- Tracks: total, passed, issues, pending, skipped, blocked
**Gaps:**
- APPEND only when issue found (YAML format)
- After diagnosis: fill `root_cause`, `artifacts`, `missing`, `debug_session`
- This section feeds directly into /gsd-plan-phase --gaps
</section_rules>
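Recomputing the Summary counts after each response is mechanical — a sketch, assuming results are collected as the literal strings used in the template (`summarize` is a hypothetical helper):

```python
from collections import Counter

def summarize(results: list) -> dict:
    """Recompute the Summary block from per-test result values."""
    counts = Counter(results)
    return {
        "total": len(results),
        "passed": counts["pass"],
        "issues": counts["issue"],
        "pending": counts["[pending]"],  # unanswered tests keep the literal placeholder
        "skipped": counts["skipped"],
        "blocked": counts["blocked"],
    }
```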
<diagnosis_lifecycle>
**After testing complete (status: complete), if gaps exist:**
1. User runs diagnosis (from verify-work offer or manually)
2. diagnose-issues workflow spawns parallel debug agents
3. Each agent investigates one gap, returns root cause
4. UAT.md Gaps section updated with diagnosis:
- Each gap gets `root_cause`, `artifacts`, `missing`, `debug_session` filled
5. status → "diagnosed"
6. Ready for /gsd-plan-phase --gaps with root causes
**After diagnosis:**
```yaml
## Gaps
- truth: "Comment appears immediately after submission"
status: failed
reason: "User reported: works but doesn't show until I refresh the page"
severity: major
test: 2
root_cause: "useEffect in CommentList.tsx missing commentCount dependency"
artifacts:
- path: "src/components/CommentList.tsx"
issue: "useEffect missing dependency"
missing:
- "Add commentCount to useEffect dependency array"
debug_session: ".planning/debug/comment-not-refreshing.md"
```
</diagnosis_lifecycle>
<lifecycle>
**Creation:** When /gsd-verify-work starts new session
- Extract tests from SUMMARY.md files
- Set status to "testing"
- Current Test points to test 1
- All tests have result: [pending]
**During testing:**
- Present test from Current Test section
- User responds with pass confirmation or issue description
- Update test result (pass/issue/skipped)
- Update Summary counts
- If issue: append to Gaps section (YAML format), infer severity
- Move Current Test to next pending test
**On completion:**
- status → "complete"
- Current Test → "[testing complete]"
- Commit file
- Present summary with next steps
**Partial completion:**
- status → "partial" (if pending, blocked, or unresolved skipped tests remain)
- Current Test → "[testing paused - {N} items outstanding]"
- Commit file
- Present summary with outstanding items highlighted
**Resuming partial session:**
- `/gsd-verify-work {phase}` picks up from first pending/blocked test
- When all items resolved, status advances to "complete"
**Resume after /new:**
1. Read frontmatter → know phase and status
2. Read Current Test → know where we are
3. Find first [pending] result → continue from there
4. Summary shows progress so far
</lifecycle>
<severity_guide>
Severity is INFERRED from user's natural language, never asked.
| User describes | Infer |
| ------------------------------------------------------ | -------- |
| Crash, error, exception, fails completely, unusable | blocker |
| Doesn't work, nothing happens, wrong behavior, missing | major |
| Works but..., slow, weird, minor, small issue | minor |
| Color, font, spacing, alignment, visual, looks off | cosmetic |
Default: **major** (safe default, user can clarify if wrong)
</severity_guide>
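The inference table can be approximated with ordered keyword matching — a sketch only; the keyword lists paraphrase the table above and `infer_severity` is illustrative, not the actual implementation:

```python
# Checked in order; first match wins. Keywords paraphrase the severity table.
SEVERITY_KEYWORDS = [
    ("blocker", ["crash", "error", "exception", "fails completely", "unusable"]),
    ("major", ["doesn't work", "nothing happens", "wrong", "missing"]),
    ("minor", ["works but", "slow", "weird", "small issue"]),
    ("cosmetic", ["color", "font", "spacing", "alignment", "visual", "looks off"]),
]

def infer_severity(report: str) -> str:
    """Infer severity from a user's natural-language issue report."""
    text = report.lower()
    for severity, keywords in SEVERITY_KEYWORDS:
        if any(k in text for k in keywords):
            return severity
    return "major"  # safe default per the guide
```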
<good_example>
```markdown
---
status: diagnosed
phase: 04-comments
source: 04-01-SUMMARY.md, 04-02-SUMMARY.md
started: 2025-01-15T10:30:00Z
updated: 2025-01-15T10:45:00Z
---
## Current Test
[testing complete]
## Tests
### 1. View Comments on Post
expected: Comments section expands, shows count and comment list
result: pass
### 2. Create Top-Level Comment
expected: Submit comment via rich text editor, appears in list with author info
result: issue
reported: "works but doesn't show until I refresh the page"
severity: major
### 3. Reply to a Comment
expected: Click Reply, inline composer appears, submit shows nested reply
result: pass
### 4. Visual Nesting
expected: 3+ level thread shows indentation, left borders, caps at reasonable depth
result: pass
### 5. Delete Own Comment
expected: Click delete on own comment, removed or shows [deleted] if has replies
result: pass
### 6. Comment Count
expected: Post shows accurate count, increments when adding comment
result: pass
## Summary
total: 6
passed: 5
issues: 1
pending: 0
skipped: 0
blocked: 0
## Gaps
- truth: "Comment appears immediately after submission in list"
status: failed
reason: "User reported: works but doesn't show until I refresh the page"
severity: major
test: 2
root_cause: "useEffect in CommentList.tsx missing commentCount dependency"
artifacts:
- path: "src/components/CommentList.tsx"
issue: "useEffect missing dependency"
missing:
- "Add commentCount to useEffect dependency array"
debug_session: ".planning/debug/comment-not-refreshing.md"
```
</good_example>


@@ -0,0 +1,100 @@
---
phase: {N}
slug: {phase-slug}
status: draft
shadcn_initialized: false
preset: none
created: {date}
---
# Phase {N} - UI Design Contract
> Visual and interaction contract for frontend phases. Generated by gsd-ui-researcher, verified by gsd-ui-checker.
---
## Design System
| Property | Value |
| ----------------- | ----------------------------------- |
| Tool | {shadcn / none} |
| Preset | {preset string or "not applicable"} |
| Component library | {radix / base-ui / none} |
| Icon library | {library} |
| Font | {font} |
---
## Spacing Scale
Declared values (must be multiples of 4):
| Token | Value | Usage |
| ----- | ----- | ------------------------- |
| xs | 4px | Icon gaps, inline padding |
| sm | 8px | Compact element spacing |
| md | 16px | Default element spacing |
| lg | 24px | Section padding |
| xl | 32px | Layout gaps |
| 2xl | 48px | Major section breaks |
| 3xl | 64px | Page-level spacing |
Exceptions: {list any, or "none"}
---
## Typography
| Role | Size | Weight | Line Height |
| ------- | ---- | -------- | ----------- |
| Body | {px} | {weight} | {ratio} |
| Label | {px} | {weight} | {ratio} |
| Heading | {px} | {weight} | {ratio} |
| Display | {px} | {weight} | {ratio} |
---
## Color
| Role | Value | Usage |
| --------------- | ----- | ----------------------------- |
| Dominant (60%) | {hex} | Background, surfaces |
| Secondary (30%) | {hex} | Cards, sidebar, nav |
| Accent (10%) | {hex} | {list specific elements only} |
| Destructive | {hex} | Destructive actions only |
Accent reserved for: {explicit list - never "all interactive elements"}
---
## Copywriting Contract
| Element | Copy |
| ------------------------ | ---------------------------------- |
| Primary CTA | {specific verb + noun} |
| Empty state heading | {copy} |
| Empty state body | {copy + next step} |
| Error state | {problem + solution path} |
| Destructive confirmation | {action name}: {confirmation copy} |
---
## Registry Safety
| Registry | Blocks Used | Safety Gate |
| ------------------ | ----------- | --------------------------- |
| shadcn official | {list} | not required |
| {third-party name} | {list} | shadcn view + diff required |
---
## Checker Sign-Off
- [ ] Dimension 1 Copywriting: PASS
- [ ] Dimension 2 Visuals: PASS
- [ ] Dimension 3 Color: PASS
- [ ] Dimension 4 Typography: PASS
- [ ] Dimension 5 Spacing: PASS
- [ ] Dimension 6 Registry Safety: PASS
**Approval:** {pending / approved YYYY-MM-DD}


@@ -0,0 +1,76 @@
---
phase: {N}
slug: {phase-slug}
status: draft
nyquist_compliant: false
wave_0_complete: false
created: {date}
---
# Phase {N} - Validation Strategy
> Per-phase validation contract for feedback sampling during execution.
---
## Test Infrastructure
| Property | Value |
| ---------------------- | --------------------------------------------------- |
| **Framework** | {pytest 7.x / jest 29.x / vitest / go test / other} |
| **Config file** | {path or "none - Wave 0 installs"} |
| **Quick run command** | `{quick command}` |
| **Full suite command** | `{full command}` |
| **Estimated runtime** | ~{N} seconds |
---
## Sampling Rate
- **After every task commit:** Run `{quick run command}`
- **After every plan wave:** Run `{full suite command}`
- **Before `/gsd-verify-work`:** Full suite must be green
- **Max feedback latency:** {N} seconds
---
## Per-Task Verification Map
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
| --------- | ---- | ---- | ----------- | --------- | ----------------- | ----------- | --------- |
| {N}-01-01 | 01 | 1 | REQ-{XX} | unit | `{command}` | ✅ / ❌ W0 | ⬜ pending |
*Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*
---
## Wave 0 Requirements
- [ ] `{tests/test_file.py}` - stubs for REQ-{XX}
- [ ] `{tests/conftest.py}` - shared fixtures
- [ ] `{framework install}` - if no framework detected
*If none: "Existing infrastructure covers all phase requirements."*
---
## Manual-Only Verifications
| Behavior | Requirement | Why Manual | Test Instructions |
| ---------- | ----------- | ---------- | ----------------- |
| {behavior} | REQ-{XX} | {reason} | {steps} |
*If none: "All phase behaviors have automated verification."*
---
## Validation Sign-Off
- [ ] All tasks have `<automated>` verify or Wave 0 dependencies
- [ ] Sampling continuity: no 3 consecutive tasks without automated verify
- [ ] Wave 0 covers all MISSING references
- [ ] No watch-mode flags
- [ ] Feedback latency < {N}s
- [ ] `nyquist_compliant: true` set in frontmatter
**Approval:** {pending / approved YYYY-MM-DD}


@@ -0,0 +1,122 @@
# GEMINI.md Template
Template for project-root `GEMINI.md` - auto-generated by `gsd-tools generate-claude-md`.
Contains 6 marker-bounded sections. Each section is independently updatable.
The `generate-claude-md` subcommand manages 5 sections (project, stack, conventions, architecture, workflow enforcement).
The profile section is managed exclusively by `generate-claude-profile`.
---
## Section Templates
### Project Section
```
<!-- GSD:project-start source:PROJECT.md -->
## Project
{{project_content}}
<!-- GSD:project-end -->
```
**Fallback text:**
```
Project not yet initialized. Run /gsd-new-project to set up.
```
### Stack Section
```
<!-- GSD:stack-start source:STACK.md -->
## Technology Stack
{{stack_content}}
<!-- GSD:stack-end -->
```
**Fallback text:**
```
Technology stack not yet documented. Will populate after codebase mapping or first phase.
```
### Conventions Section
```
<!-- GSD:conventions-start source:CONVENTIONS.md -->
## Conventions
{{conventions_content}}
<!-- GSD:conventions-end -->
```
**Fallback text:**
```
Conventions not yet established. Will populate as patterns emerge during development.
```
### Architecture Section
```
<!-- GSD:architecture-start source:ARCHITECTURE.md -->
## Architecture
{{architecture_content}}
<!-- GSD:architecture-end -->
```
**Fallback text:**
```
Architecture not yet mapped. Follow existing patterns found in the codebase.
```
### Workflow Enforcement Section
```
<!-- GSD:workflow-start source:GSD defaults -->
## GSD Workflow Enforcement
Before using Edit, Write, or other file-changing tools, start work through a GSD command so planning artifacts and execution context stay in sync.
Use these entry points:
- `/gsd-quick` for small fixes, doc updates, and ad-hoc tasks
- `/gsd-debug` for investigation and bug fixing
- `/gsd-execute-phase` for planned phase work
Do not make direct repo edits outside a GSD workflow unless the user explicitly asks to bypass it.
<!-- GSD:workflow-end -->
```
### Profile Section (Placeholder Only)
```
<!-- GSD:profile-start -->
## Developer Profile
> Profile not yet configured. Run `/gsd-profile-user` to generate your developer profile.
> This section is managed by `generate-claude-profile` - do not edit manually.
<!-- GSD:profile-end -->
```
**Note:** This section is NOT managed by `generate-claude-md`. It is managed exclusively
by `generate-claude-profile`. The placeholder above is only used when creating a new
GEMINI.md file and no profile section exists yet.
---
## Section Ordering
1. **Project** - Identity and purpose (what this project is)
2. **Stack** - Technology choices (what tools are used)
3. **Conventions** - Code patterns and rules (how code is written)
4. **Architecture** - System structure (how components fit together)
5. **Workflow Enforcement** - Default GSD entry points for file-changing work
6. **Profile** - Developer behavioral preferences (how to interact)
## Marker Format
- Start: `<!-- GSD:{name}-start source:{file} -->`
- End: `<!-- GSD:{name}-end -->`
- Source attribute enables targeted updates when source files change
- Partial match on start marker (without closing `-->`) for detection
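Targeted section updates follow directly from the marker format — a sketch of the replace operation, assuming simple regex matching on well-formed markers (`update_section` is illustrative, not the actual `gsd-tools` implementation):

```python
import re

def update_section(doc: str, name: str, new_body: str, source: str) -> str:
    """Replace the body between GSD:{name}-start and GSD:{name}-end markers."""
    # Match the start marker loosely (any source attribute) through the end marker.
    pattern = re.compile(
        rf"<!-- GSD:{name}-start[^>]*-->\n.*?<!-- GSD:{name}-end -->",
        re.DOTALL,
    )
    replacement = (
        f"<!-- GSD:{name}-start source:{source} -->\n"
        f"{new_body}\n"
        f"<!-- GSD:{name}-end -->"
    )
    return pattern.sub(replacement, doc)
```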
## Fallback Behavior
When a source file is missing, fallback text provides Claude-actionable guidance:
- Guides the agent's behavior in the absence of data
- Not placeholder filler or "missing" notices
- Each fallback tells the agent what to do, not just what's absent


@@ -0,0 +1,255 @@
# Architecture Template
Template for `.planning/codebase/ARCHITECTURE.md` - captures conceptual code organization.
**Purpose:** Document how the code is organized at a conceptual level. Complements STRUCTURE.md (which shows physical file locations).
---
## File Template
```markdown
# Architecture
**Analysis Date:** [YYYY-MM-DD]
## Pattern Overview
**Overall:** [Pattern name: e.g., "Monolithic CLI", "Serverless API", "Full-stack MVC"]
**Key Characteristics:**
- [Characteristic 1: e.g., "Single executable"]
- [Characteristic 2: e.g., "Stateless request handling"]
- [Characteristic 3: e.g., "Event-driven"]
## Layers
[Describe the conceptual layers and their responsibilities]
**[Layer Name]:**
- Purpose: [What this layer does]
- Contains: [Types of code: e.g., "route handlers", "business logic"]
- Depends on: [What it uses: e.g., "data layer only"]
- Used by: [What uses it: e.g., "API routes"]
**[Layer Name]:**
- Purpose: [What this layer does]
- Contains: [Types of code]
- Depends on: [What it uses]
- Used by: [What uses it]
## Data Flow
[Describe the typical request/execution lifecycle]
**[Flow Name] (e.g., "HTTP Request", "CLI Command", "Event Processing"):**
1. [Entry point: e.g., "User runs command"]
2. [Processing step: e.g., "Router matches path"]
3. [Processing step: e.g., "Controller validates input"]
4. [Processing step: e.g., "Service executes logic"]
5. [Output: e.g., "Response returned"]
**State Management:**
- [How state is handled: e.g., "Stateless - no persistent state", "Database per request", "In-memory cache"]
## Key Abstractions
[Core concepts/patterns used throughout the codebase]
**[Abstraction Name]:**
- Purpose: [What it represents]
- Examples: [e.g., "UserService, ProjectService"]
- Pattern: [e.g., "Singleton", "Factory", "Repository"]
**[Abstraction Name]:**
- Purpose: [What it represents]
- Examples: [Concrete examples]
- Pattern: [Pattern used]
## Entry Points
[Where execution begins]
**[Entry Point]:**
- Location: [Brief: e.g., "src/index.ts", "API Gateway triggers"]
- Triggers: [What invokes it: e.g., "CLI invocation", "HTTP request"]
- Responsibilities: [What it does: e.g., "Parse args, route to command"]
## Error Handling
**Strategy:** [How errors are handled: e.g., "Exception bubbling to top-level handler", "Per-route error middleware"]
**Patterns:**
- [Pattern: e.g., "try/catch at controller level"]
- [Pattern: e.g., "Error codes returned to user"]
## Cross-Cutting Concerns
[Aspects that affect multiple layers]
**Logging:**
- [Approach: e.g., "Winston logger, injected per-request"]
**Validation:**
- [Approach: e.g., "Zod schemas at API boundary"]
**Authentication:**
- [Approach: e.g., "JWT middleware on protected routes"]
---
*Architecture analysis: [date]*
*Update when major patterns change*
```
<good_examples>
```markdown
# Architecture
**Analysis Date:** 2025-01-20
## Pattern Overview
**Overall:** CLI Application with Plugin System
**Key Characteristics:**
- Single executable with subcommands
- Plugin-based extensibility
- File-based state (no database)
- Synchronous execution model
## Layers
**Command Layer:**
- Purpose: Parse user input and route to appropriate handler
- Contains: Command definitions, argument parsing, help text
- Location: `src/commands/*.ts`
- Depends on: Service layer for business logic
- Used by: CLI entry point (`src/index.ts`)
**Service Layer:**
- Purpose: Core business logic
- Contains: FileService, TemplateService, InstallService
- Location: `src/services/*.ts`
- Depends on: File system utilities, external tools
- Used by: Command handlers
**Utility Layer:**
- Purpose: Shared helpers and abstractions
- Contains: File I/O wrappers, path resolution, string formatting
- Location: `src/utils/*.ts`
- Depends on: Node.js built-ins only
- Used by: Service layer
## Data Flow
**CLI Command Execution:**
1. User runs: `gsd new-project`
2. Commander parses args and flags
3. Command handler invoked (`src/commands/new-project.ts`)
4. Handler calls service method (`create()` in `src/services/project.ts`)
5. Service reads templates, processes files, writes output
6. Results logged to console
7. Process exits with status code
**State Management:**
- File-based: All state lives in `.planning/` directory
- No persistent in-memory state
- Each command execution is independent
## Key Abstractions
**Service:**
- Purpose: Encapsulate business logic for a domain
- Examples: `src/services/file.ts`, `src/services/template.ts`, `src/services/project.ts`
- Pattern: Singleton-like (imported as modules, not instantiated)
**Command:**
- Purpose: CLI command definition
- Examples: `src/commands/new-project.ts`, `src/commands/plan-phase.ts`
- Pattern: Commander.js command registration
**Template:**
- Purpose: Reusable document structures
- Examples: PROJECT.md, PLAN.md templates
- Pattern: Markdown files with substitution variables
## Entry Points
**CLI Entry:**
- Location: `src/index.ts`
- Triggers: User runs `gsd <command>`
- Responsibilities: Register commands, parse args, display help
**Commands:**
- Location: `src/commands/*.ts`
- Triggers: Matched command from CLI
- Responsibilities: Validate input, call services, format output
## Error Handling
**Strategy:** Throw exceptions, catch at command level, log and exit
**Patterns:**
- Services throw Error with descriptive messages
- Command handlers catch, log error to stderr, exit(1)
- Validation errors shown before execution (fail fast)
## Cross-Cutting Concerns
**Logging:**
- Console.log for normal output
- Console.error for errors
- Chalk for colored output
**Validation:**
- Zod schemas for config file parsing
- Manual validation in command handlers
- Fail fast on invalid input
**File Operations:**
- FileService abstraction over fs-extra
- All paths validated before operations
- Atomic writes (temp file + rename)
---
*Architecture analysis: 2025-01-20*
*Update when major patterns change*
```
</good_examples>
<guidelines>
**What belongs in ARCHITECTURE.md:**
- Overall architectural pattern (monolith, microservices, layered, etc.)
- Conceptual layers and their relationships
- Data flow / request lifecycle
- Key abstractions and patterns
- Entry points
- Error handling strategy
- Cross-cutting concerns (logging, auth, validation)
**What does NOT belong here:**
- Exhaustive file listings (that's STRUCTURE.md)
- Technology choices (that's STACK.md)
- Line-by-line code walkthrough (defer to code reading)
- Implementation details of specific features
**File paths ARE welcome:**
Include file paths as concrete examples of abstractions. Use backtick formatting: `src/services/user.ts`. This makes the architecture document actionable for the agent when planning.
**When filling this template:**
- Read main entry points (index, server, main)
- Identify layers by reading imports/dependencies
- Trace a typical request/command execution
- Note recurring patterns (services, controllers, repositories)
- Keep descriptions conceptual, not mechanical
**Useful for phase planning when:**
- Adding new features (where does it fit in the layers?)
- Refactoring (understanding current patterns)
- Identifying where to add code (which layer handles X?)
- Understanding dependencies between components
</guidelines>


@@ -0,0 +1,310 @@
# Codebase Concerns Template
Template for `.planning/codebase/CONCERNS.md` - captures known issues and areas requiring care.
**Purpose:** Surface actionable warnings about the codebase. Focused on "what to watch out for when making changes."
---
## File Template
```markdown
# Codebase Concerns
**Analysis Date:** [YYYY-MM-DD]
## Tech Debt
**[Area/Component]:**
- Issue: [What's the shortcut/workaround]
- Why: [Why it was done this way]
- Impact: [What breaks or degrades because of it]
- Fix approach: [How to properly address it]
**[Area/Component]:**
- Issue: [What's the shortcut/workaround]
- Why: [Why it was done this way]
- Impact: [What breaks or degrades because of it]
- Fix approach: [How to properly address it]
## Known Bugs
**[Bug description]:**
- Symptoms: [What happens]
- Trigger: [How to reproduce]
- Workaround: [Temporary mitigation if any]
- Root cause: [If known]
- Blocked by: [If waiting on something]
**[Bug description]:**
- Symptoms: [What happens]
- Trigger: [How to reproduce]
- Workaround: [Temporary mitigation if any]
- Root cause: [If known]
## Security Considerations
**[Area requiring security care]:**
- Risk: [What could go wrong]
- Current mitigation: [What's in place now]
- Recommendations: [What should be added]
**[Area requiring security care]:**
- Risk: [What could go wrong]
- Current mitigation: [What's in place now]
- Recommendations: [What should be added]
## Performance Bottlenecks
**[Slow operation/endpoint]:**
- Problem: [What's slow]
- Measurement: [Actual numbers: "500ms p95", "2s load time"]
- Cause: [Why it's slow]
- Improvement path: [How to speed it up]
**[Slow operation/endpoint]:**
- Problem: [What's slow]
- Measurement: [Actual numbers]
- Cause: [Why it's slow]
- Improvement path: [How to speed it up]
## Fragile Areas
**[Component/Module]:**
- Why fragile: [What makes it break easily]
- Common failures: [What typically goes wrong]
- Safe modification: [How to change it without breaking]
- Test coverage: [Is it tested? Gaps?]
**[Component/Module]:**
- Why fragile: [What makes it break easily]
- Common failures: [What typically goes wrong]
- Safe modification: [How to change it without breaking]
- Test coverage: [Is it tested? Gaps?]
## Scaling Limits
**[Resource/System]:**
- Current capacity: [Numbers: "100 req/sec", "10k users"]
- Limit: [Where it breaks]
- Symptoms at limit: [What happens]
- Scaling path: [How to increase capacity]
## Dependencies at Risk
**[Package/Service]:**
- Risk: [e.g., "deprecated", "unmaintained", "breaking changes coming"]
- Impact: [What breaks if it fails]
- Migration plan: [Alternative or upgrade path]
## Missing Critical Features
**[Feature gap]:**
- Problem: [What's missing]
- Current workaround: [How users cope]
- Blocks: [What can't be done without it]
- Implementation complexity: [Rough effort estimate]
## Test Coverage Gaps
**[Untested area]:**
- What's not tested: [Specific functionality]
- Risk: [What could break unnoticed]
- Priority: [High/Medium/Low]
- Difficulty to test: [Why it's not tested yet]
---
*Concerns audit: [date]*
*Update as issues are fixed or new ones discovered*
```
<good_examples>
```markdown
# Codebase Concerns
**Analysis Date:** 2025-01-20
## Tech Debt
**Database queries in React components:**
- Issue: Direct Supabase queries in 15+ page components instead of server actions
- Files: `app/dashboard/page.tsx`, `app/profile/page.tsx`, `app/courses/[id]/page.tsx`, `app/settings/page.tsx` (and 11 more in `app/`)
- Why: Rapid prototyping during MVP phase
- Impact: Can't implement RLS properly, exposes DB structure to client
- Fix approach: Move all queries to server actions in `app/actions/`, add proper RLS policies
**Manual webhook signature validation:**
- Issue: Copy-pasted Stripe webhook verification code in 3 different endpoints
- Files: `app/api/webhooks/stripe/route.ts`, `app/api/webhooks/checkout/route.ts`, `app/api/webhooks/subscription/route.ts`
- Why: Each webhook added ad-hoc without abstraction
- Impact: Easy to miss verification in new webhooks (security risk)
- Fix approach: Create shared `lib/stripe/validate-webhook.ts` middleware
## Known Bugs
**Race condition in subscription updates:**
- Symptoms: User shows as "free" tier for 5-10 seconds after successful payment
- Trigger: Fast navigation after Stripe checkout redirect, before webhook processes
- Files: `app/checkout/success/page.tsx` (redirect handler), `app/api/webhooks/stripe/route.ts` (webhook)
- Workaround: Stripe webhook eventually updates status (self-heals)
- Root cause: Webhook processing slower than user navigation, no optimistic UI update
- Fix: Add polling in `app/checkout/success/page.tsx` after redirect
**Inconsistent session state after logout:**
- Symptoms: User redirected to /dashboard after logout instead of /login
- Trigger: Logout via button in mobile nav (desktop works fine)
- File: `components/MobileNav.tsx` (line ~45, logout handler)
- Workaround: Manual URL navigation to /login works
- Root cause: Mobile nav component not awaiting supabase.auth.signOut()
- Fix: Add await to logout handler in `components/MobileNav.tsx`
## Security Considerations
**Admin role check client-side only:**
- Risk: Admin dashboard pages check isAdmin from Supabase client, no server verification
- Files: `app/admin/page.tsx`, `app/admin/users/page.tsx`, `components/AdminGuard.tsx`
- Current mitigation: None (relying on UI hiding)
- Recommendations: Add middleware to admin routes in `middleware.ts`, verify role server-side
**Unvalidated file uploads:**
- Risk: Users can upload any file type to avatar bucket (no size/type validation)
- File: `components/AvatarUpload.tsx` (upload handler)
- Current mitigation: Supabase bucket limits to 2MB (configured in dashboard)
- Recommendations: Add file type validation (image/* only) in `lib/storage/validate.ts`
## Performance Bottlenecks
**/api/courses endpoint:**
- Problem: Fetching all courses with nested lessons and authors
- File: `app/api/courses/route.ts`
- Measurement: 1.2s p95 response time with 50+ courses
- Cause: N+1 query pattern (separate query per course for lessons)
- Improvement path: Use Prisma include to eager-load lessons in `lib/db/courses.ts`, add Redis caching
**Dashboard initial load:**
- Problem: Waterfall of 5 serial API calls on mount
- File: `app/dashboard/page.tsx`
- Measurement: 3.5s until interactive on slow 3G
- Cause: Each component fetches own data independently
- Improvement path: Convert to Server Component with single parallel fetch
## Fragile Areas
**Authentication middleware chain:**
- File: `middleware.ts`
- Why fragile: 4 different middleware functions run in specific order (auth -> role -> subscription -> logging)
- Common failures: Middleware order change breaks everything, hard to debug
- Safe modification: Add tests before changing order, document dependencies in comments
- Test coverage: No integration tests for middleware chain (only unit tests)
**Stripe webhook event handling:**
- File: `app/api/webhooks/stripe/route.ts`
- Why fragile: Giant switch statement with 12 event types, shared transaction logic
- Common failures: New event type added without handling, partial DB updates on error
- Safe modification: Extract each event handler to `lib/stripe/handlers/*.ts`
- Test coverage: Only 3 of 12 event types have tests
## Scaling Limits
**Supabase Free Tier:**
- Current capacity: 500MB database, 1GB file storage, 2GB bandwidth/month
- Limit: ~5000 users estimated before hitting limits
- Symptoms at limit: 429 rate limit errors, DB writes fail
- Scaling path: Upgrade to Pro ($25/mo) extends to 8GB DB, 100GB storage
**Server-side render blocking:**
- Current capacity: ~50 concurrent users before slowdown
- Limit: Vercel Hobby plan (10s function timeout, 100GB-hrs/mo)
- Symptoms at limit: 504 gateway timeouts on course pages
- Scaling path: Upgrade to Vercel Pro ($20/mo), add edge caching
## Dependencies at Risk
**react-hot-toast:**
- Risk: Unmaintained (last update 18 months ago), React 19 compatibility unknown
- Impact: Toast notifications break, no graceful degradation
- Migration plan: Switch to sonner (actively maintained, similar API)
## Missing Critical Features
**Payment failure handling:**
- Problem: No retry mechanism or user notification when subscription payment fails
- Current workaround: Users manually re-enter payment info (if they notice)
- Blocks: Can't retain users with expired cards, no dunning process
- Implementation complexity: Medium (Stripe webhooks + email flow + UI)
**Course progress tracking:**
- Problem: No persistent state for which lessons completed
- Current workaround: Users manually track progress
- Blocks: Can't show completion percentage, can't recommend next lesson
- Implementation complexity: Low (add completed_lessons junction table)
## Test Coverage Gaps
**Payment flow end-to-end:**
- What's not tested: Full Stripe checkout -> webhook -> subscription activation flow
- Risk: Payment processing could break silently (has happened twice)
- Priority: High
- Difficulty to test: Need Stripe test fixtures and webhook simulation setup
**Error boundary behavior:**
- What's not tested: How app behaves when components throw errors
- Risk: White screen of death for users, no error reporting
- Priority: Medium
- Difficulty to test: Need to intentionally trigger errors in test environment
---
*Concerns audit: 2025-01-20*
*Update as issues are fixed or new ones discovered*
```
</good_examples>
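Several fix approaches in the example above converge on a shared webhook validator (`lib/stripe/validate-webhook.ts`). As a hedged sketch of what such a module centralizes, here is the timestamped-HMAC check reimplemented with Node's built-in crypto; real code should call `stripe.webhooks.constructEvent` from the official SDK instead, and the function name is illustrative:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical shared helper (the lib/stripe/validate-webhook.ts from the example).
// Verifies a "t=<unix>,v1=<hex hmac>" style signature header over "<t>.<payload>".
function verifyStripeSignature(
  payload: string,
  header: string, // e.g. "t=1700000000,v1=<hex hmac>"
  secret: string,
  toleranceSeconds = 300,
  nowSeconds = Math.floor(Date.now() / 1000),
): boolean {
  const parts = new Map(
    header.split(",").map((kv) => kv.split("=", 2) as [string, string]),
  );
  const timestamp = Number(parts.get("t"));
  const signature = parts.get("v1");
  if (!Number.isFinite(timestamp) || !signature) return false;
  // Reject stale events to limit replay attacks.
  if (Math.abs(nowSeconds - timestamp) > toleranceSeconds) return false;
  const expected = createHmac("sha256", secret)
    .update(`${timestamp}.${payload}`)
    .digest("hex");
  const a = Buffer.from(signature, "hex");
  const b = Buffer.from(expected, "hex");
  // Constant-time comparison; lengths must match first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Centralizing this in one module is what closes the "easy to miss verification in new webhooks" gap: each route handler calls the helper before touching the payload.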
<guidelines>
**What belongs in CONCERNS.md:**
- Tech debt with clear impact and fix approach
- Known bugs with reproduction steps
- Security gaps and mitigation recommendations
- Performance bottlenecks with measurements
- Fragile code that breaks easily
- Scaling limits with numbers
- Dependencies that need attention
- Missing features that block workflows
- Test coverage gaps
**What does NOT belong here:**
- Opinions without evidence ("code is messy")
- Complaints without solutions ("auth sucks")
- Future feature ideas (that's for product planning)
- Normal TODOs (those live in code comments)
- Architectural decisions that are working fine
- Minor code style issues
**When filling this template:**
- **Always include file paths** - Concerns without locations are not actionable. Use backticks: `src/file.ts`
- Be specific with measurements ("500ms p95" not "slow")
- Include reproduction steps for bugs
- Suggest fix approaches, not just problems
- Focus on actionable items
- Prioritize by risk/impact
- Update as issues get resolved
- Add new concerns as discovered
**Tone guidelines:**
- Professional, not emotional ("N+1 query pattern" not "terrible queries")
- Solution-oriented ("Fix: add index" not "needs fixing")
- Risk-focused ("Could expose user data" not "security is bad")
- Factual ("3.5s load time" not "really slow")
**Useful for phase planning when:**
- Deciding what to work on next
- Estimating risk of changes
- Understanding where to be careful
- Prioritizing improvements
- Onboarding new agent contexts
- Planning refactoring work
**How this gets populated:**
Explore agents detect these during codebase mapping. Manual additions welcome for human-discovered issues. This is living documentation, not a complaint list.
</guidelines>

# Coding Conventions Template
Template for `.planning/codebase/CONVENTIONS.md` - captures coding style and patterns.
**Purpose:** Document how code is written in this codebase. Prescriptive guide for the agent to match existing style.
---
## File Template
```markdown
# Coding Conventions
**Analysis Date:** [YYYY-MM-DD]
## Naming Patterns
**Files:**
- [Pattern: e.g., "kebab-case for all files"]
- [Test files: e.g., "*.test.ts alongside source"]
- [Components: e.g., "PascalCase.tsx for React components"]
**Functions:**
- [Pattern: e.g., "camelCase for all functions"]
- [Async: e.g., "no special prefix for async functions"]
- [Handlers: e.g., "handleEventName for event handlers"]
**Variables:**
- [Pattern: e.g., "camelCase for variables"]
- [Constants: e.g., "UPPER_SNAKE_CASE for constants"]
- [Private: e.g., "_prefix for private members" or "no prefix"]
**Types:**
- [Interfaces: e.g., "PascalCase, no I prefix"]
- [Types: e.g., "PascalCase for type aliases"]
- [Enums: e.g., "PascalCase for enum name, UPPER_CASE for values"]
## Code Style
**Formatting:**
- [Tool: e.g., "Prettier with config in .prettierrc"]
- [Line length: e.g., "100 characters max"]
- [Quotes: e.g., "single quotes for strings"]
- [Semicolons: e.g., "required" or "omitted"]
**Linting:**
- [Tool: e.g., "ESLint with eslint.config.js"]
- [Rules: e.g., "extends airbnb-base, no console in production"]
- [Run: e.g., "npm run lint"]
## Import Organization
**Order:**
1. [e.g., "External packages (react, express, etc.)"]
2. [e.g., "Internal modules (@/lib, @/components)"]
3. [e.g., "Relative imports (., ..)"]
4. [e.g., "Type imports (import type {})"]
**Grouping:**
- [Blank lines: e.g., "blank line between groups"]
- [Sorting: e.g., "alphabetical within each group"]
**Path Aliases:**
- [Aliases used: e.g., "@/ for src/, @components/ for src/components/"]
## Error Handling
**Patterns:**
- [Strategy: e.g., "throw errors, catch at boundaries"]
- [Custom errors: e.g., "extend Error class, named *Error"]
- [Async: e.g., "use try/catch, no .catch() chains"]
**Error Types:**
- [When to throw: e.g., "invalid input, missing dependencies"]
- [When to return: e.g., "expected failures return Result<T, E>"]
- [Logging: e.g., "log error with context before throwing"]
## Logging
**Framework:**
- [Tool: e.g., "console.log, pino, winston"]
- [Levels: e.g., "debug, info, warn, error"]
**Patterns:**
- [Format: e.g., "structured logging with context object"]
- [When: e.g., "log state transitions, external calls"]
- [Where: e.g., "log at service boundaries, not in utils"]
## Comments
**When to Comment:**
- [e.g., "explain why, not what"]
- [e.g., "document business logic, algorithms, edge cases"]
- [e.g., "avoid obvious comments like // increment counter"]
**JSDoc/TSDoc:**
- [Usage: e.g., "required for public APIs, optional for internal"]
- [Format: e.g., "use @param, @returns, @throws tags"]
**TODO Comments:**
- [Pattern: e.g., "// TODO(username): description"]
- [Tracking: e.g., "link to issue number if available"]
## Function Design
**Size:**
- [e.g., "keep under 50 lines, extract helpers"]
**Parameters:**
- [e.g., "max 3 parameters, use object for more"]
- [e.g., "destructure objects in parameter list"]
**Return Values:**
- [e.g., "explicit returns, no implicit undefined"]
- [e.g., "return early for guard clauses"]
## Module Design
**Exports:**
- [e.g., "named exports preferred, default exports for React components"]
- [e.g., "export from index.ts for public API"]
**Barrel Files:**
- [e.g., "use index.ts to re-export public API"]
- [e.g., "avoid circular dependencies"]
---
*Convention analysis: [date]*
*Update when patterns change*
```
<good_examples>
```markdown
# Coding Conventions
**Analysis Date:** 2025-01-20
## Naming Patterns
**Files:**
- kebab-case for all files (command-handler.ts, user-service.ts)
- *.test.ts alongside source files
- index.ts for barrel exports
**Functions:**
- camelCase for all functions
- No special prefix for async functions
- handleEventName for event handlers (handleClick, handleSubmit)
**Variables:**
- camelCase for variables
- UPPER_SNAKE_CASE for constants (MAX_RETRIES, API_BASE_URL)
- No underscore prefix (no private marker in TS)
**Types:**
- PascalCase for interfaces, no I prefix (User, not IUser)
- PascalCase for type aliases (UserConfig, ResponseData)
- PascalCase for enum names, UPPER_CASE for values (Status.PENDING)
## Code Style
**Formatting:**
- Prettier with .prettierrc
- 100 character line length
- Single quotes for strings
- Semicolons required
- 2 space indentation
**Linting:**
- ESLint with eslint.config.js
- Extends @typescript-eslint/recommended
- No console.log in production code (use logger)
- Run: npm run lint
## Import Organization
**Order:**
1. External packages (react, express, commander)
2. Internal modules (@/lib, @/services)
3. Relative imports (./utils, ../types)
4. Type imports (import type { User })
**Grouping:**
- Blank line between groups
- Alphabetical within each group
- Type imports last within each group
**Path Aliases:**
- @/ maps to src/
- No other aliases defined
## Error Handling
**Patterns:**
- Throw errors, catch at boundaries (route handlers, main functions)
- Extend Error class for custom errors (ValidationError, NotFoundError)
- Async functions use try/catch, no .catch() chains
**Error Types:**
- Throw on invalid input, missing dependencies, invariant violations
- Log error with context before throwing: logger.error({ err, userId }, 'Failed to process')
- Include cause in error message: new Error('Failed to X', { cause: originalError })
## Logging
**Framework:**
- pino logger instance exported from lib/logger.ts
- Levels: debug, info, warn, error (no trace)
**Patterns:**
- Structured logging with context: logger.info({ userId, action }, 'User action')
- Log at service boundaries, not in utility functions
- Log state transitions, external API calls, errors
- No console.log in committed code
## Comments
**When to Comment:**
- Explain why, not what: // Retry 3 times because API has transient failures
- Document business rules: // Users must verify email within 24 hours
- Explain non-obvious algorithms or workarounds
- Avoid obvious comments: // set count to 0
**JSDoc/TSDoc:**
- Required for public API functions
- Optional for internal functions if signature is self-explanatory
- Use @param, @returns, @throws tags
**TODO Comments:**
- Format: // TODO: description (no username, using git blame)
- Link to issue if exists: // TODO: Fix race condition (issue #123)
## Function Design
**Size:**
- Keep under 50 lines
- Extract helpers for complex logic
- One level of abstraction per function
**Parameters:**
- Max 3 parameters
- Use options object for 4+ parameters: function create(options: CreateOptions)
- Destructure in parameter list: function process({ id, name }: ProcessParams)
**Return Values:**
- Explicit return statements
- Return early for guard clauses
- Use Result<T, E> type for expected failures
## Module Design
**Exports:**
- Named exports preferred
- Default exports only for React components
- Export public API from index.ts barrel files
**Barrel Files:**
- index.ts re-exports public API
- Keep internal helpers private (don't export from index)
- Avoid circular dependencies (import from specific files if needed)
---
*Convention analysis: 2025-01-20*
*Update when patterns change*
```
</good_examples>
<guidelines>
**What belongs in CONVENTIONS.md:**
- Naming patterns observed in the codebase
- Formatting rules (Prettier config, linting rules)
- Import organization patterns
- Error handling strategy
- Logging approach
- Comment conventions
- Function and module design patterns
**What does NOT belong here:**
- Architecture decisions (that's ARCHITECTURE.md)
- Technology choices (that's STACK.md)
- Test patterns (that's TESTING.md)
- File organization (that's STRUCTURE.md)
**When filling this template:**
- Check .prettierrc, .eslintrc, or similar config files
- Examine 5-10 representative source files for patterns
- Look for consistency: if 80%+ follows a pattern, document it
- Be prescriptive: "Use X" not "Sometimes Y is used"
- Note deviations: "Legacy code uses Y, new code should use X"
- Keep under ~150 lines total
**Useful for phase planning when:**
- Writing new code (match existing style)
- Adding features (follow naming patterns)
- Refactoring (apply consistent conventions)
- Code review (check against documented patterns)
- Onboarding (understand style expectations)
**Analysis approach:**
- Scan src/ directory for file naming patterns
- Check package.json scripts for lint/format commands
- Read 5-10 files to identify function naming, error handling
- Look for config files (.prettierrc, eslint.config.js)
- Note patterns in imports, comments, function signatures
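Parts of the scan above can be automated. A hedged sketch of the "if 80%+ follows a pattern, document it" check over a list of file names; the pattern set, the first-match tie-breaking, and the threshold are assumptions for illustration:

```typescript
// Naming patterns to test file basenames against. Order matters: a plain
// single-word name like "index" matches the first pattern listed.
const NAME_PATTERNS: Record<string, RegExp> = {
  "kebab-case": /^[a-z0-9]+(-[a-z0-9]+)*$/,
  camelCase: /^[a-z][a-zA-Z0-9]*$/,
  PascalCase: /^[A-Z][a-zA-Z0-9]*$/,
  snake_case: /^[a-z0-9]+(_[a-z0-9]+)*$/,
};

// Returns the dominant pattern if it covers >= threshold of files, else null
// (a null result means: note the deviations instead of prescribing one style).
function dominantNamingPattern(fileNames: string[], threshold = 0.8): string | null {
  const counts = new Map<string, number>();
  for (const file of fileNames) {
    const base = file
      .replace(/\.[^.]+$/, "")        // strip extension (.ts, .tsx, ...)
      .replace(/\.(test|spec)$/, ""); // strip .test / .spec suffix
    for (const [name, re] of Object.entries(NAME_PATTERNS)) {
      if (re.test(base)) {
        counts.set(name, (counts.get(name) ?? 0) + 1);
        break; // count only the first matching pattern
      }
    }
  }
  for (const [name, n] of counts) {
    if (n / fileNames.length >= threshold) return name;
  }
  return null;
}
```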
</guidelines>

# External Integrations Template
Template for `.planning/codebase/INTEGRATIONS.md` - captures external service dependencies.
**Purpose:** Document what external systems this codebase communicates with. Focused on "what lives outside our code that we depend on."
---
## File Template
```markdown
# External Integrations
**Analysis Date:** [YYYY-MM-DD]
## APIs & External Services
**Payment Processing:**
- [Service] - [What it's used for: e.g., "subscription billing, one-time payments"]
- SDK/Client: [e.g., "stripe npm package v14.x"]
- Auth: [e.g., "API key in STRIPE_SECRET_KEY env var"]
- Endpoints used: [e.g., "checkout sessions, webhooks"]
**Email/SMS:**
- [Service] - [What it's used for: e.g., "transactional emails"]
- SDK/Client: [e.g., "sendgrid/mail v8.x"]
- Auth: [e.g., "API key in SENDGRID_API_KEY env var"]
- Templates: [e.g., "managed in SendGrid dashboard"]
**External APIs:**
- [Service] - [What it's used for]
- Integration method: [e.g., "REST API via fetch", "GraphQL client"]
- Auth: [e.g., "OAuth2 token in AUTH_TOKEN env var"]
- Rate limits: [if applicable]
## Data Storage
**Databases:**
- [Type/Provider] - [e.g., "PostgreSQL on Supabase"]
- Connection: [e.g., "via DATABASE_URL env var"]
- Client: [e.g., "Prisma ORM v5.x"]
- Migrations: [e.g., "prisma migrate in migrations/"]
**File Storage:**
- [Service] - [e.g., "AWS S3 for user uploads"]
- SDK/Client: [e.g., "@aws-sdk/client-s3"]
- Auth: [e.g., "IAM credentials in AWS_* env vars"]
- Buckets: [e.g., "prod-uploads, dev-uploads"]
**Caching:**
- [Service] - [e.g., "Redis for session storage"]
- Connection: [e.g., "REDIS_URL env var"]
- Client: [e.g., "ioredis v5.x"]
## Authentication & Identity
**Auth Provider:**
- [Service] - [e.g., "Supabase Auth", "Auth0", "custom JWT"]
- Implementation: [e.g., "Supabase client SDK"]
- Token storage: [e.g., "httpOnly cookies", "localStorage"]
- Session management: [e.g., "JWT refresh tokens"]
**OAuth Integrations:**
- [Provider] - [e.g., "Google OAuth for sign-in"]
- Credentials: [e.g., "GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET"]
- Scopes: [e.g., "email, profile"]
## Monitoring & Observability
**Error Tracking:**
- [Service] - [e.g., "Sentry"]
- DSN: [e.g., "SENTRY_DSN env var"]
- Release tracking: [e.g., "via SENTRY_RELEASE"]
**Analytics:**
- [Service] - [e.g., "Mixpanel for product analytics"]
- Token: [e.g., "MIXPANEL_TOKEN env var"]
- Events tracked: [e.g., "user actions, page views"]
**Logs:**
- [Service] - [e.g., "CloudWatch", "Datadog", "none (stdout only)"]
- Integration: [e.g., "AWS Lambda built-in"]
## CI/CD & Deployment
**Hosting:**
- [Platform] - [e.g., "Vercel", "AWS Lambda", "Docker on ECS"]
- Deployment: [e.g., "automatic on main branch push"]
- Environment vars: [e.g., "configured in Vercel dashboard"]
**CI Pipeline:**
- [Service] - [e.g., "GitHub Actions"]
- Workflows: [e.g., "test.yml, deploy.yml"]
- Secrets: [e.g., "stored in GitHub repo secrets"]
## Environment Configuration
**Development:**
- Required env vars: [List critical vars]
- Secrets location: [e.g., ".env.local (gitignored)", "1Password vault"]
- Mock/stub services: [e.g., "Stripe test mode", "local PostgreSQL"]
**Staging:**
- Environment-specific differences: [e.g., "uses staging Stripe account"]
- Data: [e.g., "separate staging database"]
**Production:**
- Secrets management: [e.g., "Vercel environment variables"]
- Failover/redundancy: [e.g., "multi-region DB replication"]
## Webhooks & Callbacks
**Incoming:**
- [Service] - [Endpoint: e.g., "/api/webhooks/stripe"]
- Verification: [e.g., "signature validation via stripe.webhooks.constructEvent"]
- Events: [e.g., "payment_intent.succeeded, customer.subscription.updated"]
**Outgoing:**
- [Service] - [What triggers it]
- Endpoint: [e.g., "external CRM webhook on user signup"]
- Retry logic: [if applicable]
---
*Integration audit: [date]*
*Update when adding/removing external services*
```
<good_examples>
```markdown
# External Integrations
**Analysis Date:** 2025-01-20
## APIs & External Services
**Payment Processing:**
- Stripe - Subscription billing and one-time course payments
- SDK/Client: stripe npm package v14.8
- Auth: API key in STRIPE_SECRET_KEY env var
- Endpoints used: checkout sessions, customer portal, webhooks
**Email/SMS:**
- SendGrid - Transactional emails (receipts, password resets)
- SDK/Client: @sendgrid/mail v8.1
- Auth: API key in SENDGRID_API_KEY env var
- Templates: Managed in SendGrid dashboard (template IDs in code)
**External APIs:**
- OpenAI API - Course content generation
- Integration method: REST API via openai npm package v4.x
- Auth: Bearer token in OPENAI_API_KEY env var
- Rate limits: 3500 requests/min (tier 3)
## Data Storage
**Databases:**
- PostgreSQL on Supabase - Primary data store
- Connection: via DATABASE_URL env var
- Client: Prisma ORM v5.8
- Migrations: prisma migrate in prisma/migrations/
**File Storage:**
- Supabase Storage - User uploads (profile images, course materials)
- SDK/Client: @supabase/supabase-js v2.x
- Auth: Service role key in SUPABASE_SERVICE_ROLE_KEY
- Buckets: avatars (public), course-materials (private)
**Caching:**
- None currently (all database queries, no Redis)
## Authentication & Identity
**Auth Provider:**
- Supabase Auth - Email/password + OAuth
- Implementation: Supabase client SDK with server-side session management
- Token storage: httpOnly cookies via @supabase/ssr
- Session management: JWT refresh tokens handled by Supabase
**OAuth Integrations:**
- Google OAuth - Social sign-in
- Credentials: GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET (Supabase dashboard)
- Scopes: email, profile
## Monitoring & Observability
**Error Tracking:**
- Sentry - Server and client errors
- DSN: SENTRY_DSN env var
- Release tracking: Git commit SHA via SENTRY_RELEASE
**Analytics:**
- None (planned: Mixpanel)
**Logs:**
- Vercel logs - stdout/stderr only
- Retention: 7 days on Pro plan
## CI/CD & Deployment
**Hosting:**
- Vercel - Next.js app hosting
- Deployment: Automatic on main branch push
- Environment vars: Configured in Vercel dashboard (synced to .env.example)
**CI Pipeline:**
- GitHub Actions - Tests and type checking
- Workflows: .github/workflows/ci.yml
- Secrets: None needed (public repo tests only)
## Environment Configuration
**Development:**
- Required env vars: DATABASE_URL, NEXT_PUBLIC_SUPABASE_URL, NEXT_PUBLIC_SUPABASE_ANON_KEY
- Secrets location: .env.local (gitignored), team shared via 1Password vault
- Mock/stub services: Stripe test mode, Supabase local dev project
**Staging:**
- Uses separate Supabase staging project
- Stripe test mode
- Same Vercel account, different environment
**Production:**
- Secrets management: Vercel environment variables
- Database: Supabase production project with daily backups
## Webhooks & Callbacks
**Incoming:**
- Stripe - /api/webhooks/stripe
- Verification: Signature validation via stripe.webhooks.constructEvent
- Events: payment_intent.succeeded, customer.subscription.updated, customer.subscription.deleted
**Outgoing:**
- None
---
*Integration audit: 2025-01-20*
*Update when adding/removing external services*
```
</good_examples>
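The example lists required env vars per environment; codebases documented this way often pair the list with a fail-fast startup check. A minimal sketch, assuming a helper like this exists (the function is hypothetical; the variable names come from the example above):

```typescript
// Throws at startup if any required variable is unset. Names WHICH vars are
// missing but never echoes values, matching the "document WHERE secrets live,
// never WHAT they are" rule.
function assertRequiredEnv(
  required: string[],
  env: Record<string, string | undefined> = process.env,
): void {
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
}
```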
<guidelines>
**What belongs in INTEGRATIONS.md:**
- External services the code communicates with
- Authentication patterns (where secrets live, not the secrets themselves)
- SDKs and client libraries used
- Environment variable names (not values)
- Webhook endpoints and verification methods
- Database connection patterns
- File storage locations
- Monitoring and logging services
**What does NOT belong here:**
- Actual API keys or secrets (NEVER write these)
- Internal architecture (that's ARCHITECTURE.md)
- Code patterns (that's PATTERNS.md)
- Technology choices (that's STACK.md)
- Performance issues (that's CONCERNS.md)
**When filling this template:**
- Check .env.example or .env.template for required env vars
- Look for SDK imports (stripe, @sendgrid/mail, etc.)
- Check for webhook handlers in routes/endpoints
- Note where secrets are managed (not the secrets)
- Document environment-specific differences (dev/staging/prod)
- Include auth patterns for each service
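The first two steps above (checking `.env.example`, looking for SDK imports) can be sketched as small helpers. The SDK-to-service mapping below is an assumption for illustration, not an exhaustive list:

```typescript
// Hypothetical mapping from npm package name to the integration it implies.
const KNOWN_SDKS: Record<string, string> = {
  stripe: "Payments (Stripe)",
  "@sendgrid/mail": "Email (SendGrid)",
  "@supabase/supabase-js": "Database/auth (Supabase)",
  "@aws-sdk/client-s3": "File storage (S3)",
  ioredis: "Caching (Redis)",
  "@sentry/node": "Error tracking (Sentry)",
};

// Extract variable NAMES from .env.example text; values are ignored on
// purpose - document where secrets live, never what they are.
function envVarNames(envExample: string): string[] {
  return envExample
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line && !line.startsWith("#"))
    .map((line) => line.split("=", 1)[0].trim())
    .filter((name) => /^[A-Z][A-Z0-9_]*$/.test(name));
}

// Cross-reference package.json dependencies against the known SDK list.
function detectedIntegrations(packageJsonText: string): string[] {
  const pkg = JSON.parse(packageJsonText) as { dependencies?: Record<string, string> };
  return Object.keys(pkg.dependencies ?? {})
    .filter((dep) => dep in KNOWN_SDKS)
    .map((dep) => `${dep}: ${KNOWN_SDKS[dep]}`);
}
```

The output of both helpers is a starting point for the template sections, not a replacement for reading the webhook handlers and auth code.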
**Useful for phase planning when:**
- Adding new external service integrations
- Debugging authentication issues
- Understanding data flow outside the application
- Setting up new environments
- Auditing third-party dependencies
- Planning for service outages or migrations
**Security note:**
Document WHERE secrets live (env vars, Vercel dashboard, 1Password), never WHAT the secrets are.
</guidelines>

# Technology Stack Template
Template for `.planning/codebase/STACK.md` - captures the technology foundation.
**Purpose:** Document what technologies run this codebase. Focused on "what executes when you run the code."
---
## File Template
```markdown
# Technology Stack
**Analysis Date:** [YYYY-MM-DD]
## Languages
**Primary:**
- [Language] [Version] - [Where used: e.g., "all application code"]
**Secondary:**
- [Language] [Version] - [Where used: e.g., "build scripts, tooling"]
## Runtime
**Environment:**
- [Runtime] [Version] - [e.g., "Node.js 20.x"]
- [Additional requirements if any]
**Package Manager:**
- [Manager] [Version] - [e.g., "npm 10.x"]
- Lockfile: [e.g., "package-lock.json present"]
## Frameworks
**Core:**
- [Framework] [Version] - [Purpose: e.g., "web server", "UI framework"]
**Testing:**
- [Framework] [Version] - [e.g., "Jest for unit tests"]
- [Framework] [Version] - [e.g., "Playwright for E2E"]
**Build/Dev:**
- [Tool] [Version] - [e.g., "Vite for bundling"]
- [Tool] [Version] - [e.g., "TypeScript compiler"]
## Key Dependencies
[Only include dependencies critical to understanding the stack - limit to 5-10 most important]
**Critical:**
- [Package] [Version] - [Why it matters: e.g., "authentication", "database access"]
- [Package] [Version] - [Why it matters]
**Infrastructure:**
- [Package] [Version] - [e.g., "Express for HTTP routing"]
- [Package] [Version] - [e.g., "PostgreSQL client"]
## Configuration
**Environment:**
- [How configured: e.g., ".env files", "environment variables"]
- [Key configs: e.g., "DATABASE_URL, API_KEY required"]
**Build:**
- [Build config files: e.g., "vite.config.ts, tsconfig.json"]
## Platform Requirements
**Development:**
- [OS requirements or "any platform"]
- [Additional tooling: e.g., "Docker for local DB"]
**Production:**
- [Deployment target: e.g., "Vercel", "AWS Lambda", "Docker container"]
- [Version requirements]
---
*Stack analysis: [date]*
*Update after major dependency changes*
```
<good_examples>
```markdown
# Technology Stack
**Analysis Date:** 2025-01-20
## Languages
**Primary:**
- TypeScript 5.3 - All application code
**Secondary:**
- JavaScript - Build scripts, config files
## Runtime
**Environment:**
- Node.js 20.x (LTS)
- No browser runtime (CLI tool only)
**Package Manager:**
- npm 10.x
- Lockfile: `package-lock.json` present
## Frameworks
**Core:**
- None (vanilla Node.js CLI)
**Testing:**
- Vitest 1.0 - Unit tests
- tsx - TypeScript execution without build step
**Build/Dev:**
- TypeScript 5.3 - Compilation to JavaScript
- esbuild - Used by Vitest for fast transforms
## Key Dependencies
**Critical:**
- commander 11.x - CLI argument parsing and command structure
- chalk 5.x - Terminal output styling
- fs-extra 11.x - Extended file system operations
**Infrastructure:**
- Node.js built-ins - fs, path, child_process for file operations
## Configuration
**Environment:**
- No environment variables required
- Configuration via CLI flags only
**Build:**
- `tsconfig.json` - TypeScript compiler options
- `vitest.config.ts` - Test runner configuration
## Platform Requirements
**Development:**
- macOS/Linux/Windows (any platform with Node.js)
- No external dependencies
**Production:**
- Distributed as npm package
- Installed globally via npm install -g
- Runs on user's Node.js installation
---
*Stack analysis: 2025-01-20*
*Update after major dependency changes*
```
</good_examples>
<guidelines>
**What belongs in STACK.md:**
- Languages and versions
- Runtime requirements (Node, Bun, Deno, browser)
- Package manager and lockfile
- Framework choices
- Critical dependencies (limit to 5-10 most important)
- Build tooling
- Platform/deployment requirements
**What does NOT belong here:**
- File structure (that's STRUCTURE.md)
- Architectural patterns (that's ARCHITECTURE.md)
- Every dependency in package.json (only critical ones)
- Implementation details (defer to code)
**When filling this template:**
- Check package.json for dependencies
- Note runtime version from .nvmrc or package.json engines
- Include only dependencies that affect understanding (not every utility)
- Specify versions only when version matters (breaking changes, compatibility)
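The checks above can be sketched as a small summarizer over `package.json`. Field names (`engines`, `packageManager`, `dependencies`) follow npm conventions; the output shape is an assumption for illustration:

```typescript
interface StackSummary {
  runtime: string;
  packageManager: string;
  criticalDeps: string[];
}

// Summarize the fields STACK.md cares about. Caps the dependency list because
// the template wants only the 5-10 most important packages.
function summarizeStack(packageJsonText: string, maxDeps = 10): StackSummary {
  const pkg = JSON.parse(packageJsonText) as {
    engines?: { node?: string };
    packageManager?: string; // e.g. "npm@10.2.0"
    dependencies?: Record<string, string>;
  };
  const deps = Object.entries(pkg.dependencies ?? {});
  return {
    runtime: pkg.engines?.node
      ? `Node.js ${pkg.engines.node}`
      : "Node.js (version unspecified; check .nvmrc)",
    packageManager: pkg.packageManager ?? "npm (assumed; no packageManager field)",
    criticalDeps: deps.slice(0, maxDeps).map(([name, version]) => `${name} ${version}`),
  };
}
```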
**Useful for phase planning when:**
- Adding new dependencies (check compatibility)
- Upgrading frameworks (know what's in use)
- Choosing implementation approach (must work with existing stack)
- Understanding build requirements
</guidelines>

# Structure Template
Template for `.planning/codebase/STRUCTURE.md` - captures physical file organization.
**Purpose:** Document where things physically live in the codebase. Answers "where do I put X?"
---
## File Template
```markdown
# Codebase Structure
**Analysis Date:** [YYYY-MM-DD]
## Directory Layout
[ASCII box-drawing tree of top-level directories with purpose - use ├── └── │ characters for tree structure only]
```
[project-root]/
├── [dir]/ # [Purpose]
├── [dir]/ # [Purpose]
├── [dir]/ # [Purpose]
└── [file] # [Purpose]
```
## Directory Purposes
**[Directory Name]:**
- Purpose: [What lives here]
- Contains: [Types of files: e.g., "*.ts source files", "component directories"]
- Key files: [Important files in this directory]
- Subdirectories: [If nested, describe structure]
**[Directory Name]:**
- Purpose: [What lives here]
- Contains: [Types of files]
- Key files: [Important files]
- Subdirectories: [Structure]
## Key File Locations
**Entry Points:**
- [Path]: [Purpose: e.g., "CLI entry point"]
- [Path]: [Purpose: e.g., "Server startup"]
**Configuration:**
- [Path]: [Purpose: e.g., "TypeScript config"]
- [Path]: [Purpose: e.g., "Build configuration"]
- [Path]: [Purpose: e.g., "Environment variables"]
**Core Logic:**
- [Path]: [Purpose: e.g., "Business services"]
- [Path]: [Purpose: e.g., "Database models"]
- [Path]: [Purpose: e.g., "API routes"]
**Testing:**
- [Path]: [Purpose: e.g., "Unit tests"]
- [Path]: [Purpose: e.g., "Test fixtures"]
**Documentation:**
- [Path]: [Purpose: e.g., "User-facing docs"]
- [Path]: [Purpose: e.g., "Developer guide"]
## Naming Conventions
**Files:**
- [Pattern]: [Example: e.g., "kebab-case.ts for modules"]
- [Pattern]: [Example: e.g., "PascalCase.tsx for React components"]
- [Pattern]: [Example: e.g., "*.test.ts for test files"]
**Directories:**
- [Pattern]: [Example: e.g., "kebab-case for feature directories"]
- [Pattern]: [Example: e.g., "plural names for collections"]
**Special Patterns:**
- [Pattern]: [Example: e.g., "index.ts for directory exports"]
- [Pattern]: [Example: e.g., "__tests__ for test directories"]
## Where to Add New Code
**New Feature:**
- Primary code: [Directory path]
- Tests: [Directory path]
- Config if needed: [Directory path]
**New Component/Module:**
- Implementation: [Directory path]
- Types: [Directory path]
- Tests: [Directory path]
**New Route/Command:**
- Definition: [Directory path]
- Handler: [Directory path]
- Tests: [Directory path]
**Utilities:**
- Shared helpers: [Directory path]
- Type definitions: [Directory path]
## Special Directories
[Any directories with special meaning or generation]
**[Directory]:**
- Purpose: [e.g., "Generated code", "Build output"]
- Source: [e.g., "Auto-generated by X", "Build artifacts"]
- Committed: [Yes/No - in .gitignore?]
---
*Structure analysis: [date]*
*Update when directory structure changes*
```
<good_examples>
```markdown
# Codebase Structure
**Analysis Date:** 2025-01-20
## Directory Layout
```
get-shit-done/
├── bin/ # Executable entry points
├── commands/ # Slash command definitions
│ └── gsd/ # GSD-specific commands
├── get-shit-done/ # Skill resources
│ ├── references/ # Principle documents
│ ├── templates/ # File templates
│ └── workflows/ # Multi-step procedures
├── src/ # Source code (if applicable)
├── tests/ # Test files
├── package.json # Project manifest
└── README.md # User documentation
```
## Directory Purposes
**bin/**
- Purpose: CLI entry points
- Contains: install.js (installer script)
- Key files: install.js - handles npx installation
- Subdirectories: None
**commands/gsd/**
- Purpose: Slash command definitions for Claude Code
- Contains: *.md files (one per command)
- Key files: new-project.md, plan-phase.md, execute-plan.md
- Subdirectories: None (flat structure)
**get-shit-done/references/**
- Purpose: Core philosophy and guidance documents
- Contains: principles.md, questioning.md, plan-format.md
- Key files: principles.md - system philosophy
- Subdirectories: None
**get-shit-done/templates/**
- Purpose: Document templates for .planning/ files
- Contains: Template definitions with frontmatter
- Key files: project.md, roadmap.md, plan.md, summary.md
- Subdirectories: codebase/ (new - for stack/architecture/structure templates)
**get-shit-done/workflows/**
- Purpose: Reusable multi-step procedures
- Contains: Workflow definitions called by commands
- Key files: execute-plan.md, research-phase.md
- Subdirectories: None
## Key File Locations
**Entry Points:**
- `bin/install.js` - Installation script (npx entry)
**Configuration:**
- `package.json` - Project metadata, dependencies, bin entry
- `.gitignore` - Excluded files
**Core Logic:**
- `bin/install.js` - All installation logic (file copying, path replacement)
**Testing:**
- `tests/` - Test files (if present)
**Documentation:**
- `README.md` - User-facing installation and usage guide
- `GEMINI.md` - Instructions for the agent when working in this repo
## Naming Conventions
**Files:**
- kebab-case.md: Markdown documents
- kebab-case.js: JavaScript source files
- UPPERCASE.md: Important project files (README, CLAUDE, CHANGELOG)
**Directories:**
- kebab-case: All directories
- Plural for collections: templates/, commands/, workflows/
**Special Patterns:**
- {command-name}.md: Slash command definition
- `*-template.md`: possible, but the `templates/` directory is preferred
## Where to Add New Code
**New Slash Command:**
- Primary code: `commands/gsd/{command-name}.md`
- Tests: `tests/commands/{command-name}.test.js` (if testing implemented)
- Documentation: Update `README.md` with new command
**New Template:**
- Implementation: `get-shit-done/templates/{name}.md`
- Documentation: Template is self-documenting (includes guidelines)
**New Workflow:**
- Implementation: `get-shit-done/workflows/{name}.md`
- Usage: Reference from command with `@.pi/gsd/workflows/{name}.md`
**New Reference Document:**
- Implementation: `get-shit-done/references/{name}.md`
- Usage: Reference from commands/workflows as needed
**Utilities:**
- No utilities yet (`install.js` is monolithic)
- If extracted: `src/utils/`
## Special Directories
**get-shit-done/**
- Purpose: Resources installed to .agent/
- Source: Copied by bin/install.js during installation
- Committed: Yes (source of truth)
**commands/**
- Purpose: Slash commands installed to .agent/commands/
- Source: Copied by bin/install.js during installation
- Committed: Yes (source of truth)
---
*Structure analysis: 2025-01-20*
*Update when directory structure changes*
```
</good_examples>
<guidelines>
**What belongs in STRUCTURE.md:**
- Directory layout (ASCII box-drawing tree for structure visualization)
- Purpose of each directory
- Key file locations (entry points, configs, core logic)
- Naming conventions
- Where to add new code (by type)
- Special/generated directories
**What does NOT belong here:**
- Conceptual architecture (that's ARCHITECTURE.md)
- Technology stack (that's STACK.md)
- Code implementation details (defer to code reading)
- Every single file (focus on directories and key files)
**When filling this template:**
- Use `tree -L 2` or similar to visualize structure
- Identify top-level directories and their purposes
- Note naming patterns by observing existing files
- Locate entry points, configs, and main logic areas
- Keep directory tree concise (max 2-3 levels)
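
The visualization step can be sketched in shell; the `tree` flags shown are common defaults, and the `find` fallback is an assumption for machines without `tree` installed:

```shell
# Sketch: capture a 2-level directory layout for STRUCTURE.md.
# `tree` may not be installed, so fall back to `find` (directories only;
# -maxdepth is a widely supported extension).
layout=$(tree -L 2 -d 2>/dev/null || find . -maxdepth 2 -type d 2>/dev/null)
echo "$layout"
```

Paste the output into the Directory Layout section, then trim it to the directories worth documenting.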
**Tree format (ASCII box-drawing characters for structure only):**
```
root/
├── dir1/ # Purpose
│ ├── subdir/ # Purpose
│ └── file.ts # Purpose
├── dir2/ # Purpose
└── file.ts # Purpose
```
**Useful for phase planning when:**
- Adding new features (where should files go?)
- Understanding project organization
- Finding where specific logic lives
- Following existing conventions
</guidelines>


@@ -0,0 +1,480 @@
# Testing Patterns Template
Template for `.planning/codebase/TESTING.md` - captures test framework and patterns.
**Purpose:** Document how tests are written and run. Guide for adding tests that match existing patterns.
---
## File Template
```markdown
# Testing Patterns
**Analysis Date:** [YYYY-MM-DD]
## Test Framework
**Runner:**
- [Framework: e.g., "Jest 29.x", "Vitest 1.x"]
- [Config: e.g., "jest.config.js in project root"]
**Assertion Library:**
- [Library: e.g., "built-in expect", "chai"]
- [Matchers: e.g., "toBe, toEqual, toThrow"]
**Run Commands:**
```bash
[e.g., "npm test" or "npm run test"] # Run all tests
[e.g., "npm test -- --watch"] # Watch mode
[e.g., "npm test -- path/to/file.test.ts"] # Single file
[e.g., "npm run test:coverage"] # Coverage report
```
## Test File Organization
**Location:**
- [Pattern: e.g., "*.test.ts alongside source files"]
- [Alternative: e.g., "__tests__/ directory" or "separate tests/ tree"]
**Naming:**
- [Unit tests: e.g., "module-name.test.ts"]
- [Integration: e.g., "feature-name.integration.test.ts"]
- [E2E: e.g., "user-flow.e2e.test.ts"]
**Structure:**
```
[Show actual directory pattern, e.g.:
src/
lib/
utils.ts
utils.test.ts
services/
user-service.ts
user-service.test.ts
]
```
## Test Structure
**Suite Organization:**
```typescript
[Show actual pattern used, e.g.:
describe('ModuleName', () => {
describe('functionName', () => {
it('should handle success case', () => {
// arrange
// act
// assert
});
it('should handle error case', () => {
// test code
});
});
});
]
```
**Patterns:**
- [Setup: e.g., "beforeEach for shared setup, avoid beforeAll"]
- [Teardown: e.g., "afterEach to clean up, restore mocks"]
- [Structure: e.g., "arrange/act/assert pattern required"]
## Mocking
**Framework:**
- [Tool: e.g., "Jest built-in mocking", "Vitest vi", "Sinon"]
- [Import mocking: e.g., "vi.mock() at top of file"]
**Patterns:**
```typescript
[Show actual mocking pattern, e.g.:
// Mock external dependency
vi.mock('./external-service', () => ({
fetchData: vi.fn()
}));
// Mock in test
const mockFetch = vi.mocked(fetchData);
mockFetch.mockResolvedValue({ data: 'test' });
]
```
**What to Mock:**
- [e.g., "External APIs, file system, database"]
- [e.g., "Time/dates (use vi.useFakeTimers)"]
- [e.g., "Network calls (use mock fetch)"]
**What NOT to Mock:**
- [e.g., "Pure functions, utilities"]
- [e.g., "Internal business logic"]
## Fixtures and Factories
**Test Data:**
```typescript
[Show pattern for creating test data, e.g.:
// Factory pattern
function createTestUser(overrides?: Partial<User>): User {
return {
id: 'test-id',
name: 'Test User',
email: 'test@example.com',
...overrides
};
}
// Fixture file
// tests/fixtures/users.ts
export const mockUsers = [/* ... */];
]
```
**Location:**
- [e.g., "tests/fixtures/ for shared fixtures"]
- [e.g., "factory functions in test file or tests/factories/"]
## Coverage
**Requirements:**
- [Target: e.g., "80% line coverage", "no specific target"]
- [Enforcement: e.g., "CI blocks <80%", "coverage for awareness only"]
**Configuration:**
- [Tool: e.g., "built-in coverage via --coverage flag"]
- [Exclusions: e.g., "exclude *.test.ts, config files"]
**View Coverage:**
```bash
[e.g., "npm run test:coverage"]
[e.g., "open coverage/index.html"]
```
## Test Types
**Unit Tests:**
- [Scope: e.g., "test single function/class in isolation"]
- [Mocking: e.g., "mock all external dependencies"]
- [Speed: e.g., "must run in <1s per test"]
**Integration Tests:**
- [Scope: e.g., "test multiple modules together"]
- [Mocking: e.g., "mock external services, use real internal modules"]
- [Setup: e.g., "use test database, seed data"]
**E2E Tests:**
- [Framework: e.g., "Playwright for E2E"]
- [Scope: e.g., "test full user flows"]
- [Location: e.g., "e2e/ directory separate from unit tests"]
## Common Patterns
**Async Testing:**
```typescript
[Show pattern, e.g.:
it('should handle async operation', async () => {
const result = await asyncFunction();
expect(result).toBe('expected');
});
]
```
**Error Testing:**
```typescript
[Show pattern, e.g.:
it('should throw on invalid input', () => {
expect(() => functionCall()).toThrow('error message');
});
// Async error
it('should reject on failure', async () => {
await expect(asyncCall()).rejects.toThrow('error message');
});
]
```
**Snapshot Testing:**
- [Usage: e.g., "for React components only" or "not used"]
- [Location: e.g., "__snapshots__/ directory"]
---
*Testing analysis: [date]*
*Update when test patterns change*
```
<good_examples>
```markdown
# Testing Patterns
**Analysis Date:** 2025-01-20
## Test Framework
**Runner:**
- Vitest 1.0.4
- Config: vitest.config.ts in project root
**Assertion Library:**
- Vitest built-in expect
- Matchers: toBe, toEqual, toThrow, toMatchObject
**Run Commands:**
```bash
npm test # Run all tests
npm test -- --watch # Watch mode
npm test -- path/to/file.test.ts # Single file
npm run test:coverage # Coverage report
```
## Test File Organization
**Location:**
- *.test.ts alongside source files
- No separate tests/ directory
**Naming:**
- unit-name.test.ts for all tests
- No distinction between unit/integration in filename
**Structure:**
```
src/
lib/
parser.ts
parser.test.ts
services/
install-service.ts
install-service.test.ts
bin/
install.ts
(no test - integration tested via CLI)
```
## Test Structure
**Suite Organization:**
```typescript
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
describe('ModuleName', () => {
describe('functionName', () => {
beforeEach(() => {
// reset state
});
it('should handle valid input', () => {
// arrange
const input = createTestInput();
// act
const result = functionName(input);
// assert
expect(result).toEqual(expectedOutput);
});
it('should throw on invalid input', () => {
expect(() => functionName(null)).toThrow('Invalid input');
});
});
});
```
**Patterns:**
- Use beforeEach for per-test setup, avoid beforeAll
- Use afterEach to restore mocks: vi.restoreAllMocks()
- Explicit arrange/act/assert comments in complex tests
- One assertion focus per test (but multiple expects OK)
## Mocking
**Framework:**
- Vitest built-in mocking (vi)
- Module mocking via vi.mock() at top of test file
**Patterns:**
```typescript
import { vi } from 'vitest';
import { externalFunction } from './external';
// Mock module
vi.mock('./external', () => ({
externalFunction: vi.fn()
}));
describe('test suite', () => {
it('mocks function', () => {
const mockFn = vi.mocked(externalFunction);
mockFn.mockReturnValue('mocked result');
// test code using mocked function
expect(mockFn).toHaveBeenCalledWith('expected arg');
});
});
```
**What to Mock:**
- File system operations (fs-extra)
- Child process execution (child_process.exec)
- External API calls
- Environment variables (process.env)
**What NOT to Mock:**
- Internal pure functions
- Simple utilities (string manipulation, array helpers)
- TypeScript types
## Fixtures and Factories
**Test Data:**
```typescript
// Factory functions in test file
function createTestConfig(overrides?: Partial<Config>): Config {
return {
targetDir: '/tmp/test',
global: false,
...overrides
};
}
// Shared fixtures in tests/fixtures/
// tests/fixtures/sample-command.md
export const sampleCommand = `---
description: Test command
---
Content here`;
```
**Location:**
- Factory functions: define in test file near usage
- Shared fixtures: tests/fixtures/ (for multi-file test data)
- Mock data: inline in test when simple, factory when complex
## Coverage
**Requirements:**
- No enforced coverage target
- Coverage tracked for awareness
- Focus on critical paths (parsers, service logic)
**Configuration:**
- Vitest coverage via the built-in v8 provider
- Excludes: *.test.ts, bin/install.ts, config files
**View Coverage:**
```bash
npm run test:coverage
open coverage/index.html
```
## Test Types
**Unit Tests:**
- Test single function in isolation
- Mock all external dependencies (fs, child_process)
- Fast: each test <100ms
- Examples: parser.test.ts, validator.test.ts
**Integration Tests:**
- Test multiple modules together
- Mock only external boundaries (file system, process)
- Examples: install-service.test.ts (tests service + parser)
**E2E Tests:**
- Not currently used
- CLI integration tested manually
## Common Patterns
**Async Testing:**
```typescript
it('should handle async operation', async () => {
const result = await asyncFunction();
expect(result).toBe('expected');
});
```
**Error Testing:**
```typescript
it('should throw on invalid input', () => {
expect(() => parse(null)).toThrow('Cannot parse null');
});
// Async error
it('should reject on file not found', async () => {
await expect(readConfig('invalid.txt')).rejects.toThrow('ENOENT');
});
```
**File System Mocking:**
```typescript
import { vi } from 'vitest';
import * as fs from 'fs-extra';
vi.mock('fs-extra');
it('mocks file system', () => {
vi.mocked(fs.readFile).mockResolvedValue('file content');
// test code
});
```
**Snapshot Testing:**
- Not used in this codebase
- Prefer explicit assertions for clarity
---
*Testing analysis: 2025-01-20*
*Update when test patterns change*
```
</good_examples>
<guidelines>
**What belongs in TESTING.md:**
- Test framework and runner configuration
- Test file location and naming patterns
- Test structure (describe/it, beforeEach patterns)
- Mocking approach and examples
- Fixture/factory patterns
- Coverage requirements
- How to run tests (commands)
- Common testing patterns in actual code
**What does NOT belong here:**
- Specific test cases (defer to actual test files)
- Technology choices (that's STACK.md)
- CI/CD setup (that's deployment docs)
**When filling this template:**
- Check package.json scripts for test commands
- Find test config file (jest.config.js, vitest.config.ts)
- Read 3-5 existing test files to identify patterns
- Look for test utilities in tests/ or test-utils/
- Check for coverage configuration
- Document actual patterns used, not ideal patterns
**Useful for phase planning when:**
- Adding new features (write matching tests)
- Refactoring (maintain test patterns)
- Fixing bugs (add regression tests)
- Understanding verification approach
- Setting up test infrastructure
**Analysis approach:**
- Check package.json for test framework and scripts
- Read test config file for coverage, setup
- Examine test file organization (collocated vs separate)
- Review 5 test files for patterns (mocking, structure, assertions)
- Look for test utilities, fixtures, factories
- Note any test types (unit, integration, e2e)
- Document commands for running tests
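
A quick survey along these lines can seed the analysis; the file names below are common defaults (assumptions), not guaranteed for every project:

```shell
# Sketch: surface test scripts and runner config before filling the template.
scripts=$(grep -o '"test[^"]*"' package.json 2>/dev/null || echo "none found")
configs=$(ls vitest.config.* jest.config.* 2>/dev/null || echo "none found")
echo "test scripts: $scripts"
echo "runner configs: $configs"
```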
</guidelines>


@@ -0,0 +1,44 @@
{
"mode": "interactive",
"granularity": "standard",
"workflow": {
"research": true,
"plan_check": true,
"verifier": true,
"auto_advance": false,
"nyquist_validation": true,
"discuss_mode": "discuss",
"research_before_questions": false
},
"planning": {
"commit_docs": true,
"search_gitignored": false,
"sub_repos": []
},
"parallelization": {
"enabled": true,
"plan_level": true,
"task_level": false,
"skip_checkpoints": true,
"max_concurrent_agents": 3,
"min_plans_for_parallel": 2
},
"gates": {
"confirm_project": true,
"confirm_phases": true,
"confirm_roadmap": true,
"confirm_breakdown": true,
"confirm_plan": true,
"execute_next_plan": true,
"issues_review": true,
"confirm_transition": true
},
"safety": {
"always_confirm_destructive": true,
"always_confirm_external_services": true
},
"hooks": {
"context_warnings": true
},
"agent_skills": {}
}


@@ -0,0 +1,352 @@
# Phase Context Template
Template for `.planning/phases/XX-name/{phase_num}-CONTEXT.md` - captures implementation decisions for a phase.
**Purpose:** Document decisions that downstream agents need. Researcher uses this to know WHAT to investigate. Planner uses this to know WHAT choices are locked vs flexible.
**Key principle:** Categories are NOT predefined. They emerge from what was actually discussed for THIS phase: a CLI phase has CLI-relevant sections; a UI phase has UI-relevant sections.
**Downstream consumers:**
- `gsd-phase-researcher` - Reads decisions to focus research (e.g., "card layout" → research card component patterns)
- `gsd-planner` - Reads decisions to create specific tasks (e.g., "infinite scroll" → task includes virtualization)
---
## File Template
```markdown
# Phase [X]: [Name] - Context
**Gathered:** [date]
**Status:** Ready for planning
<domain>
## Phase Boundary
[Clear statement of what this phase delivers - the scope anchor. This comes from ROADMAP.md and is fixed. Discussion clarifies implementation within this boundary.]
</domain>
<decisions>
## Implementation Decisions
### [Area 1 that was discussed]
- **D-01:** [Specific decision made]
- **D-02:** [Another decision if applicable]
### [Area 2 that was discussed]
- **D-03:** [Specific decision made]
### [Area 3 that was discussed]
- **D-04:** [Specific decision made]
### Agent's Discretion
[Areas where user explicitly said "you decide" - the agent has flexibility here during planning/implementation]
</decisions>
<specifics>
## Specific Ideas
[Any particular references, examples, or "I want it like X" moments from discussion. Product references, specific behaviors, interaction patterns.]
[If none: "No specific requirements - open to standard approaches"]
</specifics>
<canonical_refs>
## Canonical References
**Downstream agents MUST read these before planning or implementing.**
[List every spec, ADR, feature doc, or design doc that defines requirements or constraints for this phase. Use full relative paths so agents can read them directly. Group by topic area when the phase has multiple concerns.]
### [Topic area 1]
- `path/to/spec-or-adr.md` - [What this doc decides/defines that's relevant]
- `path/to/doc.md` §N - [Specific section and what it covers]
### [Topic area 2]
- `path/to/feature-doc.md` - [What capability this defines]
[If the project has no external specs: "No external specs - requirements are fully captured in decisions above"]
</canonical_refs>
<code_context>
## Existing Code Insights
### Reusable Assets
- [Component/hook/utility]: [How it could be used in this phase]
### Established Patterns
- [Pattern]: [How it constrains/enables this phase]
### Integration Points
- [Where new code connects to existing system]
</code_context>
<deferred>
## Deferred Ideas
[Ideas that came up during discussion but belong in other phases. Captured here so they're not lost, but explicitly out of scope for this phase.]
[If none: "None - discussion stayed within phase scope"]
</deferred>
---
*Phase: XX-name*
*Context gathered: [date]*
```
<good_examples>
**Example 1: Visual feature (Post Feed)**
```markdown
# Phase 3: Post Feed - Context
**Gathered:** 2025-01-20
**Status:** Ready for planning
<domain>
## Phase Boundary
Display posts from followed users in a scrollable feed. Users can view posts and see engagement counts. Creating posts and interactions are separate phases.
</domain>
<decisions>
## Implementation Decisions
### Layout style
- Card-based layout, not timeline or list
- Each card shows: author avatar, name, timestamp, full post content, reaction counts
- Cards have subtle shadows, rounded corners - modern feel
### Loading behavior
- Infinite scroll, not pagination
- Pull-to-refresh on mobile
- New posts indicator at top ("3 new posts") rather than auto-inserting
### Empty state
- Friendly illustration + "Follow people to see posts here"
- Suggest 3-5 accounts to follow based on interests
### Agent's Discretion
- Loading skeleton design
- Exact spacing and typography
- Error state handling
</decisions>
<canonical_refs>
## Canonical References
### Feed display
- `docs/features/social-feed.md` - Feed requirements, post card fields, engagement display rules
- `docs/decisions/adr-012-infinite-scroll.md` - Scroll strategy decision, virtualization requirements
### Empty states
- `docs/design/empty-states.md` - Empty state patterns, illustration guidelines
</canonical_refs>
<specifics>
## Specific Ideas
- "I like how Twitter shows the new posts indicator without disrupting your scroll position"
- Cards should feel like Linear's issue cards - clean, not cluttered
</specifics>
<deferred>
## Deferred Ideas
- Commenting on posts - Phase 5
- Bookmarking posts - add to backlog
</deferred>
---
*Phase: 03-post-feed*
*Context gathered: 2025-01-20*
```
**Example 2: CLI tool (Database backup)**
```markdown
# Phase 2: Backup Command - Context
**Gathered:** 2025-01-20
**Status:** Ready for planning
<domain>
## Phase Boundary
CLI command to backup database to local file or S3. Supports full and incremental backups. Restore command is a separate phase.
</domain>
<decisions>
## Implementation Decisions
### Output format
- JSON for programmatic use, table format for humans
- Default to table, --json flag for JSON
- Verbose mode (-v) shows progress, silent by default
### Flag design
- Short flags for common options: -o (output), -v (verbose), -f (force)
- Long flags for clarity: --incremental, --compress, --encrypt
- Required: database connection string (positional or --db)
### Error recovery
- Retry 3 times on network failure, then fail with clear message
- --no-retry flag to fail fast
- Partial backups are deleted on failure (no corrupt files)
### Agent's Discretion
- Exact progress bar implementation
- Compression algorithm choice
- Temp file handling
</decisions>
<canonical_refs>
## Canonical References
### Backup CLI
- `docs/features/backup-restore.md` - Backup requirements, supported backends, encryption spec
- `docs/decisions/adr-007-cli-conventions.md` - Flag naming, exit codes, output format standards
</canonical_refs>
<specifics>
## Specific Ideas
- "I want it to feel like pg_dump - familiar to database people"
- Should work in CI pipelines (exit codes, no interactive prompts)
</specifics>
<deferred>
## Deferred Ideas
- Scheduled backups - separate phase
- Backup rotation/retention - add to backlog
</deferred>
---
*Phase: 02-backup-command*
*Context gathered: 2025-01-20*
```
**Example 3: Organization task (Photo library)**
```markdown
# Phase 1: Photo Organization - Context
**Gathered:** 2025-01-20
**Status:** Ready for planning
<domain>
## Phase Boundary
Organize existing photo library into structured folders. Handle duplicates and apply consistent naming. Tagging and search are separate phases.
</domain>
<decisions>
## Implementation Decisions
### Grouping criteria
- Primary grouping by year, then by month
- Events detected by time clustering (photos within 2 hours = same event)
- Event folders named by date + location if available
### Duplicate handling
- Keep highest resolution version
- Move duplicates to _duplicates folder (don't delete)
- Log all duplicate decisions for review
### Naming convention
- Format: YYYY-MM-DD_HH-MM-SS_originalname.ext
- Preserve original filename as suffix for searchability
- Handle name collisions with incrementing suffix
### Agent's Discretion
- Exact clustering algorithm
- How to handle photos with no EXIF data
- Folder emoji usage
</decisions>
<canonical_refs>
## Canonical References
### Organization rules
- `docs/features/photo-organization.md` - Grouping rules, duplicate policy, naming spec
- `docs/decisions/adr-003-exif-handling.md` - EXIF extraction strategy, fallback for missing metadata
</canonical_refs>
<specifics>
## Specific Ideas
- "I want to be able to find photos by roughly when they were taken"
- Don't delete anything - worst case, move to a review folder
</specifics>
<deferred>
## Deferred Ideas
- Face detection grouping - future phase
- Cloud sync - out of scope for now
</deferred>
---
*Phase: 01-photo-organization*
*Context gathered: 2025-01-20*
```
</good_examples>
<guidelines>
**This template captures DECISIONS for downstream agents.**
The output should answer: "What does the researcher need to investigate? What choices are locked for the planner?"
**Good content (concrete decisions):**
- "Card-based layout, not timeline"
- "Retry 3 times on network failure, then fail"
- "Group by year, then by month"
- "JSON for programmatic use, table for humans"
**Bad content (too vague):**
- "Should feel modern and clean"
- "Good user experience"
- "Fast and responsive"
- "Easy to use"
**After creation:**
- File lives in phase directory: `.planning/phases/XX-name/{phase_num}-CONTEXT.md`
- `gsd-phase-researcher` uses decisions to focus investigation AND reads canonical_refs to know WHAT docs to study
- `gsd-planner` uses decisions + research to create executable tasks AND reads canonical_refs to verify alignment
- Downstream agents should NOT need to ask the user again about captured decisions
**CRITICAL - Canonical references:**
- The `<canonical_refs>` section is MANDATORY. Every CONTEXT.md must have one.
- If your project has external specs, ADRs, or design docs, list them with full relative paths grouped by topic
- If ROADMAP.md lists `Canonical refs:` per phase, extract and expand those
- Inline mentions like "see ADR-019" scattered in decisions are useless to downstream agents - they need full paths and section references in a dedicated section they can find
- If no external specs exist, say so explicitly - don't silently omit the section
</guidelines>


@@ -0,0 +1,78 @@
# Continue-Here Template
Copy and fill this structure for `.planning/phases/XX-name/.continue-here.md`:
```yaml
---
phase: XX-name
task: 3
total_tasks: 7
status: in_progress
last_updated: 2025-01-15T14:30:00Z
---
```
```markdown
<current_state>
[Where exactly are we? What's the immediate context?]
</current_state>
<completed_work>
[What got done this session - be specific]
- Task 1: [name] - Done
- Task 2: [name] - Done
- Task 3: [name] - In progress, [what's done on it]
</completed_work>
<remaining_work>
[What's left in this phase]
- Task 3: [name] - [what's left to do]
- Task 4: [name] - Not started
- Task 5: [name] - Not started
</remaining_work>
<decisions_made>
[Key decisions and why - so next session doesn't re-debate]
- Decided to use [X] because [reason]
- Chose [approach] over [alternative] because [reason]
</decisions_made>
<blockers>
[Anything stuck or waiting on external factors]
- [Blocker 1]: [status/workaround]
</blockers>
<context>
[Mental state, "vibe", anything that helps resume smoothly]
[What were you thinking about? What was the plan?
This is the "pick up exactly where you left off" context.]
</context>
<next_action>
[The very first thing to do when resuming]
Start with: [specific action]
</next_action>
```
<yaml_fields>
Required YAML frontmatter:
- `phase`: Directory name (e.g., `02-authentication`)
- `task`: Current task number
- `total_tasks`: How many tasks in phase
- `status`: `in_progress`, `blocked`, `almost_done`
- `last_updated`: ISO timestamp
</yaml_fields>
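
The required fields above can be read back with a minimal parser; this sketch handles only the flat `key: value` frontmatter shown here, not full YAML:

```python
def parse_frontmatter(text: str) -> dict:
    """Read flat key: value pairs between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing marker: frontmatter ends here
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields

doc = """---
phase: 02-authentication
task: 3
status: in_progress
---
body"""
print(parse_frontmatter(doc)["phase"])  # 02-authentication
```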
<guidelines>
- Be specific enough that a fresh agent instance understands immediately
- Include WHY decisions were made, not just what
- The `<next_action>` should be actionable without reading anything else
- This file gets DELETED after resume - it's not permanent storage
</guidelines>


@@ -0,0 +1,7 @@
# Instructions for GSD
- Use the get-shit-done skill when the user asks for GSD or uses a `gsd-*` command.
- Treat `/gsd-...` or `gsd-...` as command invocations and load the matching file from `.github/skills/gsd-*`.
- When a command says to spawn a subagent, prefer a matching custom agent from `.github/agents`.
- Do not apply GSD workflows unless the user explicitly asks for them.
- After completing any `gsd-*` command (or any deliverable it triggers: feature, bug fix, tests, docs, etc.), ALWAYS offer the user the next step by prompting via `ask_user`, and repeat this feedback loop until the user explicitly indicates they are done.


@@ -0,0 +1,91 @@
# Debug Subagent Prompt Template
Template for spawning gsd-debugger agent. The agent contains all debugging expertise - this template provides problem context only.
---
## Template
```markdown
<objective>
Investigate issue: {issue_id}
**Summary:** {issue_summary}
</objective>
<symptoms>
expected: {expected}
actual: {actual}
errors: {errors}
reproduction: {reproduction}
timeline: {timeline}
</symptoms>
<mode>
symptoms_prefilled: {true_or_false}
goal: {find_root_cause_only | find_and_fix}
</mode>
<debug_file>
Create: .planning/debug/{slug}.md
</debug_file>
```
---
## Placeholders
| Placeholder | Source | Example |
|-------------|--------|---------|
| `{issue_id}` | Orchestrator-assigned | `auth-screen-dark` |
| `{issue_summary}` | User description | `Auth screen is too dark` |
| `{expected}` | From symptoms | `See logo clearly` |
| `{actual}` | From symptoms | `Screen is dark` |
| `{errors}` | From symptoms | `None in console` |
| `{reproduction}` | From symptoms | `Open /auth page` |
| `{timeline}` | From symptoms | `After recent deploy` |
| `{goal}` | Orchestrator sets | `find_and_fix` |
| `{slug}` | Generated | `auth-screen-dark` |
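
Slug generation is left to the orchestrator; one plausible sketch follows (the exact rule is an assumption, and GSD may shorten slugs differently, as the `auth-screen-dark` example suggests):

```python
import re

def make_slug(summary: str, max_len: int = 40) -> str:
    """Lowercase, replace non-alphanumeric runs with '-', trim to max_len."""
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    return slug[:max_len].rstrip("-")

print(make_slug("Auth screen is too dark"))  # auth-screen-is-too-dark
```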
---
## Usage
**From /gsd-debug:**
```python
Task(
prompt=filled_template,
subagent_type="gsd-debugger",
description="Debug {slug}"
)
```
**From diagnose-issues (UAT):**
```python
Task(prompt=template, subagent_type="gsd-debugger", description="Debug UAT-001")
```
---
## Continuation
For checkpoints, spawn fresh agent with:
```markdown
<objective>
Continue debugging {slug}. Evidence is in the debug file.
</objective>
<prior_state>
Debug file: @.planning/debug/{slug}.md
</prior_state>
<checkpoint_response>
**Type:** {checkpoint_type}
**Response:** {user_response}
</checkpoint_response>
<mode>
goal: {goal}
</mode>
```


@@ -0,0 +1,21 @@
---
description: Load developer preferences into this session
---
# Developer Preferences
> Generated by GSD on {{generated_at}} from {{data_source}}.
> Run `/gsd-profile-user --refresh` to regenerate.
## Behavioral Directives
Follow these directives when working with this developer. Higher-confidence
directives should be applied directly. Lower-confidence directives should be
tried with hedging ("Based on your profile, I'll try X -- let me know if
that's off").
{{behavioral_directives}}
## Stack Preferences
{{stack_preferences}}


@@ -0,0 +1,146 @@
# Discovery Template
Template for `.planning/phases/XX-name/DISCOVERY.md` - shallow research for library/option decisions.
**Purpose:** Answer "which library/option should we use" questions during mandatory discovery in plan-phase.
For deep ecosystem research ("how do experts build this"), use `/gsd-research-phase` which produces RESEARCH.md.
---
## File Template
```markdown
---
phase: XX-name
type: discovery
topic: [discovery-topic]
---
<session_initialization>
Before beginning discovery, verify today's date:
!`date +%Y-%m-%d`
Use this date when searching for "current" or "latest" information.
Example: If today is 2025-11-22, search for "2025" not "2024".
</session_initialization>
<discovery_objective>
Discover [topic] to inform [phase name] implementation.
Purpose: [What decision/implementation this enables]
Scope: [Boundaries]
Output: DISCOVERY.md with recommendation
</discovery_objective>
<discovery_scope>
<include>
- [Question to answer]
- [Area to investigate]
- [Specific comparison if needed]
</include>
<exclude>
- [Out of scope for this discovery]
- [Defer to implementation phase]
</exclude>
</discovery_scope>
<discovery_protocol>
**Source Priority:**
1. **Context7 MCP** - For library/framework documentation (current, authoritative)
2. **Official Docs** - For platform-specific or non-indexed libraries
3. **WebSearch** - For comparisons, trends, community patterns (verify all findings)
**Quality Checklist:**
Before completing discovery, verify:
- [ ] All claims have authoritative sources (Context7 or official docs)
- [ ] Negative claims ("X is not possible") verified with official documentation
- [ ] API syntax/configuration from Context7 or official docs (never WebSearch alone)
- [ ] WebSearch findings cross-checked with authoritative sources
- [ ] Recent updates/changelogs checked for breaking changes
- [ ] Alternative approaches considered (not just first solution found)
**Confidence Levels:**
- HIGH: Context7 or official docs confirm
- MEDIUM: WebSearch + Context7/official docs confirm
- LOW: WebSearch only or training knowledge only (mark for validation)
</discovery_protocol>
<output_structure>
Create `.planning/phases/XX-name/DISCOVERY.md`:
```markdown
# [Topic] Discovery
## Summary
[2-3 paragraph executive summary - what was researched, what was found, what's recommended]
## Primary Recommendation
[What to do and why - be specific and actionable]
## Alternatives Considered
[What else was evaluated and why not chosen]
## Key Findings
### [Category 1]
- [Finding with source URL and relevance to our case]
### [Category 2]
- [Finding with source URL and relevance]
## Code Examples
[Relevant implementation patterns, if applicable]
## Metadata
<metadata>
<confidence level="high|medium|low">
[Why this confidence level - based on source quality and verification]
</confidence>
<sources>
- [Primary authoritative sources used]
</sources>
<open_questions>
[What couldn't be determined or needs validation during implementation]
</open_questions>
<validation_checkpoints>
[If confidence is LOW or MEDIUM, list specific things to verify during implementation]
</validation_checkpoints>
</metadata>
```
</output_structure>
<success_criteria>
- All scope questions answered with authoritative sources
- Quality checklist items completed
- Clear primary recommendation
- Low-confidence findings marked with validation checkpoints
- Ready to inform PLAN.md creation
</success_criteria>
<guidelines>
**When to use discovery:**
- Technology choice unclear (library A vs B)
- Best practices needed for unfamiliar integration
- API/library investigation required
- Single decision pending
**When NOT to use:**
- Established patterns (CRUD, auth with known library)
- Implementation details (defer to execution)
- Questions answerable from existing project context
**When to use RESEARCH.md instead:**
- Niche/complex domains (3D, games, audio, shaders)
- Need ecosystem knowledge, not just library choice
- "How do experts build this" questions
- Use `/gsd-research-phase` for these
</guidelines>


@@ -0,0 +1,63 @@
# Discussion Log Template
Template for `.planning/phases/XX-name/{phase_num}-DISCUSSION-LOG.md` - audit trail of discuss-phase Q&A sessions.
**Purpose:** Audit trail for decision-making. Captures all options considered, not just the selected one. Separate from CONTEXT.md, which is the implementation artifact consumed by downstream agents.
**NOT for LLM consumption.** This file should never be referenced in `<files_to_read>` blocks or agent prompts.
## Format
```markdown
# Phase [X]: [Name] - Discussion Log
> **Audit trail only.** Do not use as input to planning, research, or execution agents.
> Decisions are captured in CONTEXT.md - this log preserves the alternatives considered.
**Date:** [ISO date]
**Phase:** [phase number]-[phase name]
**Areas discussed:** [comma-separated list]
---
## [Area 1 Name]
| Option | Description | Selected |
| ---------- | ------------------- | -------- |
| [Option 1] | [Brief description] | |
| [Option 2] | [Brief description] | ✓ |
| [Option 3] | [Brief description] | |
**User's choice:** [Selected option or verbatim free-text response]
**Notes:** [Any clarifications or rationale provided during discussion]
---
## [Area 2 Name]
...
---
## Agent Discretion
[Areas delegated to the agent's judgment - list what was deferred and why]
## Deferred Ideas
[Ideas mentioned but not in scope for this phase]
---
*Phase: XX-name*
*Discussion log generated: [date]*
```
## Rules
- Generated automatically at end of every discuss-phase session
- Includes ALL options considered, not just the selected one
- Includes user's freeform notes and clarifications
- Clearly marked as audit-only, not an implementation artifact
- Does NOT interfere with CONTEXT.md generation or downstream agent behavior
- Committed alongside CONTEXT.md in the same git commit


@@ -0,0 +1,123 @@
# Milestone Archive Template
This template is used by the complete-milestone workflow to create archive files in `.planning/milestones/`.
---
## File Template
```markdown
# Milestone v{{VERSION}}: {{MILESTONE_NAME}}
**Status:** ✅ SHIPPED {{DATE}}
**Phases:** {{PHASE_START}}-{{PHASE_END}}
**Total Plans:** {{TOTAL_PLANS}}
## Overview
{{MILESTONE_DESCRIPTION}}
## Phases
{{PHASES_SECTION}}
[For each phase in this milestone, include:]
### Phase {{PHASE_NUM}}: {{PHASE_NAME}}
**Goal**: {{PHASE_GOAL}}
**Depends on**: {{DEPENDS_ON}}
**Plans**: {{PLAN_COUNT}} plans
Plans:
- [x] {{PHASE}}-01: {{PLAN_DESCRIPTION}}
- [x] {{PHASE}}-02: {{PLAN_DESCRIPTION}}
[... all plans ...]
**Details:**
{{PHASE_DETAILS_FROM_ROADMAP}}
**For decimal phases, include (INSERTED) marker:**
### Phase 2.1: Critical Security Patch (INSERTED)
**Goal**: Fix authentication bypass vulnerability
**Depends on**: Phase 2
**Plans**: 1 plan
Plans:
- [x] 02.1-01: Patch auth vulnerability
**Details:**
{{PHASE_DETAILS_FROM_ROADMAP}}
---
## Milestone Summary
**Decimal Phases:**
- Phase 2.1: Critical Security Patch (inserted after Phase 2 for urgent fix)
- Phase 5.1: Performance Hotfix (inserted after Phase 5 for production issue)
**Key Decisions:**
{{DECISIONS_FROM_PROJECT_STATE}}
[Example:]
- Decision: Use ROADMAP.md split (Rationale: Constant context cost)
- Decision: Decimal phase numbering (Rationale: Clear insertion semantics)
**Issues Resolved:**
{{ISSUES_RESOLVED_DURING_MILESTONE}}
[Example:]
- Fixed context overflow at 100+ phases
- Resolved phase insertion confusion
**Issues Deferred:**
{{ISSUES_DEFERRED_TO_LATER}}
[Example:]
- PROJECT-STATE.md tiering (deferred until decisions > 300)
**Technical Debt Incurred:**
{{SHORTCUTS_NEEDING_FUTURE_WORK}}
[Example:]
- Some workflows still have hardcoded paths (fix in Phase 5)
---
_For current project status, see .planning/ROADMAP.md_
```
---
## Usage Guidelines
<guidelines>
**When to create milestone archives:**
- After completing all phases in a milestone (v1.0, v1.1, v2.0, etc.)
- Triggered by complete-milestone workflow
- Before planning next milestone work
**How to fill template:**
- Replace {{PLACEHOLDERS}} with actual values
- Extract phase details from ROADMAP.md
- Document decimal phases with (INSERTED) marker
- Include key decisions from PROJECT-STATE.md or SUMMARY files
- List issues resolved vs deferred
- Capture technical debt for future reference
**Archive location:**
- Save to `.planning/milestones/v{VERSION}-{NAME}.md`
- Example: `.planning/milestones/v1.0-mvp.md`
**After archiving:**
- Update ROADMAP.md to collapse completed milestone in `<details>` tag
- Update PROJECT.md to brownfield format with Current State section
- Continue phase numbering in next milestone (never restart at 01)
</guidelines>


@@ -0,0 +1,231 @@
# Milestone Context Template
Template for `.planning/MILESTONE-CONTEXT.md` — captures product scope decisions for an upcoming milestone.
**Purpose:** Document what the milestone should deliver so `/gsd-new-milestone` can start with known intent rather than gathering it inline. Consumed and deleted by `new-milestone` after it generates requirements and a roadmap.
**Key principle:** Product-level only. WHAT users will be able to do — not HOW it will be implemented. Implementation decisions happen in `/gsd-discuss-phase` per phase.
**Downstream consumer:**
- `new-milestone` — reads `<scope>` for feature scoping, `<constraints>` for requirements boundaries, `<success>` to inform success criteria in ROADMAP.md
---
## File Template
```markdown
# Milestone Context
**Gathered:** [date]
**Status:** Ready for /gsd-new-milestone
<milestone_goal>
## Goal
[One sentence: what this milestone delivers for users]
</milestone_goal>
<scope>
## Scope
### In this milestone
- **[Capability name]**: [What users can do — one line]
- **[Capability name]**: [What users can do — one line]
### Explicitly out of scope
- **[Capability name]**: [Reason — "deferred to next milestone", "separate product area", etc.]
[If no explicit exclusions: "No explicit exclusions — boundary is the in-scope list above"]
</scope>
<constraints>
## Constraints
- [Hard constraint — e.g., "no breaking changes to existing API"]
- [Hard constraint — e.g., "must work with existing auth system"]
[If none: "None — unconstrained milestone"]
</constraints>
<success>
## Success Definition
This milestone is successful when:
- [Observable user outcome — something that can be demoed]
- [Observable user outcome]
</success>
<open_questions>
## Open Questions for Planning
- [Question to resolve early in new-milestone or research]
[If none: "None — scope is clear"]
</open_questions>
---
*Milestone context gathered: [date]*
*Run /gsd-new-milestone to start planning*
```
<good_examples>
**Example 1: SaaS product — adding collaboration**
```markdown
# Milestone Context
**Gathered:** 2025-03-15
**Status:** Ready for /gsd-new-milestone
<milestone_goal>
## Goal
Users can invite teammates and collaborate on projects in real time.
</milestone_goal>
<scope>
## Scope
### In this milestone
- **Invite by email**: User can send invites to teammates by email address
- **Role-based access**: Owner, editor, and viewer roles with clear permission boundaries
- **Shared project view**: Teammates see the same project state with live updates
- **Activity feed**: Users can see who changed what and when
### Explicitly out of scope
- **SSO / SAML**: Enterprise auth deferred to v2.0
- **Guest links**: Public sharing without accounts — separate product decision needed
</scope>
<constraints>
## Constraints
- No breaking changes to existing project data model — solo users must not need to migrate
- Invite emails must go through existing SendGrid integration (no new email provider)
</constraints>
<success>
## Success Definition
This milestone is successful when:
- A user can invite a colleague and both see the same project within 60 seconds
- A viewer cannot accidentally edit or delete content
</success>
<open_questions>
## Open Questions for Planning
- Should activity feed be real-time (websocket) or polling? Affects architecture phase ordering.
- What happens to a project if the owner deletes their account?
</open_questions>
---
*Milestone context gathered: 2025-03-15*
*Run /gsd-new-milestone to start planning*
```
**Example 2: CLI tool — v1.1 reliability release**
```markdown
# Milestone Context
**Gathered:** 2025-04-01
**Status:** Ready for /gsd-new-milestone
<milestone_goal>
## Goal
The backup CLI is reliable enough for unattended production use.
</milestone_goal>
<scope>
## Scope
### In this milestone
- **Retry with backoff**: Transient network failures retry automatically, not silently fail
- **Structured logging**: Machine-readable log output for monitoring integration
- **Config file support**: Users can set defaults in a config file, not just flags
- **Dry-run mode**: Users can preview what would be backed up before committing
### Explicitly out of scope
- **Restore command**: Planned for v1.2
- **S3 backend**: Deferred — local filesystem only for now
</scope>
<constraints>
## Constraints
- Must remain backwards compatible with v1.0 flag interface — existing scripts must not break
- No new runtime dependencies (Node built-ins only)
</constraints>
<success>
## Success Definition
This milestone is successful when:
- A backup job can run overnight on a cron without manual intervention
- A failed run produces a log entry that tells an ops engineer exactly what went wrong
</success>
<open_questions>
## Open Questions for Planning
- Should config file use TOML, JSON, or dotenv format? Research common CLI conventions.
</open_questions>
---
*Milestone context gathered: 2025-04-01*
*Run /gsd-new-milestone to start planning*
```
</good_examples>
<guidelines>
**What makes a good MILESTONE-CONTEXT.md:**
Good goal (specific, user-observable):
- "Users can invite teammates and collaborate on projects in real time."
- "The backup CLI is reliable enough for unattended production use."
Bad goal (too vague):
- "Improve collaboration features"
- "Make things more reliable"
Good scope item (user action):
- "User can invite colleagues by email address"
- "Dry-run mode previews changes before committing"
Bad scope item (implementation detail):
- "Add Redis pub/sub for real-time updates"
- "Refactor retry logic in backup module"
**After creation:**
- File lives at `.planning/MILESTONE-CONTEXT.md`
- `new-milestone` reads it in step 2, uses it for requirements scoping, then deletes it
- It does NOT persist — it's a handoff document, not a record
</guidelines>


@@ -0,0 +1,115 @@
# Milestone Entry Template
Add this entry to `.planning/MILESTONES.md` when completing a milestone:
```markdown
## v[X.Y] [Name] (Shipped: YYYY-MM-DD)
**Delivered:** [One sentence describing what shipped]
**Phases completed:** [X-Y] ([Z] plans total)
**Key accomplishments:**
- [Major achievement 1]
- [Major achievement 2]
- [Major achievement 3]
- [Major achievement 4]
**Stats:**
- [X] files created/modified
- [Y] lines of code (primary language)
- [Z] phases, [N] plans, [M] tasks
- [D] days from start to ship (or milestone to milestone)
**Git range:** `feat(XX-XX)` → `feat(YY-YY)`
**What's next:** [Brief description of next milestone goals, or "Project complete"]
---
```
<structure>
If MILESTONES.md doesn't exist, create it with header:
```markdown
# Project Milestones: [Project Name]
[Entries in reverse chronological order - newest first]
```
</structure>
<guidelines>
**When to create milestones:**
- Initial v1.0 MVP shipped
- Major version releases (v2.0, v3.0)
- Significant feature milestones (v1.1, v1.2)
- Before archiving planning (capture what was shipped)
**Don't create milestones for:**
- Individual phase completions (normal workflow)
- Work in progress (wait until shipped)
- Minor bug fixes that don't constitute a release
**Stats to include:**
- Count modified files: `git diff --stat feat(XX-XX)..feat(YY-YY) | tail -1`
- Count LOC: `find . \( -name "*.swift" -o -name "*.ts" \) -print0 | xargs -0 wc -l` (or relevant extension)
- Phase/plan/task counts from ROADMAP
- Timeline from first phase commit to last phase commit
**Git range format:**
- First commit of milestone → last commit of milestone
- Example: `feat(01-01)` → `feat(04-01)` for phases 1-4
</guidelines>
<example>
```markdown
# Project Milestones: WeatherBar
## v1.1 Security & Polish (Shipped: 2025-12-10)
**Delivered:** Security hardening with Keychain integration and comprehensive error handling
**Phases completed:** 5-6 (3 plans total)
**Key accomplishments:**
- Migrated API key storage from plaintext to macOS Keychain
- Implemented comprehensive error handling for network failures
- Added Sentry crash reporting integration
- Fixed memory leak in auto-refresh timer
**Stats:**
- 23 files modified
- 650 lines of Swift added
- 2 phases, 3 plans, 12 tasks
- 8 days from v1.0 to v1.1
**Git range:** `feat(05-01)` → `feat(06-02)`
**What's next:** v2.0 SwiftUI redesign with widget support
---
## v1.0 MVP (Shipped: 2025-11-25)
**Delivered:** Menu bar weather app with current conditions and 3-day forecast
**Phases completed:** 1-4 (7 plans total)
**Key accomplishments:**
- Menu bar app with popover UI (AppKit)
- OpenWeather API integration with auto-refresh
- Current weather display with conditions icon
- 3-day forecast list with high/low temperatures
- Code signed and notarized for distribution
**Stats:**
- 47 files created
- 2,450 lines of Swift
- 4 phases, 7 plans, 28 tasks
- 12 days from start to ship
**Git range:** `feat(01-01)` → `feat(04-01)`
**What's next:** Security audit and hardening for v1.1
```
</example>


@@ -0,0 +1,610 @@
# Phase Prompt Template
> **Note:** Planning methodology is in `agents/gsd-planner.md`.
> This template defines the PLAN.md output format that the agent produces.
Template for `.planning/phases/XX-name/{phase}-{plan}-PLAN.md` - executable phase plans optimized for parallel execution.
**Naming:** Use `{phase}-{plan}-PLAN.md` format (e.g., `01-02-PLAN.md` for Phase 1, Plan 2)
---
## File Template
```markdown
---
phase: XX-name
plan: NN
type: execute
wave: N # Execution wave (1, 2, 3...). Pre-computed at plan time.
depends_on: [] # Plan IDs this plan requires (e.g., ["01-01"]).
files_modified: [] # Files this plan modifies.
autonomous: true # false if plan has checkpoints requiring user interaction
requirements: [] # REQUIRED - Requirement IDs from ROADMAP this plan addresses. MUST NOT be empty.
user_setup: [] # Human-required setup the agent cannot automate (see below)
# Goal-backward verification (derived during planning, verified after execution)
must_haves:
truths: [] # Observable behaviors that must be true for goal achievement
artifacts: [] # Files that must exist with real implementation
key_links: [] # Critical connections between artifacts
---
<objective>
[What this plan accomplishes]
Purpose: [Why this matters for the project]
Output: [What artifacts will be created]
</objective>
<execution_context>
@.pi/gsd/workflows/execute-plan.md
@.pi/gsd/templates/summary.md
[If plan contains checkpoint tasks (type="checkpoint:*"), add:]
@.pi/gsd/references/checkpoints.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
# Only reference prior plan SUMMARYs if genuinely needed:
# - This plan uses types/exports from prior plan
# - Prior plan made decision that affects this plan
# Do NOT reflexively chain: Plan 02 refs 01, Plan 03 refs 02...
[Relevant source files:]
@src/path/to/relevant.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: [Action-oriented name]</name>
<files>path/to/file.ext, another/file.ext</files>
<read_first>path/to/reference.ext, path/to/source-of-truth.ext</read_first>
<action>[Specific implementation - what to do, how to do it, what to avoid and WHY. Include CONCRETE values: exact identifiers, parameters, expected outputs, file paths, command arguments. Never say "align X with Y" without specifying the exact target state.]</action>
<verify>[Command or check to prove it worked]</verify>
<acceptance_criteria>
- [Grep-verifiable condition: "file.ext contains 'exact string'"]
- [Measurable condition: "output.ext uses 'expected-value', NOT 'wrong-value'"]
</acceptance_criteria>
<done>[Measurable acceptance criteria]</done>
</task>
<task type="auto">
<name>Task 2: [Action-oriented name]</name>
<files>path/to/file.ext</files>
<read_first>path/to/reference.ext</read_first>
<action>[Specific implementation with concrete values]</action>
<verify>[Command or check]</verify>
<acceptance_criteria>
- [Grep-verifiable condition]
</acceptance_criteria>
<done>[Acceptance criteria]</done>
</task>
<!-- For checkpoint task examples and patterns, see @.pi/gsd/references/checkpoints.md -->
<task type="checkpoint:decision" gate="blocking">
<decision>[What needs deciding]</decision>
<context>[Why this decision matters]</context>
<options>
<option id="option-a"><name>[Name]</name><pros>[Benefits]</pros><cons>[Tradeoffs]</cons></option>
<option id="option-b"><name>[Name]</name><pros>[Benefits]</pros><cons>[Tradeoffs]</cons></option>
</options>
<resume-signal>Select: option-a or option-b</resume-signal>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>[What the agent built] - server running at [URL]</what-built>
<how-to-verify>Visit [URL] and verify: [visual checks only, NO CLI commands]</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
</tasks>
<verification>
Before declaring plan complete:
- [ ] [Specific test command]
- [ ] [Build/type check passes]
- [ ] [Behavior verification]
</verification>
<success_criteria>
- All tasks completed
- All verification checks pass
- No errors or warnings introduced
- [Plan-specific criteria]
</success_criteria>
<output>
After completion, create `.planning/phases/XX-name/{phase}-{plan}-SUMMARY.md`
</output>
```
---
## Frontmatter Fields
| Field | Required | Purpose |
| ---------------- | -------- | ------------------------------------------------------------------------------------------------------- |
| `phase` | Yes | Phase identifier (e.g., `01-foundation`) |
| `plan` | Yes | Plan number within phase (e.g., `01`, `02`) |
| `type`           | Yes      | `execute` for standard plans, `tdd` for TDD plans                                                        |
| `wave` | Yes | Execution wave number (1, 2, 3...). Pre-computed at plan time. |
| `depends_on` | Yes | Array of plan IDs this plan requires. |
| `files_modified` | Yes | Files this plan touches. |
| `autonomous` | Yes | `true` if no checkpoints, `false` if has checkpoints |
| `requirements` | Yes | **MUST** list requirement IDs from ROADMAP. Every roadmap requirement MUST appear in at least one plan. |
| `user_setup` | No | Array of human-required setup items (external services) |
| `must_haves` | Yes | Goal-backward verification criteria (see below) |
**Wave is pre-computed:** Wave numbers are assigned during `/gsd-plan-phase`. Execute-phase reads `wave` directly from frontmatter and groups plans by wave number. No runtime dependency analysis needed.
**Must-haves enable verification:** The `must_haves` field carries goal-backward requirements from planning to execution. After all plans complete, execute-phase spawns a verification subagent that checks these criteria against the actual codebase.
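The grouping step is simple enough to sketch. A minimal illustration in Python, assuming plans have already been parsed into dicts mirroring the frontmatter fields above (the plan dict shape is illustrative, not the actual parser output):

```python
# Group parsed plan frontmatter by pre-computed `wave` number.
# No dependency analysis happens here - waves were assigned at plan time.
from collections import defaultdict

def group_by_wave(plans):
    """Return waves as an ordered list of plan lists (wave 1 first)."""
    waves = defaultdict(list)
    for plan in plans:
        waves[plan["wave"]].append(plan)
    return [waves[w] for w in sorted(waves)]

plans = [
    {"plan": "01", "wave": 1, "depends_on": []},
    {"plan": "02", "wave": 1, "depends_on": []},
    {"plan": "03", "wave": 2, "depends_on": ["01"]},
]
waves = group_by_wave(plans)
# waves[0] holds the two independent wave-1 plans; waves[1] holds plan 03
```

Plans within one wave can then be dispatched in parallel; the next wave starts only after the previous one finishes.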
---
## Parallel vs Sequential
<parallel_examples>
**Wave 1 candidates (parallel):**
```yaml
# Plan 01 - User feature
wave: 1
depends_on: []
files_modified: [src/models/user.ts, src/api/users.ts]
autonomous: true
# Plan 02 - Product feature (no overlap with Plan 01)
wave: 1
depends_on: []
files_modified: [src/models/product.ts, src/api/products.ts]
autonomous: true
# Plan 03 - Order feature (no overlap)
wave: 1
depends_on: []
files_modified: [src/models/order.ts, src/api/orders.ts]
autonomous: true
```
All three run in parallel (Wave 1) - no dependencies, no file conflicts.
**Sequential (genuine dependency):**
```yaml
# Plan 01 - Auth foundation
wave: 1
depends_on: []
files_modified: [src/lib/auth.ts, src/middleware/auth.ts]
autonomous: true
# Plan 02 - Protected features (needs auth)
wave: 2
depends_on: ["01"]
files_modified: [src/features/dashboard.ts]
autonomous: true
```
Plan 02 in Wave 2 waits for Plan 01 in Wave 1 - genuine dependency on auth types/middleware.
**Checkpoint plan:**
```yaml
# Plan 03 - UI with verification
wave: 3
depends_on: ["01", "02"]
files_modified: [src/components/Dashboard.tsx]
autonomous: false # Has checkpoint:human-verify
```
Wave 3 runs after Waves 1 and 2. Pauses at checkpoint, orchestrator presents to user, resumes on approval.
</parallel_examples>
---
## Context Section
**Parallel-aware context:**
```markdown
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
# Only include SUMMARY refs if genuinely needed:
# - This plan imports types from prior plan
# - Prior plan made decision affecting this plan
# - Prior plan's output is input to this plan
#
# Independent plans need NO prior SUMMARY references.
# Do NOT reflexively chain: 02 refs 01, 03 refs 02...
@src/relevant/source.ts
</context>
```
**Bad pattern (creates false dependencies):**
```markdown
<context>
@.planning/phases/03-features/03-01-SUMMARY.md # Just because it's earlier
@.planning/phases/03-features/03-02-SUMMARY.md # Reflexive chaining
</context>
```
---
## Scope Guidance
**Plan sizing:**
- 2-3 tasks per plan
- ~50% context usage maximum
- Complex phases: Multiple focused plans, not one large plan
**When to split:**
- Different subsystems (auth vs API vs UI)
- >3 tasks
- Risk of context overflow
- TDD candidates - separate plans
**Vertical slices preferred:**
```
PREFER: Plan 01 = User (model + API + UI)
Plan 02 = Product (model + API + UI)
AVOID: Plan 01 = All models
Plan 02 = All APIs
Plan 03 = All UIs
```
---
## TDD Plans
TDD features get dedicated plans with `type: tdd`.
**Heuristic:** Can you write `expect(fn(input)).toBe(output)` before writing `fn`?
→ Yes: Create a TDD plan
→ No: Standard task in standard plan
See `.pi/gsd/references/tdd.md` for TDD plan structure.
---
## Task Types
| Type | Use For | Autonomy |
| ------------------------- | ----------------------------------------- | ------------------------------- |
| `auto` | Everything the agent can do independently | Fully autonomous |
| `checkpoint:human-verify` | Visual/functional verification | Pauses, returns to orchestrator |
| `checkpoint:decision` | Implementation choices | Pauses, returns to orchestrator |
| `checkpoint:human-action` | Truly unavoidable manual steps (rare) | Pauses, returns to orchestrator |
**Checkpoint behavior in parallel execution:**
- Plan runs until checkpoint
- Agent returns with checkpoint details + agent_id
- Orchestrator presents to user
- User responds
- Orchestrator resumes agent with `resume: agent_id`
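The pause/resume handshake can be modeled with a generator, purely as an illustration: `yield` stands in for "return checkpoint details to the orchestrator" and `send()` for "resume agent with the user's response". The real agent_id plumbing belongs to the orchestrator and is not shown:

```python
# Illustration only: the checkpoint handshake as a generator.
def run_plan():
    # ...autonomous tasks execute here first...
    response = yield {
        "type": "checkpoint:human-verify",
        "resume_signal": 'Type "approved" or describe issues',
    }
    if response == "approved":
        return "complete"
    return f"needs revision: {response}"

agent = run_plan()
checkpoint = next(agent)       # plan runs until its checkpoint, then pauses
try:
    agent.send("approved")     # orchestrator forwards the user's reply
except StopIteration as done:
    result = done.value        # plan finished after approval
```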
---
## Examples
**Autonomous parallel plan:**
```markdown
---
phase: 03-features
plan: 01
type: execute
wave: 1
depends_on: []
files_modified: [src/features/user/model.ts, src/features/user/api.ts, src/features/user/UserList.tsx]
autonomous: true
---
<objective>
Implement complete User feature as vertical slice.
Purpose: Self-contained user management that can run parallel to other features.
Output: User model, API endpoints, and UI components.
</objective>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
</context>
<tasks>
<task type="auto">
<name>Task 1: Create User model</name>
<files>src/features/user/model.ts</files>
<action>Define User type with id, email, name, createdAt. Export TypeScript interface.</action>
<verify>tsc --noEmit passes</verify>
<done>User type exported and usable</done>
</task>
<task type="auto">
<name>Task 2: Create User API endpoints</name>
<files>src/features/user/api.ts</files>
<action>GET /users (list), GET /users/:id (single), POST /users (create). Use User type from model.</action>
<verify>fetch tests pass for all endpoints</verify>
<done>All CRUD operations work</done>
</task>
</tasks>
<verification>
- [ ] npm run build succeeds
- [ ] API endpoints respond correctly
</verification>
<success_criteria>
- All tasks completed
- User feature works end-to-end
</success_criteria>
<output>
After completion, create `.planning/phases/03-features/03-01-SUMMARY.md`
</output>
```
**Plan with checkpoint (non-autonomous):**
```markdown
---
phase: 03-features
plan: 03
type: execute
wave: 2
depends_on: ["03-01", "03-02"]
files_modified: [src/components/Dashboard.tsx]
autonomous: false
---
<objective>
Build dashboard with visual verification.
Purpose: Integrate user and product features into unified view.
Output: Working dashboard component.
</objective>
<execution_context>
@.pi/gsd/workflows/execute-plan.md
@.pi/gsd/templates/summary.md
@.pi/gsd/references/checkpoints.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/03-features/03-01-SUMMARY.md
@.planning/phases/03-features/03-02-SUMMARY.md
</context>
<tasks>
<task type="auto">
<name>Task 1: Build Dashboard layout</name>
<files>src/components/Dashboard.tsx</files>
<action>Create responsive grid with UserList and ProductList components. Use Tailwind for styling.</action>
<verify>npm run build succeeds</verify>
<done>Dashboard renders without errors</done>
</task>
<!-- Checkpoint pattern: the agent starts server, user visits URL. See checkpoints.md for full patterns. -->
<task type="auto">
<name>Start dev server</name>
<action>Run `npm run dev` in background, wait for ready</action>
<verify>fetch http://localhost:3000 returns 200</verify>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Dashboard - server at http://localhost:3000</what-built>
<how-to-verify>Visit localhost:3000/dashboard. Check: desktop grid, mobile stack, no scroll issues.</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
</tasks>
<verification>
- [ ] npm run build succeeds
- [ ] Visual verification passed
</verification>
<success_criteria>
- All tasks completed
- User approved visual layout
</success_criteria>
<output>
After completion, create `.planning/phases/03-features/03-03-SUMMARY.md`
</output>
```
---
## Anti-Patterns
**Bad: Reflexive dependency chaining**
```yaml
depends_on: ["03-01"] # Just because 01 comes before 02
```
**Bad: Horizontal layer grouping**
```
Plan 01: All models
Plan 02: All APIs (depends on 01)
Plan 03: All UIs (depends on 02)
```
**Bad: Missing autonomy flag**
```yaml
# Has checkpoint but no autonomous: false
depends_on: []
files_modified: [...]
# autonomous: ??? <- Missing!
```
**Bad: Vague tasks**
```xml
<task type="auto">
<name>Set up authentication</name>
<action>Add auth to the app</action>
</task>
```
**Bad: Missing read_first (executor modifies files it hasn't read)**
```xml
<task type="auto">
<name>Update database config</name>
<files>src/config/database.ts</files>
<!-- No read_first! Executor doesn't know current state or conventions -->
<action>Update the database config to match production settings</action>
</task>
```
**Bad: Vague acceptance criteria (not verifiable)**
```xml
<acceptance_criteria>
- Config is properly set up
- Database connection works correctly
</acceptance_criteria>
```
**Good: Concrete with read_first + verifiable criteria**
```xml
<task type="auto">
<name>Update database config for connection pooling</name>
<files>src/config/database.ts</files>
<read_first>src/config/database.ts, .env.example, docker-compose.yml</read_first>
<action>Add pool configuration: min=2, max=20, idleTimeoutMs=30000. Add SSL config: rejectUnauthorized=true when NODE_ENV=production. Add .env.example entry: DATABASE_POOL_MAX=20.</action>
<acceptance_criteria>
- database.ts contains "max: 20" and "idleTimeoutMillis: 30000"
- database.ts contains SSL conditional on NODE_ENV
- .env.example contains DATABASE_POOL_MAX
</acceptance_criteria>
</task>
```
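Criteria written this way are mechanically checkable. A minimal sketch, with file contents inlined as stand-ins for reading the paths listed in `<files>`:

```python
# Check grep-style acceptance criteria against file contents.
# The inline `files` dict is a stand-in for reading from disk.
import re

files = {
    "src/config/database.ts":
        "pool: { max: 20, idleTimeoutMillis: 30000 },\n"
        "ssl: process.env.NODE_ENV === 'production'"
        " ? { rejectUnauthorized: true } : false,",
    ".env.example": "DATABASE_POOL_MAX=20\n",
}

criteria = [
    ("src/config/database.ts", r"max: 20"),
    ("src/config/database.ts", r"idleTimeoutMillis: 30000"),
    (".env.example", r"DATABASE_POOL_MAX"),
]

failures = [(path, pat) for path, pat in criteria
            if not re.search(pat, files[path])]
# an empty `failures` list means every criterion passed
```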
---
## Guidelines
- Always use XML structure for the agent parsing
- Include `wave`, `depends_on`, `files_modified`, `autonomous` in every plan
- Prefer vertical slices over horizontal layers
- Only reference prior SUMMARYs when genuinely needed
- Group checkpoints with related auto tasks in same plan
- 2-3 tasks per plan, ~50% context max
---
## User Setup (External Services)
When a plan introduces external services requiring human configuration, declare in frontmatter:
```yaml
user_setup:
- service: stripe
why: "Payment processing requires API keys"
env_vars:
- name: STRIPE_SECRET_KEY
source: "Stripe Dashboard → Developers → API keys → Secret key"
- name: STRIPE_WEBHOOK_SECRET
source: "Stripe Dashboard → Developers → Webhooks → Signing secret"
dashboard_config:
- task: "Create webhook endpoint"
location: "Stripe Dashboard → Developers → Webhooks → Add endpoint"
details: "URL: https://[your-domain]/api/webhooks/stripe"
local_dev:
- "stripe listen --forward-to localhost:3000/api/webhooks/stripe"
```
**The automation-first rule:** `user_setup` contains ONLY what the agent literally cannot do:
- Account creation (requires human signup)
- Secret retrieval (requires dashboard access)
- Dashboard configuration (requires human in browser)
**NOT included:** Package installs, code changes, file creation, CLI commands the agent can run.
**Result:** Execute-plan generates `{phase}-USER-SETUP.md` with checklist for the user.
See `.pi/gsd/templates/user-setup.md` for full schema and examples.
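Checklist generation can be sketched as a simple rendering pass over the `user_setup` entries. Field names match the YAML above; the exact headings of the real `{phase}-USER-SETUP.md` are defined in `user-setup.md`, so the output format here is illustrative:

```python
# Render user_setup frontmatter entries into a human checklist.
user_setup = [{
    "service": "stripe",
    "why": "Payment processing requires API keys",
    "env_vars": [
        {"name": "STRIPE_SECRET_KEY",
         "source": "Stripe Dashboard → Developers → API keys → Secret key"},
    ],
}]

def render_checklist(entries):
    lines = []
    for entry in entries:
        lines.append(f"## {entry['service']}: {entry['why']}")
        for var in entry.get("env_vars", []):
            lines.append(f"- [ ] Set `{var['name']}` ({var['source']})")
    return "\n".join(lines)

checklist = render_checklist(user_setup)
```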
---
## Must-Haves (Goal-Backward Verification)
The `must_haves` field defines what must be TRUE for the phase goal to be achieved. Derived during planning, verified after execution.
**Structure:**
```yaml
must_haves:
truths:
- "User can see existing messages"
- "User can send a message"
- "Messages persist across refresh"
artifacts:
- path: "src/components/Chat.tsx"
provides: "Message list rendering"
min_lines: 30
- path: "src/app/api/chat/route.ts"
provides: "Message CRUD operations"
exports: ["GET", "POST"]
- path: "prisma/schema.prisma"
provides: "Message model"
contains: "model Message"
key_links:
- from: "src/components/Chat.tsx"
to: "/api/chat"
via: "fetch in useEffect"
pattern: "fetch.*api/chat"
- from: "src/app/api/chat/route.ts"
to: "prisma.message"
via: "database query"
pattern: "prisma\\.message\\.(find|create)"
```
**Field descriptions:**
| Field | Purpose |
| ----------------------- | ------------------------------------------------------------------ |
| `truths` | Observable behaviors from user perspective. Each must be testable. |
| `artifacts` | Files that must exist with real implementation. |
| `artifacts[].path` | File path relative to project root. |
| `artifacts[].provides` | What this artifact delivers. |
| `artifacts[].min_lines` | Optional. Minimum lines to be considered substantive. |
| `artifacts[].exports` | Optional. Expected exports to verify. |
| `artifacts[].contains` | Optional. Pattern that must exist in file. |
| `key_links` | Critical connections between artifacts. |
| `key_links[].from` | Source artifact. |
| `key_links[].to` | Target artifact or endpoint. |
| `key_links[].via` | How they connect (description). |
| `key_links[].pattern` | Optional. Regex to verify connection exists. |
**Why this matters:**
Task completion ≠ Goal achievement. A task "create chat component" can complete by creating a placeholder. The `must_haves` field captures what must actually work, enabling verification to catch gaps before they compound.
**Verification flow:**
1. Plan-phase derives must_haves from phase goal (goal-backward)
2. Must_haves written to PLAN.md frontmatter
3. Execute-phase runs all plans
4. Verification subagent checks must_haves against codebase
5. Gaps found → fix plans created → execute → re-verify
6. All must_haves pass → phase complete
See `.pi/gsd/workflows/verify-phase.md` for verification logic.
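To make the artifact and key-link checks concrete, the gap detection in step 4 could look roughly like this Python sketch. The function names, gap messages, and the omission of the `exports` check are illustrative assumptions, not the actual verifier:

```python
import re
from pathlib import Path

def check_artifact(spec: dict, root: Path = Path(".")) -> list[str]:
    """Check one artifacts[] entry; return a list of gap descriptions."""
    path = root / spec["path"]
    if not path.is_file():
        return [f"{spec['path']}: missing"]
    text = path.read_text()
    gaps = []
    if spec.get("min_lines") and len(text.splitlines()) < spec["min_lines"]:
        gaps.append(f"{spec['path']}: under {spec['min_lines']} lines")
    if spec.get("contains") and spec["contains"] not in text:
        gaps.append(f"{spec['path']}: missing '{spec['contains']}'")
    # exports checking would need language-aware parsing; omitted here
    return gaps

def check_link(link: dict, root: Path = Path(".")) -> list[str]:
    """Check one key_links[] entry by searching its regex in the source file."""
    src = root / link["from"]
    if not src.is_file():
        return [f"{link['from']}: missing"]
    if link.get("pattern") and not re.search(link["pattern"], src.read_text()):
        return [f"{link['from']} -> {link['to']}: no match for {link['pattern']!r}"]
    return []
```

Any gaps returned would feed the fix-plan loop in steps 5-6.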

# Planner Subagent Prompt Template
Template for spawning gsd-planner agent. The agent contains all planning expertise - this template provides planning context only.
---
## Template
```markdown
<planning_context>
**Phase:** {phase_number}
**Mode:** {standard | gap_closure}
**Project State:**
@.planning/STATE.md
**Roadmap:**
@.planning/ROADMAP.md
**Requirements (if exists):**
@.planning/REQUIREMENTS.md
**Phase Context (if exists):**
@.planning/phases/{phase_dir}/{phase_num}-CONTEXT.md
**Research (if exists):**
@.planning/phases/{phase_dir}/{phase_num}-RESEARCH.md
**Gap Closure (if --gaps mode):**
@.planning/phases/{phase_dir}/{phase_num}-VERIFICATION.md
@.planning/phases/{phase_dir}/{phase_num}-UAT.md
</planning_context>
<downstream_consumer>
Output consumed by /gsd-execute-phase
Plans must be executable prompts with:
- Frontmatter (wave, depends_on, files_modified, autonomous)
- Tasks in XML format
- Verification criteria
- must_haves for goal-backward verification
</downstream_consumer>
<quality_gate>
Before returning PLANNING COMPLETE:
- [ ] PLAN.md files created in phase directory
- [ ] Each plan has valid frontmatter
- [ ] Tasks are specific and actionable
- [ ] Dependencies correctly identified
- [ ] Waves assigned for parallel execution
- [ ] must_haves derived from phase goal
</quality_gate>
```
---
## Placeholders
| Placeholder | Source | Example |
|-------------|--------|---------|
| `{phase_number}` | From roadmap/arguments | `5` or `2.1` |
| `{phase_dir}` | Phase directory name | `05-user-profiles` |
| `{phase}` | Phase prefix | `05` |
| `{standard \| gap_closure}` | Mode flag | `standard` |
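As an illustration, filling these placeholders before spawning could be done with a small helper like the one below; the helper and its leftover check are assumptions, not the actual /gsd-plan-phase code:

```python
def fill_template(template: str, values: dict[str, str]) -> str:
    """Substitute {placeholder} tokens; fail loudly if a known token is left."""
    # Replace longer keys first so "{phase}" cannot clobber "{phase_number}".
    for key in sorted(values, key=len, reverse=True):
        template = template.replace("{" + key + "}", values[key])
    leftover = [t for t in ("{phase_number}", "{phase_dir}", "{phase}") if t in template]
    if leftover:
        raise ValueError(f"unfilled placeholders: {leftover}")
    return template

prompt = fill_template(
    "**Phase:** {phase_number}\n@.planning/phases/{phase_dir}/{phase}-CONTEXT.md",
    {"phase_number": "5", "phase_dir": "05-user-profiles", "phase": "05"},
)
```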
---
## Usage
**From /gsd-plan-phase (standard mode):**
```python
Task(
prompt=filled_template,
subagent_type="gsd-planner",
description="Plan Phase {phase}"
)
```
**From /gsd-plan-phase --gaps (gap closure mode):**
```python
Task(
prompt=filled_template, # with mode: gap_closure
subagent_type="gsd-planner",
description="Plan gaps for Phase {phase}"
)
```
---
## Continuation
For checkpoints, spawn a fresh agent with:
```markdown
<objective>
Continue planning for Phase {phase_number}: {phase_name}
</objective>
<prior_state>
Phase directory: @.planning/phases/{phase_dir}/
Existing plans: @.planning/phases/{phase_dir}/*-PLAN.md
</prior_state>
<checkpoint_response>
**Type:** {checkpoint_type}
**Response:** {user_response}
</checkpoint_response>
<mode>
Continue: {standard | gap_closure}
</mode>
```
---
**Note:** Planning methodology, task breakdown, dependency analysis, wave assignment, TDD detection, and goal-backward derivation are baked into the gsd-planner agent. This template only passes context.

# PROJECT.md Template
Template for `.planning/PROJECT.md` - the living project context document.
<template>
```markdown
# [Project Name]
## What This Is
[Current accurate description - 2-3 sentences. What does this product do and who is it for?
Use the user's language and framing. Update whenever reality drifts from this description.]
## Core Value
[The ONE thing that matters most. If everything else fails, this must work.
One sentence that drives prioritization when tradeoffs arise.]
## Requirements
### Validated
<!-- Shipped and confirmed valuable. -->
(None yet - ship to validate)
### Active
<!-- Current scope. Building toward these. -->
- [ ] [Requirement 1]
- [ ] [Requirement 2]
- [ ] [Requirement 3]
### Out of Scope
<!-- Explicit boundaries. Includes reasoning to prevent re-adding. -->
- [Exclusion 1] - [why]
- [Exclusion 2] - [why]
## Context
[Background information that informs implementation:
- Technical environment or ecosystem
- Relevant prior work or experience
- User research or feedback themes
- Known issues to address]
## Constraints
- **[Type]**: [What] - [Why]
- **[Type]**: [What] - [Why]
Common types: Tech stack, Timeline, Budget, Dependencies, Compatibility, Performance, Security
## Key Decisions
<!-- Decisions that constrain future work. Add throughout project lifecycle. -->
| Decision | Rationale | Outcome |
| -------- | --------- | -------------------------------- |
| [Choice] | [Why] | [✓ Good / ⚠️ Revisit / - Pending] |
---
*Last updated: [date] after [trigger]*
```
</template>
<guidelines>
**What This Is:**
- Current accurate description of the product
- 2-3 sentences capturing what it does and who it's for
- Use the user's words and framing
- Update when the product evolves beyond this description
**Core Value:**
- The single most important thing
- Everything else can fail; this cannot
- Drives prioritization when tradeoffs arise
- Rarely changes; if it does, it's a significant pivot
**Requirements - Validated:**
- Requirements that shipped and proved valuable
- Format: `- ✓ [Requirement] - [version/phase]`
- These are locked - changing them requires explicit discussion
**Requirements - Active:**
- Current scope being built toward
- These are hypotheses until shipped and validated
- Move to Validated when shipped, Out of Scope if invalidated
**Requirements - Out of Scope:**
- Explicit boundaries on what we're not building
- Always include reasoning (prevents re-adding later)
- Includes: considered and rejected, deferred to future, explicitly excluded
**Context:**
- Background that informs implementation decisions
- Technical environment, prior work, user feedback
- Known issues or technical debt to address
- Update as new context emerges
**Constraints:**
- Hard limits on implementation choices
- Tech stack, timeline, budget, compatibility, dependencies
- Include the "why" - constraints without rationale get questioned
**Key Decisions:**
- Significant choices that affect future work
- Add decisions as they're made throughout the project
- Track outcome when known:
- ✓ Good - decision proved correct
- ⚠️ Revisit - decision may need reconsideration
- - Pending - too early to evaluate
**Last Updated:**
- Always note when and why the document was updated
- Format: `after Phase 2` or `after v1.0 milestone`
- Triggers review of whether content is still accurate
</guidelines>
<evolution>
PROJECT.md evolves throughout the project lifecycle.
These rules are embedded in the generated PROJECT.md (## Evolution section)
and implemented by workflows/transition.md and workflows/complete-milestone.md.
**After each phase transition:**
1. Requirements invalidated? → Move to Out of Scope with reason
2. Requirements validated? → Move to Validated with phase reference
3. New requirements emerged? → Add to Active
4. Decisions to log? → Add to Key Decisions
5. "What This Is" still accurate? → Update if drifted
**After each milestone:**
1. Full review of all sections
2. Core Value check - still the right priority?
3. Audit Out of Scope - reasons still valid?
4. Update Context with current state (users, feedback, metrics)
</evolution>
<brownfield>
For existing codebases:
1. **Map codebase first** via `/gsd-map-codebase`
2. **Infer Validated requirements** from existing code:
- What does the codebase actually do?
- What patterns are established?
- What's clearly working and relied upon?
3. **Gather Active requirements** from user:
- Present inferred current state
- Ask what they want to build next
4. **Initialize:**
- Validated = inferred from existing code
- Active = user's goals for this work
- Out of Scope = boundaries user specifies
- Context = includes current codebase state
</brownfield>
<state_reference>
STATE.md references PROJECT.md:
```markdown
## Project Reference
See: .planning/PROJECT.md (updated [date])
**Core value:** [One-liner from Core Value section]
**Current focus:** [Current phase name]
```
This ensures the agent reads current PROJECT.md context.
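For illustration, pulling the Core Value one-liner out of PROJECT.md for this block could be as simple as the sketch below; the parsing assumes the exact headings from the template above:

```python
def core_value(project_md: str) -> str:
    """Return the first content line under '## Core Value', or '' if absent."""
    _, _, rest = project_md.partition("## Core Value")
    for line in rest.splitlines():
        line = line.strip()
        if line.startswith("#"):  # reached the next section without content
            return ""
        if line and not line.startswith("<!--"):
            return line
    return ""
```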
</state_reference>

# Requirements Template
Template for `.planning/REQUIREMENTS.md` - checkable requirements that define "done."
<template>
```markdown
# Requirements: [Project Name]
**Defined:** [date]
**Core Value:** [from PROJECT.md]
## v1 Requirements
Requirements for initial release. Each maps to roadmap phases.
### Authentication
- [ ] **AUTH-01**: User can sign up with email and password
- [ ] **AUTH-02**: User receives email verification after signup
- [ ] **AUTH-03**: User can reset password via email link
- [ ] **AUTH-04**: User session persists across browser refresh
### [Category 2]
- [ ] **[CAT]-01**: [Requirement description]
- [ ] **[CAT]-02**: [Requirement description]
- [ ] **[CAT]-03**: [Requirement description]
### [Category 3]
- [ ] **[CAT]-01**: [Requirement description]
- [ ] **[CAT]-02**: [Requirement description]
## v2 Requirements
Deferred to future release. Tracked but not in current roadmap.
### [Category]
- **[CAT]-01**: [Requirement description]
- **[CAT]-02**: [Requirement description]
## Out of Scope
Explicitly excluded. Documented to prevent scope creep.
| Feature | Reason |
| --------- | -------------- |
| [Feature] | [Why excluded] |
| [Feature] | [Why excluded] |
## Traceability
Which phases cover which requirements. Updated during roadmap creation.
| Requirement | Phase | Status |
| ----------- | --------- | ------- |
| AUTH-01 | Phase 1 | Pending |
| AUTH-02 | Phase 1 | Pending |
| AUTH-03 | Phase 1 | Pending |
| AUTH-04 | Phase 1 | Pending |
| [REQ-ID] | Phase [N] | Pending |
**Coverage:**
- v1 requirements: [X] total
- Mapped to phases: [Y]
- Unmapped: [Z] ⚠️
---
*Requirements defined: [date]*
*Last updated: [date] after [trigger]*
```
</template>
<guidelines>
**Requirement Format:**
- ID: `[CATEGORY]-[NUMBER]` (AUTH-01, CONTENT-02, SOCIAL-03)
- Description: User-centric, testable, atomic
- Checkbox: Only for v1 requirements (v2 are not yet actionable)
**Categories:**
- Derive from research FEATURES.md categories
- Keep consistent with domain conventions
- Typical: Authentication, Content, Social, Notifications, Moderation, Payments, Admin
**v1 vs v2:**
- v1: Committed scope, will be in roadmap phases
- v2: Acknowledged but deferred, not in current roadmap
- Moving v2 → v1 requires roadmap update
**Out of Scope:**
- Explicit exclusions with reasoning
- Prevents "why didn't you include X?" later
- Anti-features from research belong here with warnings
**Traceability:**
- Empty initially, populated during roadmap creation
- Each requirement maps to exactly one phase
- Unmapped requirements = roadmap gap
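That gap check can be mechanical. A rough sketch, assuming the exact checkbox and table formats from this template:

```python
import re

def unmapped_v1(requirements_md: str) -> set[str]:
    """Return v1 requirement IDs missing from the traceability table."""
    v1_section = requirements_md.split("## v2 Requirements")[0]
    v1_ids = set(re.findall(r"- \[ \] \*\*([A-Z]+-\d+)\*\*", v1_section))
    trace = requirements_md.split("## Traceability")[-1]
    mapped = set(re.findall(r"\|\s*([A-Z]+-\d+)\s*\|\s*Phase", trace))
    return v1_ids - mapped
```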
**Status Values:**
- Pending: Not started
- In Progress: Phase is active
- Complete: Requirement verified
- Blocked: Waiting on external factor
</guidelines>
<evolution>
**After each phase completes:**
1. Mark covered requirements as Complete
2. Update traceability status
3. Note any requirements that changed scope
**After roadmap updates:**
1. Verify all v1 requirements still mapped
2. Add new requirements if scope expanded
3. Move requirements to v2/out of scope if descoped
**Requirement completion criteria:**
- Requirement is "Complete" when:
- Feature is implemented
- Feature is verified (tests pass, manual check done)
- Feature is committed
</evolution>
<example>
```markdown
# Requirements: CommunityApp
**Defined:** 2025-01-14
**Core Value:** Users can share and discuss content with people who share their interests
## v1 Requirements
### Authentication
- [ ] **AUTH-01**: User can sign up with email and password
- [ ] **AUTH-02**: User receives email verification after signup
- [ ] **AUTH-03**: User can reset password via email link
- [ ] **AUTH-04**: User session persists across browser refresh
### Profiles
- [ ] **PROF-01**: User can create profile with display name
- [ ] **PROF-02**: User can upload avatar image
- [ ] **PROF-03**: User can write bio (max 500 chars)
- [ ] **PROF-04**: User can view other users' profiles
### Content
- [ ] **CONT-01**: User can create text post
- [ ] **CONT-02**: User can upload image with post
- [ ] **CONT-03**: User can edit own posts
- [ ] **CONT-04**: User can delete own posts
- [ ] **CONT-05**: User can view feed of posts
### Social
- [ ] **SOCL-01**: User can follow other users
- [ ] **SOCL-02**: User can unfollow users
- [ ] **SOCL-03**: User can like posts
- [ ] **SOCL-04**: User can comment on posts
- [ ] **SOCL-05**: User can view activity feed (followed users' posts)
## v2 Requirements
### Notifications
- **NOTF-01**: User receives in-app notifications
- **NOTF-02**: User receives email for new followers
- **NOTF-03**: User receives email for comments on own posts
- **NOTF-04**: User can configure notification preferences
### Moderation
- **MODR-01**: User can report content
- **MODR-02**: User can block other users
- **MODR-03**: Admin can view reported content
- **MODR-04**: Admin can remove content
- **MODR-05**: Admin can ban users
## Out of Scope
| Feature | Reason |
| -------------- | -------------------------------------------- |
| Real-time chat | High complexity, not core to community value |
| Video posts | Storage/bandwidth costs, defer to v2+ |
| OAuth login | Email/password sufficient for v1 |
| Mobile app | Web-first, mobile later |
## Traceability
| Requirement | Phase | Status |
| ----------- | ------- | ------- |
| AUTH-01 | Phase 1 | Pending |
| AUTH-02 | Phase 1 | Pending |
| AUTH-03 | Phase 1 | Pending |
| AUTH-04 | Phase 1 | Pending |
| PROF-01 | Phase 2 | Pending |
| PROF-02 | Phase 2 | Pending |
| PROF-03 | Phase 2 | Pending |
| PROF-04 | Phase 2 | Pending |
| CONT-01 | Phase 3 | Pending |
| CONT-02 | Phase 3 | Pending |
| CONT-03 | Phase 3 | Pending |
| CONT-04 | Phase 3 | Pending |
| CONT-05 | Phase 3 | Pending |
| SOCL-01 | Phase 4 | Pending |
| SOCL-02 | Phase 4 | Pending |
| SOCL-03 | Phase 4 | Pending |
| SOCL-04 | Phase 4 | Pending |
| SOCL-05 | Phase 4 | Pending |
**Coverage:**
- v1 requirements: 18 total
- Mapped to phases: 18
- Unmapped: 0 ✓
---
*Requirements defined: 2025-01-14*
*Last updated: 2025-01-14 after initial definition*
```
</example>

# Architecture Research Template
Template for `.planning/research/ARCHITECTURE.md` - system structure patterns for the project domain.
<template>
```markdown
# Architecture Research
**Domain:** [domain type]
**Researched:** [date]
**Confidence:** [HIGH/MEDIUM/LOW]
## Standard Architecture
### System Overview
```
┌─────────────────────────────────────────────────────────────┐
│ [Layer Name] │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ [Comp] │ │ [Comp] │ │ [Comp] │ │ [Comp] │ │
│ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │
│ │ │ │ │ │
├───────┴────────────┴────────────┴────────────┴──────────────┤
│ [Layer Name] │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────────────────────────────────────────────┐ │
│ │ [Component] │ │
│ └─────────────────────────────────────────────────────┘ │
├─────────────────────────────────────────────────────────────┤
│ [Layer Name] │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ [Store] │ │ [Store] │ │ [Store] │ │
│ └──────────┘ └──────────┘ └──────────┘ │
└─────────────────────────────────────────────────────────────┘
```
### Component Responsibilities
| Component | Responsibility | Typical Implementation |
| --------- | -------------- | ------------------------ |
| [name] | [what it owns] | [how it's usually built] |
| [name] | [what it owns] | [how it's usually built] |
| [name] | [what it owns] | [how it's usually built] |
## Recommended Project Structure
```
src/
├── [folder]/ # [purpose]
│ ├── [subfolder]/ # [purpose]
│ └── [file].ts # [purpose]
├── [folder]/ # [purpose]
│ ├── [subfolder]/ # [purpose]
│ └── [file].ts # [purpose]
├── [folder]/ # [purpose]
└── [folder]/ # [purpose]
```
### Structure Rationale
- **[folder]/:** [why organized this way]
- **[folder]/:** [why organized this way]
## Architectural Patterns
### Pattern 1: [Pattern Name]
**What:** [description]
**When to use:** [conditions]
**Trade-offs:** [pros and cons]
**Example:**
```typescript
// [Brief code example showing the pattern]
```
### Pattern 2: [Pattern Name]
**What:** [description]
**When to use:** [conditions]
**Trade-offs:** [pros and cons]
**Example:**
```typescript
// [Brief code example showing the pattern]
```
### Pattern 3: [Pattern Name]
**What:** [description]
**When to use:** [conditions]
**Trade-offs:** [pros and cons]
## Data Flow
### Request Flow
```
[User Action]
[Component] → [Handler] → [Service] → [Data Store]
↓ ↓ ↓ ↓
[Response] ← [Transform] ← [Query] ← [Database]
```
### State Management
```
[State Store]
↓ (subscribe)
[Components] ←→ [Actions] → [Reducers/Mutations] → [State Store]
```
### Key Data Flows
1. **[Flow name]:** [description of how data moves]
2. **[Flow name]:** [description of how data moves]
## Scaling Considerations
| Scale | Architecture Adjustments |
| ------------- | --------------------------------------- |
| 0-1k users | [approach - usually monolith is fine] |
| 1k-100k users | [approach - what to optimize first] |
| 100k+ users | [approach - when to consider splitting] |
### Scaling Priorities
1. **First bottleneck:** [what breaks first, how to fix]
2. **Second bottleneck:** [what breaks next, how to fix]
## Anti-Patterns
### Anti-Pattern 1: [Name]
**What people do:** [the mistake]
**Why it's wrong:** [the problem it causes]
**Do this instead:** [the correct approach]
### Anti-Pattern 2: [Name]
**What people do:** [the mistake]
**Why it's wrong:** [the problem it causes]
**Do this instead:** [the correct approach]
## Integration Points
### External Services
| Service | Integration Pattern | Notes |
| --------- | ------------------- | --------- |
| [service] | [how to connect] | [gotchas] |
| [service] | [how to connect] | [gotchas] |
### Internal Boundaries
| Boundary | Communication | Notes |
| --------------------- | ------------------- | ---------------- |
| [module A ↔ module B] | [API/events/direct] | [considerations] |
## Sources
- [Architecture references]
- [Official documentation]
- [Case studies]
---
*Architecture research for: [domain]*
*Researched: [date]*
```
</template>
<guidelines>
**System Overview:**
- Use ASCII box-drawing diagrams for clarity (characters such as ├──, └──, │, and ─ are for structure visualization only)
- Show major components and their relationships
- Don't over-detail - this is conceptual, not implementation
**Project Structure:**
- Be specific about folder organization
- Explain the rationale for grouping
- Match conventions of the chosen stack
**Patterns:**
- Include code examples where helpful
- Explain trade-offs honestly
- Note when patterns are overkill for small projects
**Scaling Considerations:**
- Be realistic - most projects don't need to scale to millions
- Focus on "what breaks first" not theoretical limits
- Avoid premature optimization recommendations
**Anti-Patterns:**
- Specific to this domain
- Include what to do instead
- Helps prevent common mistakes during implementation
</guidelines>

# Features Research Template
Template for `.planning/research/FEATURES.md` - feature landscape for the project domain.
<template>
```markdown
# Feature Research
**Domain:** [domain type]
**Researched:** [date]
**Confidence:** [HIGH/MEDIUM/LOW]
## Feature Landscape
### Table Stakes (Users Expect These)
Features users assume exist. Missing these = product feels incomplete.
| Feature | Why Expected | Complexity | Notes |
| --------- | ------------------ | --------------- | ---------------------- |
| [feature] | [user expectation] | LOW/MEDIUM/HIGH | [implementation notes] |
| [feature] | [user expectation] | LOW/MEDIUM/HIGH | [implementation notes] |
| [feature] | [user expectation] | LOW/MEDIUM/HIGH | [implementation notes] |
### Differentiators (Competitive Advantage)
Features that set the product apart. Not required, but valuable.
| Feature | Value Proposition | Complexity | Notes |
| --------- | ----------------- | --------------- | ---------------------- |
| [feature] | [why it matters] | LOW/MEDIUM/HIGH | [implementation notes] |
| [feature] | [why it matters] | LOW/MEDIUM/HIGH | [implementation notes] |
| [feature] | [why it matters] | LOW/MEDIUM/HIGH | [implementation notes] |
### Anti-Features (Commonly Requested, Often Problematic)
Features that seem good but create problems.
| Feature | Why Requested | Why Problematic | Alternative |
| --------- | ---------------- | ----------------- | ----------------- |
| [feature] | [surface appeal] | [actual problems] | [better approach] |
| [feature] | [surface appeal] | [actual problems] | [better approach] |
## Feature Dependencies
```
[Feature A]
└──requires──> [Feature B]
└──requires──> [Feature C]
[Feature D] ──enhances──> [Feature A]
[Feature E] ──conflicts──> [Feature F]
```
### Dependency Notes
- **[Feature A] requires [Feature B]:** [why the dependency exists]
- **[Feature D] enhances [Feature A]:** [how they work together]
- **[Feature E] conflicts with [Feature F]:** [why they're incompatible]
## MVP Definition
### Launch With (v1)
Minimum viable product - what's needed to validate the concept.
- [ ] [Feature] - [why essential]
- [ ] [Feature] - [why essential]
- [ ] [Feature] - [why essential]
### Add After Validation (v1.x)
Features to add once core is working.
- [ ] [Feature] - [trigger for adding]
- [ ] [Feature] - [trigger for adding]
### Future Consideration (v2+)
Features to defer until product-market fit is established.
- [ ] [Feature] - [why defer]
- [ ] [Feature] - [why defer]
## Feature Prioritization Matrix
| Feature | User Value | Implementation Cost | Priority |
| --------- | --------------- | ------------------- | -------- |
| [feature] | HIGH/MEDIUM/LOW | HIGH/MEDIUM/LOW | P1/P2/P3 |
| [feature] | HIGH/MEDIUM/LOW | HIGH/MEDIUM/LOW | P1/P2/P3 |
| [feature] | HIGH/MEDIUM/LOW | HIGH/MEDIUM/LOW | P1/P2/P3 |
**Priority key:**
- P1: Must have for launch
- P2: Should have, add when possible
- P3: Nice to have, future consideration
## Competitor Feature Analysis
| Feature | Competitor A | Competitor B | Our Approach |
| --------- | ---------------- | ---------------- | ------------ |
| [feature] | [how they do it] | [how they do it] | [our plan] |
| [feature] | [how they do it] | [how they do it] | [our plan] |
## Sources
- [Competitor products analyzed]
- [User research or feedback sources]
- [Industry standards referenced]
---
*Feature research for: [domain]*
*Researched: [date]*
```
</template>
<guidelines>
**Table Stakes:**
- These are non-negotiable for launch
- Users don't give credit for having them, but they penalize the product for missing them
- Example: A community platform without user profiles is broken
**Differentiators:**
- These are where you compete
- Should align with the Core Value from PROJECT.md
- Don't try to differentiate on everything
**Anti-Features:**
- Prevent scope creep by documenting what seems good but isn't
- Include the alternative approach
- Example: "Real-time everything" often creates complexity without value
**Feature Dependencies:**
- Critical for roadmap phase ordering
- If A requires B, B must be in an earlier phase
- Conflicts inform what NOT to combine in same phase
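The "requires" rule above is a topological ordering over the dependency graph. A minimal sketch with illustrative feature names (conflicts are not modeled; a dependency cycle raises `CycleError`):

```python
from graphlib import TopologicalSorter

def phase_order(requires: dict[str, set[str]]) -> list[str]:
    """Order features so every dependency lands before its dependents."""
    return list(TopologicalSorter(requires).static_order())

order = phase_order({
    "activity feed": {"follows", "posts"},
    "follows": {"profiles"},
    "posts": {"profiles"},
    "profiles": set(),
})
```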
**MVP Definition:**
- Be ruthless about what's truly minimum
- "Nice to have" is not MVP
- Launch with less, validate, then expand
</guidelines>

# Pitfalls Research Template
Template for `.planning/research/PITFALLS.md` - common mistakes to avoid in the project domain.
<template>
```markdown
# Pitfalls Research
**Domain:** [domain type]
**Researched:** [date]
**Confidence:** [HIGH/MEDIUM/LOW]
## Critical Pitfalls
### Pitfall 1: [Name]
**What goes wrong:**
[Description of the failure mode]
**Why it happens:**
[Root cause - why developers make this mistake]
**How to avoid:**
[Specific prevention strategy]
**Warning signs:**
[How to detect this early before it becomes a problem]
**Phase to address:**
[Which roadmap phase should prevent this]
---
### Pitfall 2: [Name]
**What goes wrong:**
[Description of the failure mode]
**Why it happens:**
[Root cause - why developers make this mistake]
**How to avoid:**
[Specific prevention strategy]
**Warning signs:**
[How to detect this early before it becomes a problem]
**Phase to address:**
[Which roadmap phase should prevent this]
---
### Pitfall 3: [Name]
**What goes wrong:**
[Description of the failure mode]
**Why it happens:**
[Root cause - why developers make this mistake]
**How to avoid:**
[Specific prevention strategy]
**Warning signs:**
[How to detect this early before it becomes a problem]
**Phase to address:**
[Which roadmap phase should prevent this]
---
[Continue for all critical pitfalls...]
## Technical Debt Patterns
Shortcuts that seem reasonable but create long-term problems.
| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable |
| ---------- | ----------------- | -------------- | ------------------------ |
| [shortcut] | [benefit] | [cost] | [conditions, or "never"] |
| [shortcut] | [benefit] | [cost] | [conditions, or "never"] |
| [shortcut] | [benefit] | [cost] | [conditions, or "never"] |
## Integration Gotchas
Common mistakes when connecting to external services.
| Integration | Common Mistake | Correct Approach |
| ----------- | ---------------------- | -------------------- |
| [service] | [what people do wrong] | [what to do instead] |
| [service] | [what people do wrong] | [what to do instead] |
| [service] | [what people do wrong] | [what to do instead] |
## Performance Traps
Patterns that work at small scale but fail as usage grows.
| Trap | Symptoms | Prevention | When It Breaks |
| ------ | ---------------- | -------------- | ----------------- |
| [trap] | [how you notice] | [how to avoid] | [scale threshold] |
| [trap] | [how you notice] | [how to avoid] | [scale threshold] |
| [trap] | [how you notice] | [how to avoid] | [scale threshold] |
## Security Mistakes
Domain-specific security issues beyond general web security.
| Mistake | Risk | Prevention |
| --------- | ------------------- | -------------- |
| [mistake] | [what could happen] | [how to avoid] |
| [mistake] | [what could happen] | [how to avoid] |
| [mistake] | [what could happen] | [how to avoid] |
## UX Pitfalls
Common user experience mistakes in this domain.
| Pitfall | User Impact | Better Approach |
| --------- | ------------------ | -------------------- |
| [pitfall] | [how users suffer] | [what to do instead] |
| [pitfall] | [how users suffer] | [what to do instead] |
| [pitfall] | [how users suffer] | [what to do instead] |
## "Looks Done But Isn't" Checklist
Things that appear complete but are missing critical pieces.
- [ ] **[Feature]:** Often missing [thing] - verify [check]
- [ ] **[Feature]:** Often missing [thing] - verify [check]
- [ ] **[Feature]:** Often missing [thing] - verify [check]
- [ ] **[Feature]:** Often missing [thing] - verify [check]
## Recovery Strategies
When pitfalls occur despite prevention, how to recover.
| Pitfall | Recovery Cost | Recovery Steps |
| --------- | --------------- | -------------- |
| [pitfall] | LOW/MEDIUM/HIGH | [what to do] |
| [pitfall] | LOW/MEDIUM/HIGH | [what to do] |
| [pitfall] | LOW/MEDIUM/HIGH | [what to do] |
## Pitfall-to-Phase Mapping
How roadmap phases should address these pitfalls.
| Pitfall | Prevention Phase | Verification |
| --------- | ---------------- | --------------------------------- |
| [pitfall] | Phase [X] | [how to verify prevention worked] |
| [pitfall] | Phase [X] | [how to verify prevention worked] |
| [pitfall] | Phase [X] | [how to verify prevention worked] |
## Sources
- [Post-mortems referenced]
- [Community discussions]
- [Official "gotchas" documentation]
- [Personal experience / known issues]
---
*Pitfalls research for: [domain]*
*Researched: [date]*
```
</template>
<guidelines>
**Critical Pitfalls:**
- Focus on domain-specific issues, not generic mistakes
- Include warning signs - early detection prevents disasters
- Link to specific phases - makes pitfalls actionable
**Technical Debt:**
- Be realistic - some shortcuts are acceptable
- Note when shortcuts are "never acceptable" vs. "only in MVP"
- Include the long-term cost to inform tradeoff decisions
**Performance Traps:**
- Include scale thresholds ("breaks at 10k users")
- Focus on what's relevant for this project's expected scale
- Don't over-engineer for hypothetical scale
**Security Mistakes:**
- Beyond OWASP basics - domain-specific issues
- Example: Community platforms have different security concerns than e-commerce
- Include risk level to prioritize
**"Looks Done But Isn't":**
- Checklist format for verification during execution
- Common in demos vs. production
- Prevents "it works on my machine" issues
**Pitfall-to-Phase Mapping:**
- Critical for roadmap creation
- Each pitfall should map to a phase that prevents it
- Informs phase ordering and success criteria
</guidelines>

# Stack Research Template
Template for `.planning/research/STACK.md` - recommended technologies for the project domain.
<template>
```markdown
# Stack Research
**Domain:** [domain type]
**Researched:** [date]
**Confidence:** [HIGH/MEDIUM/LOW]
## Recommended Stack
### Core Technologies
| Technology | Version | Purpose | Why Recommended |
| ---------- | --------- | -------------- | ------------------------------------ |
| [name] | [version] | [what it does] | [why experts use it for this domain] |
| [name] | [version] | [what it does] | [why experts use it for this domain] |
| [name] | [version] | [what it does] | [why experts use it for this domain] |
### Supporting Libraries
| Library | Version | Purpose | When to Use |
| ------- | --------- | -------------- | ------------------- |
| [name] | [version] | [what it does] | [specific use case] |
| [name] | [version] | [what it does] | [specific use case] |
| [name] | [version] | [what it does] | [specific use case] |
### Development Tools
| Tool | Purpose | Notes |
| ------ | -------------- | -------------------- |
| [name] | [what it does] | [configuration tips] |
| [name] | [what it does] | [configuration tips] |
## Installation
```bash
# Core
npm install [packages]
# Supporting
npm install [packages]
# Dev dependencies
npm install -D [packages]
```
## Alternatives Considered
| Recommended | Alternative | When to Use Alternative |
| ------------ | -------------- | ---------------------------------------- |
| [our choice] | [other option] | [conditions where alternative is better] |
| [our choice] | [other option] | [conditions where alternative is better] |
## What NOT to Use
| Avoid | Why | Use Instead |
| ------------ | ------------------ | ------------------------- |
| [technology] | [specific problem] | [recommended alternative] |
| [technology] | [specific problem] | [recommended alternative] |
## Stack Patterns by Variant
**If [condition]:**
- Use [variation]
- Because [reason]
**If [condition]:**
- Use [variation]
- Because [reason]
## Version Compatibility
| Package A | Compatible With | Notes |
| ----------------- | ----------------- | --------------------- |
| [package@version] | [package@version] | [compatibility notes] |
## Sources
- [Context7 library ID] - [topics fetched]
- [Official docs URL] - [what was verified]
- [Other source] - [confidence level]
---
*Stack research for: [domain]*
*Researched: [date]*
```
</template>
<guidelines>
**Core Technologies:**
- Include specific version numbers
- Explain why this is the standard choice, not just what it does
- Focus on technologies that affect architecture decisions
**Supporting Libraries:**
- Include libraries commonly needed for this domain
- Note when each is needed (not all projects need all libraries)
**Alternatives:**
- Don't just dismiss alternatives
- Explain when alternatives make sense
- Helps user make informed decisions if they disagree
**What NOT to Use:**
- Actively warn against outdated or problematic choices
- Explain the specific problem, not just "it's old"
- Provide the recommended alternative
**Version Compatibility:**
- Note any known compatibility issues
- Critical for avoiding debugging time later
</guidelines>


@@ -0,0 +1,170 @@
# Research Summary Template
Template for `.planning/research/SUMMARY.md` - executive summary of project research with roadmap implications.
<template>
```markdown
# Project Research Summary
**Project:** [name from PROJECT.md]
**Domain:** [inferred domain type]
**Researched:** [date]
**Confidence:** [HIGH/MEDIUM/LOW]
## Executive Summary
[2-3 paragraph overview of research findings]
- What type of product this is and how experts build it
- The recommended approach based on research
- Key risks and how to mitigate them
## Key Findings
### Recommended Stack
[Summary from STACK.md - 1-2 paragraphs]
**Core technologies:**
- [Technology]: [purpose] - [why recommended]
- [Technology]: [purpose] - [why recommended]
- [Technology]: [purpose] - [why recommended]
### Expected Features
[Summary from FEATURES.md]
**Must have (table stakes):**
- [Feature] - users expect this
- [Feature] - users expect this
**Should have (competitive):**
- [Feature] - differentiator
- [Feature] - differentiator
**Defer (v2+):**
- [Feature] - not essential for launch
### Architecture Approach
[Summary from ARCHITECTURE.md - 1 paragraph]
**Major components:**
1. [Component] - [responsibility]
2. [Component] - [responsibility]
3. [Component] - [responsibility]
### Critical Pitfalls
[Top 3-5 from PITFALLS.md]
1. **[Pitfall]** - [how to avoid]
2. **[Pitfall]** - [how to avoid]
3. **[Pitfall]** - [how to avoid]
## Implications for Roadmap
Based on research, suggested phase structure:
### Phase 1: [Name]
**Rationale:** [why this comes first based on research]
**Delivers:** [what this phase produces]
**Addresses:** [features from FEATURES.md]
**Avoids:** [pitfall from PITFALLS.md]
### Phase 2: [Name]
**Rationale:** [why this order]
**Delivers:** [what this phase produces]
**Uses:** [stack elements from STACK.md]
**Implements:** [architecture component]
### Phase 3: [Name]
**Rationale:** [why this order]
**Delivers:** [what this phase produces]
[Continue for suggested phases...]
### Phase Ordering Rationale
- [Why this order based on dependencies discovered]
- [Why this grouping based on architecture patterns]
- [How this avoids pitfalls from research]
### Research Flags
Phases likely needing deeper research during planning:
- **Phase [X]:** [reason - e.g., "complex integration, needs API research"]
- **Phase [Y]:** [reason - e.g., "niche domain, sparse documentation"]
Phases with standard patterns (skip research-phase):
- **Phase [X]:** [reason - e.g., "well-documented, established patterns"]
## Confidence Assessment
| Area | Confidence | Notes |
| ------------ | ----------------- | -------- |
| Stack | [HIGH/MEDIUM/LOW] | [reason] |
| Features | [HIGH/MEDIUM/LOW] | [reason] |
| Architecture | [HIGH/MEDIUM/LOW] | [reason] |
| Pitfalls | [HIGH/MEDIUM/LOW] | [reason] |
**Overall confidence:** [HIGH/MEDIUM/LOW]
### Gaps to Address
[Any areas where research was inconclusive or needs validation during implementation]
- [Gap]: [how to handle during planning/execution]
- [Gap]: [how to handle during planning/execution]
## Sources
### Primary (HIGH confidence)
- [Context7 library ID] - [topics]
- [Official docs URL] - [what was checked]
### Secondary (MEDIUM confidence)
- [Source] - [finding]
### Tertiary (LOW confidence)
- [Source] - [finding, needs validation]
---
*Research completed: [date]*
*Ready for roadmap: yes*
```
</template>
<guidelines>
**Executive Summary:**
- Write for someone who will only read this section
- Include the key recommendation and main risk
- 2-3 paragraphs maximum
**Key Findings:**
- Summarize, don't duplicate full documents
- Link to detailed docs (STACK.md, FEATURES.md, etc.)
- Focus on what matters for roadmap decisions
**Implications for Roadmap:**
- This is the most important section
- Directly informs roadmap creation
- Be explicit about phase suggestions and rationale
- Include research flags for each suggested phase
**Confidence Assessment:**
- Be honest about uncertainty
- Note gaps that need resolution during planning
- HIGH = verified with official sources
- MEDIUM = community consensus, multiple sources agree
- LOW = single source or inference
**Integration with roadmap creation:**
- This file is loaded as context during roadmap creation
- Phase suggestions here become starting point for roadmap
- Research flags inform phase planning
</guidelines>


@@ -0,0 +1,552 @@
# Research Template
Template for `.planning/phases/XX-name/{phase_num}-RESEARCH.md` - comprehensive ecosystem research before planning.
**Purpose:** Document what the agent needs to know to implement a phase well - not just "which library" but "how do experts build this."
---
## File Template
```markdown
# Phase [X]: [Name] - Research
**Researched:** [date]
**Domain:** [primary technology/problem domain]
**Confidence:** [HIGH/MEDIUM/LOW]
<user_constraints>
## User Constraints (from CONTEXT.md)
**CRITICAL:** If CONTEXT.md exists from /gsd-discuss-phase, copy locked decisions here verbatim. These MUST be honored by the planner.
### Locked Decisions
[Copy from CONTEXT.md `## Decisions` section - these are NON-NEGOTIABLE]
- [Decision 1]
- [Decision 2]
### Agent's Discretion
[Copy from CONTEXT.md - areas where researcher/planner can choose]
- [Area 1]
- [Area 2]
### Deferred Ideas (OUT OF SCOPE)
[Copy from CONTEXT.md - do NOT research or plan these]
- [Deferred 1]
- [Deferred 2]
**If no CONTEXT.md exists:** Write "No user constraints - all decisions at the agent's discretion"
</user_constraints>
<research_summary>
## Summary
[2-3 paragraph executive summary]
- What was researched
- What the standard approach is
- Key recommendations
**Primary recommendation:** [one-liner actionable guidance]
</research_summary>
<standard_stack>
## Standard Stack
The established libraries/tools for this domain:
### Core
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| [name] | [ver] | [what it does] | [why experts use it] |
| [name] | [ver] | [what it does] | [why experts use it] |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| [name] | [ver] | [what it does] | [use case] |
| [name] | [ver] | [what it does] | [use case] |
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| [standard] | [alternative] | [when alternative makes sense] |
**Installation:**
```bash
npm install [packages]
# or
yarn add [packages]
```
</standard_stack>
<architecture_patterns>
## Architecture Patterns
### Recommended Project Structure
```
src/
├── [folder]/ # [purpose]
├── [folder]/ # [purpose]
└── [folder]/ # [purpose]
```
### Pattern 1: [Pattern Name]
**What:** [description]
**When to use:** [conditions]
**Example:**
```typescript
// [code example from Context7/official docs]
```
### Pattern 2: [Pattern Name]
**What:** [description]
**When to use:** [conditions]
**Example:**
```typescript
// [code example]
```
### Anti-Patterns to Avoid
- **[Anti-pattern]:** [why it's bad, what to do instead]
- **[Anti-pattern]:** [why it's bad, what to do instead]
</architecture_patterns>
<dont_hand_roll>
## Don't Hand-Roll
Problems that look simple but have existing solutions:
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| [problem] | [what you'd build] | [library] | [edge cases, complexity] |
| [problem] | [what you'd build] | [library] | [edge cases, complexity] |
| [problem] | [what you'd build] | [library] | [edge cases, complexity] |
**Key insight:** [why custom solutions are worse in this domain]
</dont_hand_roll>
<common_pitfalls>
## Common Pitfalls
### Pitfall 1: [Name]
**What goes wrong:** [description]
**Why it happens:** [root cause]
**How to avoid:** [prevention strategy]
**Warning signs:** [how to detect early]
### Pitfall 2: [Name]
**What goes wrong:** [description]
**Why it happens:** [root cause]
**How to avoid:** [prevention strategy]
**Warning signs:** [how to detect early]
### Pitfall 3: [Name]
**What goes wrong:** [description]
**Why it happens:** [root cause]
**How to avoid:** [prevention strategy]
**Warning signs:** [how to detect early]
</common_pitfalls>
<code_examples>
## Code Examples
Verified patterns from official sources:
### [Common Operation 1]
```typescript
// Source: [Context7/official docs URL]
[code]
```
### [Common Operation 2]
```typescript
// Source: [Context7/official docs URL]
[code]
```
### [Common Operation 3]
```typescript
// Source: [Context7/official docs URL]
[code]
```
</code_examples>
<sota_updates>
## State of the Art (2024-2025)
What's changed recently:
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| [old] | [new] | [date/version] | [what it means for implementation] |
**New tools/patterns to consider:**
- [Tool/Pattern]: [what it enables, when to use]
- [Tool/Pattern]: [what it enables, when to use]
**Deprecated/outdated:**
- [Thing]: [why it's outdated, what replaced it]
</sota_updates>
<open_questions>
## Open Questions
Things that couldn't be fully resolved:
1. **[Question]**
- What we know: [partial info]
- What's unclear: [the gap]
- Recommendation: [how to handle during planning/execution]
2. **[Question]**
- What we know: [partial info]
- What's unclear: [the gap]
- Recommendation: [how to handle]
</open_questions>
<sources>
## Sources
### Primary (HIGH confidence)
- [Context7 library ID] - [topics fetched]
- [Official docs URL] - [what was checked]
### Secondary (MEDIUM confidence)
- [WebSearch verified with official source] - [finding + verification]
### Tertiary (LOW confidence - needs validation)
- [WebSearch only] - [finding, marked for validation during implementation]
</sources>
<metadata>
## Metadata
**Research scope:**
- Core technology: [what]
- Ecosystem: [libraries explored]
- Patterns: [patterns researched]
- Pitfalls: [areas checked]
**Confidence breakdown:**
- Standard stack: [HIGH/MEDIUM/LOW] - [reason]
- Architecture: [HIGH/MEDIUM/LOW] - [reason]
- Pitfalls: [HIGH/MEDIUM/LOW] - [reason]
- Code examples: [HIGH/MEDIUM/LOW] - [reason]
**Research date:** [date]
**Valid until:** [estimate - 30 days for stable tech, 7 days for fast-moving]
</metadata>
---
*Phase: XX-name*
*Research completed: [date]*
*Ready for planning: [yes/no]*
```
---
## Good Example
```markdown
# Phase 3: 3D City Driving - Research
**Researched:** 2025-01-20
**Domain:** Three.js 3D web game with driving mechanics
**Confidence:** HIGH
<research_summary>
## Summary
Researched the Three.js ecosystem for building a 3D city driving game. The standard approach uses Three.js with React Three Fiber for component architecture, Rapier for physics, and drei for common helpers.
Key finding: Don't hand-roll physics or collision detection. Rapier (via @react-three/rapier) handles vehicle physics, terrain collision, and city object interactions efficiently. Custom physics code leads to bugs and performance issues.
**Primary recommendation:** Use R3F + Rapier + drei stack. Start with vehicle controller from drei, add Rapier vehicle physics, build city with instanced meshes for performance.
</research_summary>
<standard_stack>
## Standard Stack
### Core
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| three | 0.160.0 | 3D rendering | The standard for web 3D |
| @react-three/fiber | 8.15.0 | React renderer for Three.js | Declarative 3D, better DX |
| @react-three/drei | 9.92.0 | Helpers and abstractions | Solves common problems |
| @react-three/rapier | 1.2.1 | Physics engine bindings | Best physics for R3F |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| @react-three/postprocessing | 2.16.0 | Visual effects | Bloom, DOF, motion blur |
| leva | 0.9.35 | Debug UI | Tweaking parameters |
| zustand | 4.4.7 | State management | Game state, UI state |
| use-sound | 4.0.1 | Audio | Engine sounds, ambient |
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| Rapier | Cannon.js | Cannon simpler but less performant for vehicles |
| R3F | Vanilla Three | Vanilla if no React, but R3F DX is much better |
| drei | Custom helpers | drei is battle-tested, don't reinvent |
**Installation:**
```bash
npm install three @react-three/fiber @react-three/drei @react-three/rapier zustand
```
</standard_stack>
<architecture_patterns>
## Architecture Patterns
### Recommended Project Structure
```
src/
├── components/
│   ├── Vehicle/        # Player car with physics
│   ├── City/           # City generation and buildings
│   ├── Road/           # Road network
│   └── Environment/    # Sky, lighting, fog
├── hooks/
│   ├── useVehicleControls.ts
│   └── useGameState.ts
├── stores/
│   └── gameStore.ts    # Zustand state
└── utils/
    └── cityGenerator.ts  # Procedural generation helpers
```
### Pattern 1: Vehicle with Rapier Physics
**What:** Use RigidBody with vehicle-specific settings, not custom physics
**When to use:** Any ground vehicle
**Example:**
```typescript
// Source: @react-three/rapier docs
import { useRef } from 'react'
import { RigidBody } from '@react-three/rapier'

function Vehicle() {
  const rigidBody = useRef(null)
  return (
    <RigidBody
      ref={rigidBody}
      type="dynamic"
      colliders="hull"
      mass={1500}
      linearDamping={0.5}
      angularDamping={0.5}
    >
      <mesh>
        <boxGeometry args={[2, 1, 4]} />
        <meshStandardMaterial />
      </mesh>
    </RigidBody>
  )
}
```
### Pattern 2: Instanced Meshes for City
**What:** Use InstancedMesh for repeated objects (buildings, trees, props)
**When to use:** >100 similar objects
**Example:**
```typescript
// Source: drei docs
import { Instances, Instance } from '@react-three/drei'

function Buildings({ positions }) {
  return (
    <Instances limit={1000}>
      <boxGeometry />
      <meshStandardMaterial />
      {positions.map((pos, i) => (
        // Derive heights outside render in real code - Math.random()
        // here re-rolls every building height on each re-render
        <Instance key={i} position={pos} scale={[1, Math.random() * 5 + 1, 1]} />
      ))}
    </Instances>
  )
}
```
### Anti-Patterns to Avoid
- **Creating meshes in render loop:** Create once, update transforms only
- **Not using InstancedMesh:** Individual meshes for buildings kills performance
- **Custom physics math:** Rapier handles it better, every time
</architecture_patterns>
<dont_hand_roll>
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Vehicle physics | Custom velocity/acceleration | Rapier RigidBody | Wheel friction, suspension, collisions are complex |
| Collision detection | Raycasting everything | Rapier colliders | Performance, edge cases, tunneling |
| Camera follow | Manual lerp | drei CameraControls or custom with useFrame | Smooth interpolation, bounds |
| City generation | Pure random placement | Grid-based with noise for variation | Random looks wrong, grid is predictable |
| LOD | Manual distance checks | drei <Detailed> | Handles transitions, hysteresis |
**Key insight:** 3D game development has 40+ years of solved problems. Rapier implements proper physics simulation. drei implements proper 3D helpers. Fighting these leads to bugs that look like "game feel" issues but are actually physics edge cases.
</dont_hand_roll>
<common_pitfalls>
## Common Pitfalls
### Pitfall 1: Physics Tunneling
**What goes wrong:** Fast objects pass through walls
**Why it happens:** Default physics step too large for velocity
**How to avoid:** Use CCD (Continuous Collision Detection) in Rapier
**Warning signs:** Objects randomly appearing outside buildings
### Pitfall 2: Performance Death by Draw Calls
**What goes wrong:** Game stutters with many buildings
**Why it happens:** Each mesh = 1 draw call, hundreds of buildings = hundreds of calls
**How to avoid:** InstancedMesh for similar objects, merge static geometry
**Warning signs:** GPU bound, low FPS despite simple scene
### Pitfall 3: Vehicle "Floaty" Feel
**What goes wrong:** Car doesn't feel grounded
**Why it happens:** Missing proper wheel/suspension simulation
**How to avoid:** Use Rapier vehicle controller or tune mass/damping carefully
**Warning signs:** Car bounces oddly, doesn't grip corners
</common_pitfalls>
<code_examples>
## Code Examples
### Basic R3F + Rapier Setup
```typescript
// Source: @react-three/rapier getting started
import { Canvas } from '@react-three/fiber'
import { Physics } from '@react-three/rapier'

function Game() {
  return (
    <Canvas>
      <Physics gravity={[0, -9.81, 0]}>
        <Vehicle />
        <City />
        <Ground />
      </Physics>
    </Canvas>
  )
}
```
### Vehicle Controls Hook
```typescript
// Source: Community pattern, verified with drei docs
import { useFrame } from '@react-three/fiber'
import { useKeyboardControls } from '@react-three/drei'

function useVehicleControls(rigidBodyRef) {
  const [, getKeys] = useKeyboardControls()
  useFrame(() => {
    const { forward, back, left, right } = getKeys()
    const body = rigidBodyRef.current
    if (!body) return

    const impulse = { x: 0, y: 0, z: 0 }
    if (forward) impulse.z -= 10
    if (back) impulse.z += 5
    body.applyImpulse(impulse, true)

    if (left) body.applyTorqueImpulse({ x: 0, y: 2, z: 0 }, true)
    if (right) body.applyTorqueImpulse({ x: 0, y: -2, z: 0 }, true)
  })
}
```
</code_examples>
<sota_updates>
## State of the Art (2024-2025)
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| cannon-es | Rapier | 2023 | Rapier is faster, better maintained |
| vanilla Three.js | React Three Fiber | 2020+ | R3F is now standard for React apps |
| Manual InstancedMesh | drei <Instances> | 2022 | Simpler API, handles updates |
**New tools/patterns to consider:**
- **WebGPU:** Coming but not production-ready for games yet (2025)
- **drei GLTF helpers:** `useGLTF.preload` for preloading models during loading screens
**Deprecated/outdated:**
- **cannon.js (original):** Use the cannon-es fork or, better, Rapier
- **Manual raycasting for physics:** Just use Rapier colliders
</sota_updates>
<sources>
## Sources
### Primary (HIGH confidence)
- /pmndrs/react-three-fiber - getting started, hooks, performance
- /pmndrs/drei - instances, controls, helpers
- /dimforge/rapier-js - physics setup, vehicle physics
### Secondary (MEDIUM confidence)
- Three.js discourse "city driving game" threads - verified patterns against docs
- R3F examples repository - verified code works
### Tertiary (LOW confidence - needs validation)
- None - all findings verified
</sources>
<metadata>
## Metadata
**Research scope:**
- Core technology: Three.js + React Three Fiber
- Ecosystem: Rapier, drei, zustand
- Patterns: Vehicle physics, instancing, city generation
- Pitfalls: Performance, physics, feel
**Confidence breakdown:**
- Standard stack: HIGH - verified with Context7, widely used
- Architecture: HIGH - from official examples
- Pitfalls: HIGH - documented in discourse, verified in docs
- Code examples: HIGH - from Context7/official sources
**Research date:** 2025-01-20
**Valid until:** 2025-02-20 (30 days - R3F ecosystem stable)
</metadata>
---
*Phase: 03-city-driving*
*Research completed: 2025-01-20*
*Ready for planning: yes*
```
---
## Guidelines
**When to create:**
- Before planning phases in niche/complex domains
- When the agent's training data is likely stale or sparse
- When "how do experts do this" matters more than "which library"
**Structure:**
- Use XML tags for section markers (matches GSD templates)
- Seven core sections: summary, standard_stack, architecture_patterns, dont_hand_roll, common_pitfalls, code_examples, sources
- All sections required (drives comprehensive research)
**Content quality:**
- Standard stack: Specific versions, not just names
- Architecture: Include actual code examples from authoritative sources
- Don't hand-roll: Be explicit about what problems to NOT solve yourself
- Pitfalls: Include warning signs, not just "don't do this"
- Sources: Mark confidence levels honestly
**Integration with planning:**
- RESEARCH.md loaded as @context reference in PLAN.md
- Standard stack informs library choices
- Don't hand-roll prevents custom solutions
- Pitfalls inform verification criteria
- Code examples can be referenced in task actions
**After creation:**
- File lives in phase directory: `.planning/phases/XX-name/{phase_num}-RESEARCH.md`
- Referenced during planning workflow
- plan-phase loads it automatically when present


@@ -0,0 +1,54 @@
# Project Retrospective
*A living document updated after each milestone. Lessons feed forward into future planning.*
## Milestone: v{version} - {name}
**Shipped:** {date}
**Phases:** {count} | **Plans:** {count} | **Sessions:** {count}
### What Was Built
- {Key deliverable 1}
- {Key deliverable 2}
- {Key deliverable 3}
### What Worked
- {Efficiency win or successful pattern}
- {What went smoothly}
### What Was Inefficient
- {Missed opportunity}
- {What took longer than expected}
### Patterns Established
- {New pattern or convention that should persist}
### Key Lessons
1. {Specific, actionable lesson}
2. {Another lesson}
### Cost Observations
- Model mix: {X}% opus, {Y}% sonnet, {Z}% haiku
- Sessions: {count}
- Notable: {efficiency observation}
---
## Cross-Milestone Trends
### Process Evolution
| Milestone | Sessions | Phases | Key Change |
| --------- | -------- | ------ | ------------------------- |
| v{X} | {N} | {M} | {What changed in process} |
### Cumulative Quality
| Milestone | Tests | Coverage | Zero-Dep Additions |
| --------- | ----- | -------- | ------------------ |
| v{X} | {N} | {Y}% | {count} |
### Top Lessons (Verified Across Milestones)
1. {Lesson verified by multiple milestones}
2. {Another cross-validated lesson}


@@ -0,0 +1,202 @@
# Roadmap Template
Template for `.planning/ROADMAP.md`.
## Initial Roadmap (v1.0 Greenfield)
```markdown
# Roadmap: [Project Name]
## Overview
[One paragraph describing the journey from start to finish]
## Phases
**Phase Numbering:**
- Integer phases (1, 2, 3): Planned milestone work
- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)
Decimal phases appear between their surrounding integers in numeric order.
- [ ] **Phase 1: [Name]** - [One-line description]
- [ ] **Phase 2: [Name]** - [One-line description]
- [ ] **Phase 3: [Name]** - [One-line description]
- [ ] **Phase 4: [Name]** - [One-line description]
## Phase Details
### Phase 1: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Nothing (first phase)
**Requirements**: [REQ-01, REQ-02, REQ-03] <!-- brackets optional, parser handles both formats -->
**Success Criteria** (what must be TRUE):
1. [Observable behavior from user perspective]
2. [Observable behavior from user perspective]
3. [Observable behavior from user perspective]
**Plans**: [Number of plans, e.g., "3 plans" or "TBD"]
Plans:
- [ ] 01-01: [Brief description of first plan]
- [ ] 01-02: [Brief description of second plan]
- [ ] 01-03: [Brief description of third plan]
### Phase 2: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 1
**Requirements**: [REQ-04, REQ-05]
**Success Criteria** (what must be TRUE):
1. [Observable behavior from user perspective]
2. [Observable behavior from user perspective]
**Plans**: [Number of plans]
Plans:
- [ ] 02-01: [Brief description]
- [ ] 02-02: [Brief description]
### Phase 2.1: Critical Fix (INSERTED)
**Goal**: [Urgent work inserted between phases]
**Depends on**: Phase 2
**Success Criteria** (what must be TRUE):
1. [What the fix achieves]
**Plans**: 1 plan
Plans:
- [ ] 02.1-01: [Description]
### Phase 3: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 2
**Requirements**: [REQ-06, REQ-07, REQ-08]
**Success Criteria** (what must be TRUE):
1. [Observable behavior from user perspective]
2. [Observable behavior from user perspective]
3. [Observable behavior from user perspective]
**Plans**: [Number of plans]
Plans:
- [ ] 03-01: [Brief description]
- [ ] 03-02: [Brief description]
### Phase 4: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 3
**Requirements**: [REQ-09, REQ-10]
**Success Criteria** (what must be TRUE):
1. [Observable behavior from user perspective]
2. [Observable behavior from user perspective]
**Plans**: [Number of plans]
Plans:
- [ ] 04-01: [Brief description]
## Progress
**Execution Order:**
Phases execute in numeric order: 2 → 2.1 → 2.2 → 3 → 3.1 → 4
| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. [Name] | 0/3 | Not started | - |
| 2. [Name] | 0/2 | Not started | - |
| 3. [Name] | 0/2 | Not started | - |
| 4. [Name] | 0/1 | Not started | - |
```
<guidelines>
**Initial planning (v1.0):**
- Phase count depends on granularity setting (coarse: 3-5, standard: 5-8, fine: 8-12)
- Each phase delivers something coherent
- Phases can have 1+ plans (split if >3 tasks or multiple subsystems)
- Plans use naming: {phase}-{plan}-PLAN.md (e.g., 01-02-PLAN.md)
- No time estimates (this isn't enterprise PM)
- Progress table updated by execute workflow
- Plan count can be "TBD" initially, refined during planning
**Success criteria:**
- 2-5 observable behaviors per phase (from user's perspective)
- Cross-checked against requirements during roadmap creation
- Flow downstream to `must_haves` in plan-phase
- Verified by verify-phase after execution
- Format: "User can [action]" or "[Thing] works/exists"
**After milestones ship:**
- Collapse completed milestones in `<details>` tags
- Add new milestone sections for upcoming work
- Keep continuous phase numbering (never restart at 01)
</guidelines>
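The numeric ordering rule for decimal insertions (2 → 2.1 → 2.2 → 3) amounts to a plain numeric sort. A minimal sketch — the function name and string representation are illustration-only assumptions, not part of the template:

```typescript
// Order phase labels numerically so inserted decimal phases
// (e.g. "2.1") land between their surrounding integer phases.
// Assumes at most nine insertions per phase (X.1 through X.9).
function orderPhases(labels: string[]): string[] {
  return [...labels].sort((a, b) => parseFloat(a) - parseFloat(b))
}

console.log(orderPhases(["3", "2.1", "4", "2", "2.2"]))
// → ["2", "2.1", "2.2", "3", "4"]
```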
<status_values>
- `Not started` - Haven't begun
- `In progress` - Currently working
- `Complete` - Done (add completion date)
- `Deferred` - Pushed to later (with reason)
</status_values>
## Milestone-Grouped Roadmap (After v1.0 Ships)
After completing first milestone, reorganize with milestone groupings:
```markdown
# Roadmap: [Project Name]
## Milestones
- ✅ **v1.0 MVP** - Phases 1-4 (shipped YYYY-MM-DD)
- 🚧 **v1.1 [Name]** - Phases 5-6 (in progress)
- 📋 **v2.0 [Name]** - Phases 7-10 (planned)
## Phases
<details>
<summary>✅ v1.0 MVP (Phases 1-4) - SHIPPED YYYY-MM-DD</summary>
### Phase 1: [Name]
**Goal**: [What this phase delivers]
**Plans**: 3 plans
Plans:
- [x] 01-01: [Brief description]
- [x] 01-02: [Brief description]
- [x] 01-03: [Brief description]
[... remaining v1.0 phases ...]
</details>
### 🚧 v1.1 [Name] (In Progress)
**Milestone Goal:** [What v1.1 delivers]
#### Phase 5: [Name]
**Goal**: [What this phase delivers]
**Depends on**: Phase 4
**Plans**: 2 plans
Plans:
- [ ] 05-01: [Brief description]
- [ ] 05-02: [Brief description]
[... remaining v1.1 phases ...]
### 📋 v2.0 [Name] (Planned)
**Milestone Goal:** [What v2.0 delivers]
[... v2.0 phases ...]
## Progress
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 1. Foundation | v1.0 | 3/3 | Complete | YYYY-MM-DD |
| 2. Features | v1.0 | 2/2 | Complete | YYYY-MM-DD |
| 5. Security | v1.1 | 0/2 | Not started | - |
```
**Notes:**
- Milestone emoji: ✅ shipped, 🚧 in progress, 📋 planned
- Completed milestones collapsed in `<details>` for readability
- Current/future milestones expanded
- Continuous phase numbering (01-99)
- Progress table includes milestone column
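The `{phase}-{plan}-PLAN.md` naming with zero-padded continuous numbering can be sketched as follows (the function name is a hypothetical, chosen for illustration):

```typescript
// Build a plan filename like 01-02-PLAN.md or 02.1-01-PLAN.md.
// Zero-pads the integer part of the phase and the plan number.
function planFile(phase: string, plan: number): string {
  const [whole, decimal] = phase.split(".")
  const padded = whole.padStart(2, "0") + (decimal ? `.${decimal}` : "")
  return `${padded}-${String(plan).padStart(2, "0")}-PLAN.md`
}

console.log(planFile("1", 2))   // → 01-02-PLAN.md
console.log(planFile("2.1", 1)) // → 02.1-01-PLAN.md
```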

.pi/gsd/templates/state.md Normal file

@@ -0,0 +1,176 @@
# State Template
Template for `.planning/STATE.md` - the project's living memory.
---
## File Template
```markdown
# Project State
## Project Reference
See: .planning/PROJECT.md (updated [date])
**Core value:** [One-liner from PROJECT.md Core Value section]
**Current focus:** [Current phase name]
## Current Position
Phase: [X] of [Y] ([Phase name])
Plan: [A] of [B] in current phase
Status: [Ready to plan / Planning / Ready to execute / In progress / Phase complete]
Last activity: [YYYY-MM-DD] - [What happened]
Progress: [░░░░░░░░░░] 0%
## Performance Metrics
**Velocity:**
- Total plans completed: [N]
- Average duration: [X] min
- Total execution time: [X.X] hours
**By Phase:**
| Phase | Plans | Total | Avg/Plan |
| ----- | ----- | ----- | -------- |
| - | - | - | - |
**Recent Trend:**
- Last 5 plans: [durations]
- Trend: [Improving / Stable / Degrading]
*Updated after each plan completion*
## Accumulated Context
### Decisions
Decisions are logged in PROJECT.md Key Decisions table.
Recent decisions affecting current work:
- [Phase X]: [Decision summary]
- [Phase Y]: [Decision summary]
### Pending Todos
[From .planning/todos/pending/ - ideas captured during sessions]
None yet.
### Blockers/Concerns
[Issues that affect future work]
None yet.
## Session Continuity
Last session: [YYYY-MM-DD HH:MM]
Stopped at: [Description of last completed action]
Resume file: [Path to .continue-here*.md if exists, otherwise "None"]
```
<purpose>
STATE.md is the project's short-term memory spanning all phases and sessions.
**Problem it solves:** Information is captured in summaries, issues, and decisions but not systematically consumed. Sessions start without context.
**Solution:** A single, small file that's:
- Read first in every workflow
- Updated after every significant action
- Contains digest of accumulated context
- Enables instant session restoration
</purpose>
<lifecycle>
**Creation:** After ROADMAP.md is created (during init)
- Reference PROJECT.md (read it for current context)
- Initialize empty accumulated context sections
- Set position to "Phase 1 ready to plan"
**Reading:** First step of every workflow
- progress: Present status to user
- plan: Inform planning decisions
- execute: Know current position
- transition: Know what's complete
**Writing:** After every significant action
- execute: After SUMMARY.md created
- Update position (phase, plan, status)
- Note new decisions (detail in PROJECT.md)
- Add blockers/concerns
- transition: After phase marked complete
- Update progress bar
- Clear resolved blockers
- Refresh Project Reference date
</lifecycle>
<sections>
### Project Reference
Points to PROJECT.md for full context. Includes:
- Core value (the ONE thing that matters)
- Current focus (which phase)
- Last update date (triggers re-read if stale)
The agent reads PROJECT.md directly for requirements, constraints, and decisions.
### Current Position
Where we are right now:
- Phase X of Y - which phase
- Plan A of B - which plan within phase
- Status - current state
- Last activity - what happened most recently
- Progress bar - visual indicator of overall completion
Progress calculation: (completed plans) / (total plans across all phases) × 100%
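That calculation, together with the ten-block bar shown in the template, can be sketched in one helper (a minimal illustration; the function name is an assumption):

```typescript
// Progress line for STATE.md: completed plans over total plans
// across all phases, rendered as a ten-block bar.
function progressBar(completed: number, total: number): string {
  const pct = total === 0 ? 0 : Math.round((completed / total) * 100)
  const filled = Math.round(pct / 10)
  return `[${"█".repeat(filled)}${"░".repeat(10 - filled)}] ${pct}%`
}

console.log(progressBar(0, 8)) // → [░░░░░░░░░░] 0%
console.log(progressBar(3, 8)) // → [████░░░░░░] 38%
```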
### Performance Metrics
Track velocity to understand execution patterns:
- Total plans completed
- Average duration per plan
- Per-phase breakdown
- Recent trend (improving/stable/degrading)
Updated after each plan completion.
### Accumulated Context
**Decisions:** Reference to PROJECT.md Key Decisions table, plus recent decisions summary for quick access. Full decision log lives in PROJECT.md.
**Pending Todos:** Ideas captured via /gsd-add-todo
- Count of pending todos
- Reference to .planning/todos/pending/
- Brief list if few, count if many (e.g., "5 pending todos - see /gsd-check-todos")
**Blockers/Concerns:** From "Next Phase Readiness" sections
- Issues that affect future work
- Prefix with originating phase
- Cleared when addressed
### Session Continuity
Enables instant resumption:
- When was last session
- What was last completed
- Is there a .continue-here file to resume from
</sections>
<size_constraint>
Keep STATE.md under 100 lines.
It's a DIGEST, not an archive. If accumulated context grows too large:
- Keep only 3-5 recent decisions in summary (full log in PROJECT.md)
- Keep only active blockers, remove resolved ones
The goal is "read once, know where we are" - if it's too long, that fails.
</size_constraint>

---
phase: XX-name
plan: YY
subsystem: [primary category]
tags: [searchable tech]
requires:
- phase: [prior phase]
provides: [what that phase built]
provides:
- [bullet list of what was built/delivered]
affects: [list of phase names or keywords]
tech-stack:
added: [libraries/tools]
patterns: [architectural/code patterns]
key-files:
created: [important files created]
modified: [important files modified]
key-decisions:
- "Decision 1"
patterns-established:
- "Pattern 1: description"
duration: Xmin
completed: YYYY-MM-DD
---
# Phase [X]: [Name] Summary (Complex)
**[Substantive one-liner describing outcome]**
## Performance
- **Duration:** [time]
- **Tasks:** [count completed]
- **Files modified:** [count]
## Accomplishments
- [Key outcome 1]
- [Key outcome 2]
## Task Commits
1. **Task 1: [task name]** - `hash`
2. **Task 2: [task name]** - `hash`
3. **Task 3: [task name]** - `hash`
## Files Created/Modified
- `path/to/file.ts` - What it does
- `path/to/another.ts` - What it does
## Decisions Made
[Key decisions with brief rationale]
## Deviations from Plan (Auto-fixed)
[Detailed auto-fix records per GSD deviation rules]
## Issues Encountered
[Problems during planned work and resolutions]
## Next Phase Readiness
[What's ready for next phase]
[Blockers or concerns]

---
phase: XX-name
plan: YY
subsystem: [primary category]
tags: [searchable tech]
provides:
- [bullet list of what was built/delivered]
affects: [list of phase names or keywords]
tech-stack:
added: [libraries/tools]
patterns: [architectural/code patterns]
key-files:
created: [important files created]
modified: [important files modified]
key-decisions: []
duration: Xmin
completed: YYYY-MM-DD
---
# Phase [X]: [Name] Summary (Minimal)
**[Substantive one-liner describing outcome]**
## Performance
- **Duration:** [time]
- **Tasks:** [count]
- **Files modified:** [count]
## Accomplishments
- [Most important outcome]
- [Second key accomplishment]
## Task Commits
1. **Task 1: [task name]** - `hash`
2. **Task 2: [task name]** - `hash`
## Files Created/Modified
- `path/to/file.ts` - What it does
## Next Phase Readiness
[Ready for next phase]

---
phase: XX-name
plan: YY
subsystem: [primary category]
tags: [searchable tech]
provides:
- [bullet list of what was built/delivered]
affects: [list of phase names or keywords]
tech-stack:
added: [libraries/tools]
patterns: [architectural/code patterns]
key-files:
created: [important files created]
modified: [important files modified]
key-decisions:
- "Decision 1"
duration: Xmin
completed: YYYY-MM-DD
---
# Phase [X]: [Name] Summary
**[Substantive one-liner describing outcome]**
## Performance
- **Duration:** [time]
- **Tasks:** [count completed]
- **Files modified:** [count]
## Accomplishments
- [Key outcome 1]
- [Key outcome 2]
## Task Commits
1. **Task 1: [task name]** - `hash`
2. **Task 2: [task name]** - `hash`
3. **Task 3: [task name]** - `hash`
## Files Created/Modified
- `path/to/file.ts` - What it does
- `path/to/another.ts` - What it does
## Decisions & Deviations
[Key decisions or "None - followed plan as specified"]
[Minor deviations if any, or "None"]
## Next Phase Readiness
[What's ready for next phase]

# Summary Template
Template for `.planning/phases/XX-name/{phase}-{plan}-SUMMARY.md` - phase completion documentation.
---
## File Template
```markdown
---
phase: XX-name
plan: YY
subsystem: [primary category: auth, payments, ui, api, database, infra, testing, etc.]
tags: [searchable tech: jwt, stripe, react, postgres, prisma]
# Dependency graph
requires:
- phase: [prior phase this depends on]
provides: [what that phase built that this uses]
provides:
- [bullet list of what this phase built/delivered]
affects: [list of phase names or keywords that will need this context]
# Tech tracking
tech-stack:
added: [libraries/tools added in this phase]
patterns: [architectural/code patterns established]
key-files:
created: [important files created]
modified: [important files modified]
key-decisions:
- "Decision 1"
- "Decision 2"
patterns-established:
- "Pattern 1: description"
- "Pattern 2: description"
requirements-completed: [] # REQUIRED - Copy ALL requirement IDs from this plan's `requirements` frontmatter field.
# Metrics
duration: Xmin
completed: YYYY-MM-DD
---
# Phase [X]: [Name] Summary
**[Substantive one-liner describing outcome - NOT "phase complete" or "implementation finished"]**
## Performance
- **Duration:** [time] (e.g., 23 min, 1h 15m)
- **Started:** [ISO timestamp]
- **Completed:** [ISO timestamp]
- **Tasks:** [count completed]
- **Files modified:** [count]
## Accomplishments
- [Most important outcome]
- [Second key accomplishment]
- [Third if applicable]
## Task Commits
Each task was committed atomically:
1. **Task 1: [task name]** - `abc123f` (feat/fix/test/refactor)
2. **Task 2: [task name]** - `def456g` (feat/fix/test/refactor)
3. **Task 3: [task name]** - `hij789k` (feat/fix/test/refactor)
**Plan metadata:** `lmn012o` (docs: complete plan)
_Note: TDD tasks may have multiple commits (test → feat → refactor)_
## Files Created/Modified
- `path/to/file.ts` - What it does
- `path/to/another.ts` - What it does
## Decisions Made
[Key decisions with brief rationale, or "None - followed plan as specified"]
## Deviations from Plan
[If no deviations: "None - plan executed exactly as written"]
[If deviations occurred:]
### Auto-fixed Issues
**1. [Rule X - Category] Brief description**
- **Found during:** Task [N] ([task name])
- **Issue:** [What was wrong]
- **Fix:** [What was done]
- **Files modified:** [file paths]
- **Verification:** [How it was verified]
- **Committed in:** [hash] (part of task commit)
[... repeat for each auto-fix ...]
---
**Total deviations:** [N] auto-fixed ([breakdown by rule])
**Impact on plan:** [Brief assessment - e.g., "All auto-fixes necessary for correctness/security. No scope creep."]
## Issues Encountered
[Problems and how they were resolved, or "None"]
[Note: "Deviations from Plan" documents unplanned work that was handled automatically via deviation rules. "Issues Encountered" documents problems during planned work that required problem-solving.]
## User Setup Required
[If USER-SETUP.md was generated:]
**External services require manual configuration.** See [{phase}-USER-SETUP.md](./{phase}-USER-SETUP.md) for:
- Environment variables to add
- Dashboard configuration steps
- Verification commands
[If no USER-SETUP.md:]
None - no external service configuration required.
## Next Phase Readiness
[What's ready for next phase]
[Any blockers or concerns]
---
*Phase: XX-name*
*Completed: [date]*
```
<frontmatter_guidance>
**Purpose:** Enable automatic context assembly via dependency graph. Frontmatter makes summary metadata machine-readable so plan-phase can scan all summaries quickly and select relevant ones based on dependencies.
**Fast scanning:** Frontmatter is first ~25 lines, cheap to scan across all summaries without reading full content.
**Dependency graph:** `requires`/`provides`/`affects` create explicit links between phases, enabling transitive closure for context selection.
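A minimal sketch of that transitive closure, assuming the `requires` links have already been collected into a simple phase-to-dependencies mapping (the data shape is an assumption, not a fixed schema):

```python
def context_closure(target: str, requires: dict[str, list[str]]) -> set[str]:
    """All phases reachable from `target` by following its `requires` links."""
    seen: set[str] = set()
    stack = [target]
    while stack:
        phase = stack.pop()
        for dep in requires.get(phase, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen
```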
**Subsystem:** Primary categorization (auth, payments, ui, api, database, infra, testing) for detecting related phases.
**Tags:** Searchable technical keywords (libraries, frameworks, tools) for tech stack awareness.
**Key-files:** Important files for @context references in PLAN.md.
**Patterns:** Established conventions future phases should maintain.
**Population:** Frontmatter is populated during summary creation in execute-plan.md. See `<step name="create_summary">` for field-by-field guidance.
</frontmatter_guidance>
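The fast-scanning idea above can be sketched as reading only the frontmatter block from each summary; the directory layout mirrors the template path, and YAML parsing is deliberately left out:

```python
import glob


def read_frontmatter(path: str) -> list[str]:
    """Return the raw lines between the opening and closing '---' markers."""
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    try:
        end = lines[1:].index("---") + 1
    except ValueError:
        return []
    return lines[1:end]


def scan_summaries(root: str = ".planning/phases") -> dict[str, list[str]]:
    """Map each summary file to its frontmatter without reading full bodies."""
    return {
        path: read_frontmatter(path)
        for path in glob.glob(f"{root}/**/*-SUMMARY.md", recursive=True)
    }
```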
<one_liner_rules>
The one-liner MUST be substantive:
**Good:**
- "JWT auth with refresh rotation using jose library"
- "Prisma schema with User, Session, and Product models"
- "Dashboard with real-time metrics via Server-Sent Events"
**Bad:**
- "Phase complete"
- "Authentication implemented"
- "Foundation finished"
- "All tasks done"
The one-liner should tell someone what actually shipped.
</one_liner_rules>
<example>
```markdown
# Phase 1: Foundation Summary
**JWT auth with refresh rotation using jose library, Prisma User model, and protected API middleware**
## Performance
- **Duration:** 28 min
- **Started:** 2025-01-15T14:22:10Z
- **Completed:** 2025-01-15T14:50:33Z
- **Tasks:** 5
- **Files modified:** 8
## Accomplishments
- User model with email/password auth
- Login/logout endpoints with httpOnly JWT cookies
- Protected route middleware checking token validity
- Refresh token rotation on each request
## Files Created/Modified
- `prisma/schema.prisma` - User and Session models
- `src/app/api/auth/login/route.ts` - Login endpoint
- `src/app/api/auth/logout/route.ts` - Logout endpoint
- `src/middleware.ts` - Protected route checks
- `src/lib/auth.ts` - JWT helpers using jose
## Decisions Made
- Used jose instead of jsonwebtoken (ESM-native, Edge-compatible)
- 15-min access tokens with 7-day refresh tokens
- Storing refresh tokens in database for revocation capability
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 2 - Missing Critical] Added password hashing with bcrypt**
- **Found during:** Task 2 (Login endpoint implementation)
- **Issue:** Plan didn't specify password hashing - storing plaintext would be critical security flaw
- **Fix:** Added bcrypt hashing on registration, comparison on login with salt rounds 10
- **Files modified:** src/app/api/auth/login/route.ts, src/lib/auth.ts
- **Verification:** Password hash test passes, plaintext never stored
- **Committed in:** abc123f (Task 2 commit)
**2. [Rule 3 - Blocking] Installed missing jose dependency**
- **Found during:** Task 4 (JWT token generation)
- **Issue:** jose package not in package.json, import failing
- **Fix:** Ran `npm install jose`
- **Files modified:** package.json, package-lock.json
- **Verification:** Import succeeds, build passes
- **Committed in:** def456g (Task 4 commit)
---
**Total deviations:** 2 auto-fixed (1 missing critical, 1 blocking)
**Impact on plan:** Both auto-fixes essential for security and functionality. No scope creep.
## Issues Encountered
- jsonwebtoken CommonJS import failed in Edge runtime - switched to jose (planned library change, worked as expected)
## Next Phase Readiness
- Auth foundation complete, ready for feature development
- User registration endpoint needed before public launch
---
*Phase: 01-foundation*
*Completed: 2025-01-15*
```
</example>
<guidelines>
**Frontmatter:** MANDATORY - complete all fields. Enables automatic context assembly for future planning.
**One-liner:** Must be substantive. "JWT auth with refresh rotation using jose library" not "Authentication implemented".
**Decisions section:**
- Key decisions made during execution with rationale
- Extracted to STATE.md accumulated context
- Use "None - followed plan as specified" if no deviations
**After creation:** STATE.md updated with position, decisions, issues.
</guidelines>

# Developer Profile
> This profile was generated from session analysis. It contains behavioral directives
> for the agent to follow when working with this developer. HIGH confidence dimensions
> should be acted on directly. LOW confidence dimensions should be approached with
> hedging ("Based on your profile, I'll try X -- let me know if that's off").
**Generated:** {{generated_at}}
**Source:** {{data_source}}
**Projects Analyzed:** {{projects_list}}
**Messages Analyzed:** {{message_count}}
---
## Quick Reference
{{summary_instructions}}
---
## Communication Style
**Rating:** {{communication_style.rating}} | **Confidence:** {{communication_style.confidence}}
**Directive:** {{communication_style.claude_instruction}}
{{communication_style.summary}}
**Evidence:**
{{communication_style.evidence}}
---
## Decision Speed
**Rating:** {{decision_speed.rating}} | **Confidence:** {{decision_speed.confidence}}
**Directive:** {{decision_speed.claude_instruction}}
{{decision_speed.summary}}
**Evidence:**
{{decision_speed.evidence}}
---
## Explanation Depth
**Rating:** {{explanation_depth.rating}} | **Confidence:** {{explanation_depth.confidence}}
**Directive:** {{explanation_depth.claude_instruction}}
{{explanation_depth.summary}}
**Evidence:**
{{explanation_depth.evidence}}
---
## Debugging Approach
**Rating:** {{debugging_approach.rating}} | **Confidence:** {{debugging_approach.confidence}}
**Directive:** {{debugging_approach.claude_instruction}}
{{debugging_approach.summary}}
**Evidence:**
{{debugging_approach.evidence}}
---
## UX Philosophy
**Rating:** {{ux_philosophy.rating}} | **Confidence:** {{ux_philosophy.confidence}}
**Directive:** {{ux_philosophy.claude_instruction}}
{{ux_philosophy.summary}}
**Evidence:**
{{ux_philosophy.evidence}}
---
## Vendor Philosophy
**Rating:** {{vendor_philosophy.rating}} | **Confidence:** {{vendor_philosophy.confidence}}
**Directive:** {{vendor_philosophy.claude_instruction}}
{{vendor_philosophy.summary}}
**Evidence:**
{{vendor_philosophy.evidence}}
---
## Frustration Triggers
**Rating:** {{frustration_triggers.rating}} | **Confidence:** {{frustration_triggers.confidence}}
**Directive:** {{frustration_triggers.claude_instruction}}
{{frustration_triggers.summary}}
**Evidence:**
{{frustration_triggers.evidence}}
---
## Learning Style
**Rating:** {{learning_style.rating}} | **Confidence:** {{learning_style.confidence}}
**Directive:** {{learning_style.claude_instruction}}
{{learning_style.summary}}
**Evidence:**
{{learning_style.evidence}}
---
## Profile Metadata
| Field | Value |
|-------|-------|
| Profile Version | {{profile_version}} |
| Generated | {{generated_at}} |
| Source | {{data_source}} |
| Projects | {{projects_count}} |
| Messages | {{message_count}} |
| Dimensions Scored | {{dimensions_scored}}/8 |
| High Confidence | {{high_confidence_count}} |
| Medium Confidence | {{medium_confidence_count}} |
| Low Confidence | {{low_confidence_count}} |
| Sensitive Content Excluded | {{sensitive_excluded_summary}} |

# User Setup Template
Template for `.planning/phases/XX-name/{phase}-USER-SETUP.md` - human-required configuration that the agent cannot automate.
**Purpose:** Document setup tasks that literally require human action - account creation, dashboard configuration, secret retrieval. The agent automates everything possible; this file captures only what remains.
---
## File Template
```markdown
# Phase {X}: User Setup Required
**Generated:** [YYYY-MM-DD]
**Phase:** {phase-name}
**Status:** Incomplete
Complete these items for the integration to function. The agent automated everything possible; these items require human access to external dashboards/accounts.
## Environment Variables
| Status | Variable | Source | Add to |
|--------|----------|--------|--------|
| [ ] | `ENV_VAR_NAME` | [Service Dashboard → Path → To → Value] | `.env.local` |
| [ ] | `ANOTHER_VAR` | [Service Dashboard → Path → To → Value] | `.env.local` |
## Account Setup
[Only if new account creation is required]
- [ ] **Create [Service] account**
- URL: [signup URL]
- Skip if: Already have account
## Dashboard Configuration
[Only if dashboard configuration is required]
- [ ] **[Configuration task]**
- Location: [Service Dashboard → Path → To → Setting]
- Set to: [Required value or configuration]
- Notes: [Any important details]
## Verification
After completing setup, verify with:
```bash
# [Verification commands]
```
Expected results:
- [What success looks like]
---
**Once all items complete:** Mark status as "Complete" at top of file.
```
---
## When to Generate
Generate `{phase}-USER-SETUP.md` when plan frontmatter contains `user_setup` field.
**Trigger:** `user_setup` exists in PLAN.md frontmatter and has items.
**Location:** Same directory as PLAN.md and SUMMARY.md.
**Timing:** Generated during execute-plan.md after tasks complete, before SUMMARY.md creation.
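A minimal sketch of the trigger check, assuming PLAN.md frontmatter is delimited by `---` markers; the string match is deliberately naive and does not validate that the field has items:

```python
def needs_user_setup(plan_text: str) -> bool:
    """True if the PLAN.md frontmatter declares a user_setup field."""
    if not plan_text.startswith("---"):
        return False
    # split off the frontmatter block between the first two '---' markers
    frontmatter = plan_text.split("---", 2)[1]
    return "user_setup:" in frontmatter
```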
---
## Frontmatter Schema
In PLAN.md, `user_setup` declares human-required configuration:
```yaml
user_setup:
- service: stripe
why: "Payment processing requires API keys"
env_vars:
- name: STRIPE_SECRET_KEY
source: "Stripe Dashboard → Developers → API keys → Secret key"
- name: STRIPE_WEBHOOK_SECRET
source: "Stripe Dashboard → Developers → Webhooks → Signing secret"
dashboard_config:
- task: "Create webhook endpoint"
location: "Stripe Dashboard → Developers → Webhooks → Add endpoint"
details: "URL: https://[your-domain]/api/webhooks/stripe, Events: checkout.session.completed, customer.subscription.*"
local_dev:
- "Run: stripe listen --forward-to localhost:3000/api/webhooks/stripe"
- "Use the webhook secret from CLI output for local testing"
```
---
## The Automation-First Rule
**USER-SETUP.md contains ONLY what the agent literally cannot do.**
| The agent CAN do (not in USER-SETUP) | The agent CANNOT do (→ USER-SETUP) |
|-----------------------------------|--------------------------------|
| `npm install stripe` | Create Stripe account |
| Write webhook handler code | Get API keys from dashboard |
| Create `.env.local` file structure | Copy actual secret values |
| Run `stripe listen` | Authenticate Stripe CLI (browser OAuth) |
| Configure package.json | Access external service dashboards |
| Write any code | Retrieve secrets from third-party systems |
**The test:** "Does this require a human in a browser, accessing an account the agent doesn't have credentials for?"
- Yes → USER-SETUP.md
- No → the agent does it automatically
---
## Service-Specific Examples
<stripe_example>
```markdown
# Phase 10: User Setup Required
**Generated:** 2025-01-14
**Phase:** 10-monetization
**Status:** Incomplete
Complete these items for Stripe integration to function.
## Environment Variables
| Status | Variable | Source | Add to |
|--------|----------|--------|--------|
| [ ] | `STRIPE_SECRET_KEY` | Stripe Dashboard → Developers → API keys → Secret key | `.env.local` |
| [ ] | `NEXT_PUBLIC_STRIPE_PUBLISHABLE_KEY` | Stripe Dashboard → Developers → API keys → Publishable key | `.env.local` |
| [ ] | `STRIPE_WEBHOOK_SECRET` | Stripe Dashboard → Developers → Webhooks → [endpoint] → Signing secret | `.env.local` |
## Account Setup
- [ ] **Create Stripe account** (if needed)
- URL: https://dashboard.stripe.com/register
- Skip if: Already have Stripe account
## Dashboard Configuration
- [ ] **Create webhook endpoint**
- Location: Stripe Dashboard → Developers → Webhooks → Add endpoint
- Endpoint URL: `https://[your-domain]/api/webhooks/stripe`
- Events to send:
- `checkout.session.completed`
- `customer.subscription.created`
- `customer.subscription.updated`
- `customer.subscription.deleted`
- [ ] **Create products and prices** (if using subscription tiers)
- Location: Stripe Dashboard → Products → Add product
- Create each subscription tier
- Copy Price IDs to:
- `STRIPE_STARTER_PRICE_ID`
- `STRIPE_PRO_PRICE_ID`
## Local Development
For local webhook testing:
```bash
stripe listen --forward-to localhost:3000/api/webhooks/stripe
```
Use the webhook signing secret from CLI output (starts with `whsec_`).
## Verification
After completing setup:
```bash
# Check env vars are set
grep STRIPE .env.local
# Verify build passes
npm run build
# Test webhook endpoint (should return 400 bad signature, not 500 crash)
curl -X POST http://localhost:3000/api/webhooks/stripe \
-H "Content-Type: application/json" \
-d '{}'
```
Expected: Build passes, webhook returns 400 (signature validation working).
---
**Once all items complete:** Mark status as "Complete" at top of file.
```
</stripe_example>
<supabase_example>
```markdown
# Phase 2: User Setup Required
**Generated:** 2025-01-14
**Phase:** 02-authentication
**Status:** Incomplete
Complete these items for Supabase Auth to function.
## Environment Variables
| Status | Variable | Source | Add to |
|--------|----------|--------|--------|
| [ ] | `NEXT_PUBLIC_SUPABASE_URL` | Supabase Dashboard → Settings → API → Project URL | `.env.local` |
| [ ] | `NEXT_PUBLIC_SUPABASE_ANON_KEY` | Supabase Dashboard → Settings → API → anon public | `.env.local` |
| [ ] | `SUPABASE_SERVICE_ROLE_KEY` | Supabase Dashboard → Settings → API → service_role | `.env.local` |
## Account Setup
- [ ] **Create Supabase project**
- URL: https://supabase.com/dashboard/new
- Skip if: Already have project for this app
## Dashboard Configuration
- [ ] **Enable Email Auth**
- Location: Supabase Dashboard → Authentication → Providers
- Enable: Email provider
- Configure: Confirm email (on/off based on preference)
- [ ] **Configure OAuth providers** (if using social login)
- Location: Supabase Dashboard → Authentication → Providers
- For Google: Add Client ID and Secret from Google Cloud Console
- For GitHub: Add Client ID and Secret from GitHub OAuth Apps
## Verification
After completing setup:
```bash
# Check env vars
grep SUPABASE .env.local
# Verify connection (run in project directory)
npx supabase status
```
---
**Once all items complete:** Mark status as "Complete" at top of file.
```
</supabase_example>
<sendgrid_example>
```markdown
# Phase 5: User Setup Required
**Generated:** 2025-01-14
**Phase:** 05-notifications
**Status:** Incomplete
Complete these items for SendGrid email to function.
## Environment Variables
| Status | Variable | Source | Add to |
|--------|----------|--------|--------|
| [ ] | `SENDGRID_API_KEY` | SendGrid Dashboard → Settings → API Keys → Create API Key | `.env.local` |
| [ ] | `SENDGRID_FROM_EMAIL` | Your verified sender email address | `.env.local` |
## Account Setup
- [ ] **Create SendGrid account**
- URL: https://signup.sendgrid.com/
- Skip if: Already have account
## Dashboard Configuration
- [ ] **Verify sender identity**
- Location: SendGrid Dashboard → Settings → Sender Authentication
- Option 1: Single Sender Verification (quick, for dev)
- Option 2: Domain Authentication (production)
- [ ] **Create API Key**
- Location: SendGrid Dashboard → Settings → API Keys → Create API Key
- Permission: Restricted Access → Mail Send (Full Access)
- Copy key immediately (shown only once)
## Verification
After completing setup:
```bash
# Check env var
grep SENDGRID .env.local
# Test email sending (replace with your test email)
curl -X POST http://localhost:3000/api/test-email \
-H "Content-Type: application/json" \
-d '{"to": "your@email.com"}'
```
---
**Once all items complete:** Mark status as "Complete" at top of file.
```
</sendgrid_example>
---
## Guidelines
**Never include:** Actual secret values. Steps the agent can automate (package installs, code changes).
**Naming:** `{phase}-USER-SETUP.md` matches the phase number pattern.
**Status tracking:** User marks checkboxes and updates status line when complete.
**Searchability:** `grep -r "USER-SETUP" .planning/` finds all phases with user requirements.

# Verification Report Template
Template for `.planning/phases/XX-name/{phase_num}-VERIFICATION.md` - phase goal verification results.
---
## File Template
```markdown
---
phase: XX-name
verified: YYYY-MM-DDTHH:MM:SSZ
status: passed | gaps_found | human_needed
score: N/M must-haves verified
---
# Phase {X}: {Name} Verification Report
**Phase Goal:** {goal from ROADMAP.md}
**Verified:** {timestamp}
**Status:** {passed | gaps_found | human_needed}
## Goal Achievement
### Observable Truths
| # | Truth | Status | Evidence |
| --- | ----------------------- | ----------- | ------------------- |
| 1 | {truth from must_haves} | ✓ VERIFIED | {what confirmed it} |
| 2 | {truth from must_haves} | ✗ FAILED | {what's wrong} |
| 3 | {truth from must_haves} | ? UNCERTAIN | {why can't verify} |
**Score:** {N}/{M} truths verified
### Required Artifacts
| Artifact | Expected | Status | Details |
| --------------------------- | ---------------------- | ---------------------- | --------------------------------------------- |
| `src/components/Chat.tsx` | Message list component | ✓ EXISTS + SUBSTANTIVE | Exports ChatList, renders Message[], no stubs |
| `src/app/api/chat/route.ts` | Message CRUD | ✗ STUB | File exists but POST returns placeholder |
| `prisma/schema.prisma` | Message model | ✓ EXISTS + SUBSTANTIVE | Model defined with all fields |
**Artifacts:** {N}/{M} verified
### Key Link Verification
| From | To | Via | Status | Details |
| -------------- | -------------- | --------------------- | ----------- | ---------------------------------------------------- |
| Chat.tsx | /api/chat | fetch in useEffect | ✓ WIRED | Line 23: `fetch('/api/chat')` with response handling |
| ChatInput | /api/chat POST | onSubmit handler | ✗ NOT WIRED | onSubmit only calls console.log |
| /api/chat POST | database | prisma.message.create | ✗ NOT WIRED | Returns hardcoded response, no DB call |
**Wiring:** {N}/{M} connections verified
## Requirements Coverage
| Requirement | Status | Blocking Issue |
| ----------------------- | ------------- | --------------------------------------- |
| {REQ-01}: {description} | ✓ SATISFIED | - |
| {REQ-02}: {description} | ✗ BLOCKED | API route is stub |
| {REQ-03}: {description} | ? NEEDS HUMAN | Can't verify WebSocket programmatically |
**Coverage:** {N}/{M} requirements satisfied
## Anti-Patterns Found
| File | Line | Pattern | Severity | Impact |
| ------------------------- | ---- | ------------------------------- | --------- | --------------------------- |
| src/app/api/chat/route.ts | 12 | `// TODO: implement` | ⚠️ Warning | Indicates incomplete |
| src/components/Chat.tsx | 45 | `return <div>Placeholder</div>` | 🛑 Blocker | Renders no content |
| src/hooks/useChat.ts | - | File missing | 🛑 Blocker | Expected hook doesn't exist |
**Anti-patterns:** {N} found ({blockers} blockers, {warnings} warnings)
## Human Verification Required
{If no human verification needed:}
None - all verifiable items checked programmatically.
{If human verification needed:}
### 1. {Test Name}
**Test:** {What to do}
**Expected:** {What should happen}
**Why human:** {Why can't verify programmatically}
### 2. {Test Name}
**Test:** {What to do}
**Expected:** {What should happen}
**Why human:** {Why can't verify programmatically}
## Gaps Summary
{If no gaps:}
**No gaps found.** Phase goal achieved. Ready to proceed.
{If gaps found:}
### Critical Gaps (Block Progress)
1. **{Gap name}**
- Missing: {what's missing}
- Impact: {why this blocks the goal}
- Fix: {what needs to happen}
2. **{Gap name}**
- Missing: {what's missing}
- Impact: {why this blocks the goal}
- Fix: {what needs to happen}
### Non-Critical Gaps (Can Defer)
1. **{Gap name}**
- Issue: {what's wrong}
- Impact: {limited impact because...}
- Recommendation: {fix now or defer}
## Recommended Fix Plans
{If gaps found, generate fix plan recommendations:}
### {phase}-{next}-PLAN.md: {Fix Name}
**Objective:** {What this fixes}
**Tasks:**
1. {Task to fix gap 1}
2. {Task to fix gap 2}
3. {Verification task}
**Estimated scope:** {Small / Medium}
---
### {phase}-{next+1}-PLAN.md: {Fix Name}
**Objective:** {What this fixes}
**Tasks:**
1. {Task}
2. {Task}
**Estimated scope:** {Small / Medium}
---
## Verification Metadata
**Verification approach:** Goal-backward (derived from phase goal)
**Must-haves source:** {PLAN.md frontmatter | derived from ROADMAP.md goal}
**Automated checks:** {N} passed, {M} failed
**Human checks required:** {N}
**Total verification time:** {duration}
---
*Verified: {timestamp}*
*Verifier: the agent (subagent)*
```
---
## Guidelines
**Status values:**
- `passed` - All must-haves verified, no blockers
- `gaps_found` - One or more critical gaps found
- `human_needed` - Automated checks pass but human verification required
**Evidence types:**
- For EXISTS: "File at path, exports X"
- For SUBSTANTIVE: "N lines, has patterns X, Y, Z"
- For WIRED: "Line N: code that connects A to B"
- For FAILED: "Missing because X" or "Stub because Y"
**Severity levels:**
- 🛑 Blocker: Prevents goal achievement, must fix
- ⚠️ Warning: Indicates incomplete but doesn't block
- Info: Notable but not problematic
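As an illustrative sketch, detection of anti-patterns like those in the tables above can be approximated with a line scan; the pattern list and severities here are assumptions, not the verifier's actual rules:

```python
import re

# Illustrative anti-patterns; the real verifier's rules may differ.
ANTI_PATTERNS = [
    (re.compile(r"//\s*TODO"), "warning"),
    (re.compile(r"[Pp]laceholder"), "blocker"),
    (re.compile(r"will be here"), "blocker"),
]


def scan_file(path: str, text: str) -> list[tuple[str, int, str]]:
    """Return (path, line number, severity) for each anti-pattern hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, severity in ANTI_PATTERNS:
            if pattern.search(line):
                hits.append((path, lineno, severity))
    return hits
```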
**Fix plan generation:**
- Only generate if gaps_found
- Group related fixes into single plans
- Keep to 2-3 tasks per plan
- Include verification task in each plan
---
## Example
```markdown
---
phase: 03-chat
verified: 2025-01-15T14:30:00Z
status: gaps_found
score: 2/5 must-haves verified
---
# Phase 3: Chat Interface Verification Report
**Phase Goal:** Working chat interface where users can send and receive messages
**Verified:** 2025-01-15T14:30:00Z
**Status:** gaps_found
## Goal Achievement
### Observable Truths
| # | Truth | Status | Evidence |
| --- | ------------------------------- | ----------- | ----------------------------------------------- |
| 1 | User can see existing messages | ✗ FAILED | Component renders placeholder, not message data |
| 2 | User can type a message | ✓ VERIFIED | Input field exists with onChange handler |
| 3 | User can send a message | ✗ FAILED | onSubmit handler is console.log only |
| 4 | Sent message appears in list | ✗ FAILED | No state update after send |
| 5 | Messages persist across refresh | ? UNCERTAIN | Can't verify - send doesn't work |
**Score:** 1/5 truths verified
### Required Artifacts
| Artifact | Expected | Status | Details |
| ------------------------------ | ---------------------- | ---------------------- | ------------------------------------------------- |
| `src/components/Chat.tsx` | Message list component | ✗ STUB | Returns `<div>Chat will be here</div>` |
| `src/components/ChatInput.tsx` | Message input | ✓ EXISTS + SUBSTANTIVE | Form with input, submit button, handlers |
| `src/app/api/chat/route.ts` | Message CRUD | ✗ STUB | GET returns [], POST returns { ok: true } |
| `prisma/schema.prisma` | Message model | ✓ EXISTS + SUBSTANTIVE | Message model with id, content, userId, createdAt |
**Artifacts:** 2/4 verified
### Key Link Verification
| From | To | Via | Status | Details |
| -------------- | -------------- | ----------------------- | ----------- | -------------------------------- |
| Chat.tsx | /api/chat GET | fetch | ✗ NOT WIRED | No fetch call in component |
| ChatInput | /api/chat POST | onSubmit | ✗ NOT WIRED | Handler only logs, doesn't fetch |
| /api/chat GET | database | prisma.message.findMany | ✗ NOT WIRED | Returns hardcoded [] |
| /api/chat POST | database | prisma.message.create | ✗ NOT WIRED | Returns { ok: true }, no DB call |
**Wiring:** 0/4 connections verified
## Requirements Coverage
| Requirement | Status | Blocking Issue |
| ------------------------------- | --------- | ------------------------ |
| CHAT-01: User can send message | ✗ BLOCKED | API POST is stub |
| CHAT-02: User can view messages | ✗ BLOCKED | Component is placeholder |
| CHAT-03: Messages persist | ✗ BLOCKED | No database integration |
**Coverage:** 0/3 requirements satisfied
## Anti-Patterns Found
| File | Line | Pattern | Severity | Impact |
| ------------------------- | ---- | ------------------------------ | --------- | ----------------- |
| src/components/Chat.tsx | 8 | `<div>Chat will be here</div>` | 🛑 Blocker | No actual content |
| src/app/api/chat/route.ts | 5 | `return Response.json([])` | 🛑 Blocker | Hardcoded empty |
| src/app/api/chat/route.ts | 12 | `// TODO: save to database` | ⚠️ Warning | Incomplete |
**Anti-patterns:** 3 found (2 blockers, 1 warning)
## Human Verification Required
None needed until automated gaps are fixed.
## Gaps Summary
### Critical Gaps (Block Progress)
1. **Chat component is placeholder**
- Missing: Actual message list rendering
- Impact: Users see "Chat will be here" instead of messages
- Fix: Implement Chat.tsx to fetch and render messages
2. **API routes are stubs**
- Missing: Database integration in GET and POST
- Impact: No data persistence, no real functionality
- Fix: Wire prisma calls in route handlers
3. **No wiring between frontend and backend**
- Missing: fetch calls in components
- Impact: Even if API worked, UI wouldn't call it
- Fix: Add useEffect fetch in Chat, onSubmit fetch in ChatInput
## Recommended Fix Plans
### 03-04-PLAN.md: Implement Chat API
**Objective:** Wire API routes to database
**Tasks:**
1. Implement GET /api/chat with prisma.message.findMany
2. Implement POST /api/chat with prisma.message.create
3. Verify: API returns real data, POST creates records
**Estimated scope:** Small
---
### 03-05-PLAN.md: Implement Chat UI
**Objective:** Wire Chat component to API
**Tasks:**
1. Implement Chat.tsx with useEffect fetch and message rendering
2. Wire ChatInput onSubmit to POST /api/chat
3. Verify: Messages display, new messages appear after send
**Estimated scope:** Small
---
## Verification Metadata
**Verification approach:** Goal-backward (derived from phase goal)
**Must-haves source:** 03-01-PLAN.md frontmatter
**Automated checks:** 2 passed, 8 failed
**Human checks required:** 0 (blocked by automated failures)
**Total verification time:** 2 min
---
*Verified: 2025-01-15T14:30:00Z*
*Verifier: the agent (subagent)*
```