## Initialization Context (pre-injected by WXP)

**Phase:**

**Phase Init Data:**

If $ARGUMENTS contains a phase number, load context. Parse the JSON for: `planner_model`, `checker_model`, `commit_docs`, `phase_found`, `phase_dir`, `phase_number`, `phase_name`, `has_verification`, `uat_path`.

**First: Check for active UAT sessions**

```bash
(find .planning/phases -name "*-UAT.md" -type f 2>/dev/null || true) | head -5
```

**If active sessions exist AND no $ARGUMENTS provided:**

Read each file's frontmatter (status, phase) and Current Test section. Display inline:

```
## Active UAT Sessions

| #   | Phase       | Status  | Current Test        | Progress |
| --- | ----------- | ------- | ------------------- | -------- |
| 1   | 04-comments | testing | 3. Reply to Comment | 2/6      |
| 2   | 05-auth     | testing | 1. Login Form       | 0/4      |

Reply with a number to resume, or provide a phase number to start new.
```

Wait for user response.

- If user replies with a session number (1, 2) → Load that file, go to `resume_from_file`
- If user replies with a phase number → Treat as a new session, go to `create_uat_file`

**If active sessions exist AND $ARGUMENTS provided:**

Check if a session exists for that phase. If yes, offer to resume or restart. If no, continue to `create_uat_file`.

**If no active sessions AND no $ARGUMENTS:**

```
No active UAT sessions.

Provide a phase number to start testing (e.g., /gsd-verify-work 4)
```

**If no active sessions AND $ARGUMENTS provided:**

Continue to `create_uat_file`.

**Find what to test:**

Use `phase_dir` from init (or run init if not already done).

```bash
ls "$phase_dir"/*-SUMMARY.md 2>/dev/null || true
```

Read each SUMMARY.md to extract testable deliverables.

**Extract testable deliverables from SUMMARY.md:**

Parse for:

1. **Accomplishments** - Features/functionality added
2. **User-facing changes** - UI, workflows, interactions

Focus on USER-OBSERVABLE outcomes, not implementation details.

For each deliverable, create a test:

- name: Brief test name
- expected: What the user should see/experience (specific, observable)

Example:

- Accomplishment: "Added comment threading with infinite nesting"
  → Test: "Reply to a Comment"
  → Expected: "Clicking Reply opens inline composer below comment. Submitting shows reply nested under parent with visual indentation."

Skip internal/non-observable items (refactors, type changes, etc.).

**Cold-start smoke test injection:**

After extracting tests from the SUMMARYs, scan the SUMMARY files for modified/created file paths. If ANY path matches these patterns:

`server.ts`, `server.js`, `app.ts`, `app.js`, `index.ts`, `index.js`, `main.ts`, `main.js`, `database/*`, `db/*`, `seed/*`, `seeds/*`, `migrations/*`, `startup*`, `docker-compose*`, `Dockerfile*`

then **prepend** this test to the test list:

- name: "Cold Start Smoke Test"
- expected: "Kill any running server/service. Clear ephemeral state (temp DBs, caches, lock files). Start the application from scratch. Server boots without errors, any seed/migration completes, and a primary query (health check, homepage load, or basic API call) returns live data."

This catches bugs that only manifest on a fresh start - race conditions in startup sequences, silent seed failures, missing environment setup - bugs that pass against warm state but break in production. (See the path-scan sketch below.)

**Create UAT file with all tests:**

```bash
mkdir -p "$phase_dir"
```

Build the test list from the extracted deliverables.
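As referenced above, here is a minimal sketch of the cold-start path scan (illustration only; the regex loosely approximates the pattern list rather than porting it exactly, and it assumes the SUMMARY files mention file paths inline):

```bash
# Sketch: flag cold-start-sensitive paths mentioned in the SUMMARY files.
# Loose substring/regex approximation of the pattern list above.
COLD_START_RE='(server|app|index|main)\.(ts|js)|(database|db|seeds?|migrations)/|startup|docker-compose|Dockerfile'

if grep -Eq "$COLD_START_RE" "$phase_dir"/*-SUMMARY.md 2>/dev/null; then
  echo "Cold-start-sensitive changes found -> prepend Cold Start Smoke Test"
fi
```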
Create the file:

```markdown
---
status: testing
phase: XX-name
source: [list of SUMMARY.md files]
started: [ISO timestamp]
updated: [ISO timestamp]
---

## Current Test

number: 1
name: [first test name]
expected: |
  [what user should observe]
awaiting: user response

## Tests

### 1. [Test Name]
expected: [observable behavior]
result: [pending]

### 2. [Test Name]
expected: [observable behavior]
result: [pending]

...

## Summary

total: [N]
passed: 0
issues: 0
pending: [N]
skipped: 0

## Gaps

[none yet]
```

Write to `.planning/phases/XX-name/{phase_num}-UAT.md`.

Proceed to `present_test`.

**Present current test to user:**

Render the checkpoint from the structured UAT file instead of composing it freehand:

```bash
CHECKPOINT=$(pi-gsd-tools uat render-checkpoint --file "$uat_path" --raw)
if [[ "$CHECKPOINT" == @file:* ]]; then CHECKPOINT=$(cat "${CHECKPOINT#@file:}"); fi
```

Display the returned checkpoint EXACTLY as-is:

```
{CHECKPOINT}
```

**Critical response hygiene:**

- Your entire response MUST equal `{CHECKPOINT}` byte-for-byte.
- Do NOT add commentary before or after the block.
- If you notice protocol/meta markers such as `to=all:`, role-routing text, XML system tags, hidden instruction markers, ad copy, or any unrelated suffix, discard the draft and output `{CHECKPOINT}` only.

Wait for user response (plain text, no AskUserQuestion).

**Process user response and update file:**

**If response indicates pass:**

- Empty response, "yes", "y", "ok", "pass", "next", "approved", "✓"

Update Tests section:

```
### {N}. {name}
expected: {expected}
result: pass
```

**If response indicates skip:**

- "skip", "can't test", "n/a"

Update Tests section:

```
### {N}. {name}
expected: {expected}
result: skipped
reason: [user's reason if provided]
```

**If response indicates blocked:**

- "blocked", "can't test - server not running", "need physical device", "need release build"
- Or any response containing: "server", "blocked", "not running", "physical device", "release build"

Infer a blocked_by tag from the response:

- Contains: server, not running, gateway, API → `server`
- Contains: physical, device, hardware, real phone → `physical-device`
- Contains: release, preview, build, EAS → `release-build`
- Contains: stripe, twilio, third-party, configure → `third-party`
- Contains: depends on, prior phase, prerequisite → `prior-phase`
- Default: `other`

Update Tests section:

```
### {N}. {name}
expected: {expected}
result: blocked
blocked_by: {inferred tag}
reason: "{verbatim user response}"
```

Note: Blocked tests do NOT go into the Gaps section (they aren't code issues - they're prerequisite gates).

**If response is anything else:**

- Treat it as an issue description.

Infer severity from the description:

- Contains: crash, error, exception, fails, broken, unusable → blocker
- Contains: doesn't work, wrong, missing, can't → major
- Contains: slow, weird, off, minor, small → minor
- Contains: color, font, spacing, alignment, visual → cosmetic
- Default if unclear: major

Update Tests section:

```
### {N}. {name}
expected: {expected}
result: issue
reported: "{verbatim user response}"
severity: {inferred}
```

Append to the Gaps section (structured YAML for plan-phase --gaps):

```yaml
- truth: "{expected behavior from test}"
  status: failed
  reason: "User reported: {verbatim user response}"
  severity: {inferred}
  test: {N}
  artifacts: [] # Filled by diagnosis
  missing: [] # Filled by diagnosis
```

**After any response:**

Update the Summary counts. Update the frontmatter.updated timestamp.
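To make the keyword matching above concrete, here is a minimal sketch of the severity inference as a shell helper (an illustration, not part of the toolchain; assumes bash 4+ for the lowercase expansion, and patterns are checked top to bottom so blocker keywords take precedence):

```bash
# Sketch: infer severity from the user's verbatim issue description.
# Substring matching is deliberately rough ("off" also matches "coffee");
# word-boundary matching would be tighter.
infer_severity() {
  local desc="${1,,}"  # lowercase copy (requires bash 4+)
  case "$desc" in
    *crash*|*error*|*exception*|*fails*|*broken*|*unusable*) echo blocker ;;
    *"doesn't work"*|*wrong*|*missing*|*"can't"*)            echo major ;;
    *slow*|*weird*|*off*|*minor*|*small*)                    echo minor ;;
    *color*|*font*|*spacing*|*alignment*|*visual*)           echo cosmetic ;;
    *)                                                       echo major ;;
  esac
}

infer_severity "the reply button crashes the page"  # -> blocker
```

The blocked_by tag can be inferred the same way, with its own keyword groups per the list above.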
- If more tests remain → Update Current Test, go to `present_test`
- If no more tests → Go to `complete_session`

**Resume testing from UAT file:**

Read the full UAT file. Find the first test with `result: [pending]`. Announce:

```
Resuming: Phase {phase} UAT
Progress: {passed + issues + skipped}/{total}
Issues found so far: {issues count}

Continuing from Test {N}...
```

Update the Current Test section with the pending test. Proceed to `present_test`.

**Complete testing and commit:**

**Determine final status:**

Count results:

- `pending_count`: tests with `result: [pending]`
- `blocked_count`: tests with `result: blocked`
- `skipped_no_reason`: tests with `result: skipped` and no `reason` field

```
if pending_count > 0 OR blocked_count > 0 OR skipped_no_reason > 0:
  status: partial   # Session ended but not all tests resolved
else:
  status: complete  # All tests have a definitive result (pass, issue, or skipped-with-reason)
```

Update frontmatter:

- status: {computed status}
- updated: [now]

Clear the Current Test section:

```
## Current Test

[testing complete]
```

Commit the UAT file:

```bash
pi-gsd-tools commit "test({phase_num}): complete UAT - {passed} passed, {issues} issues" --files ".planning/phases/XX-name/{phase_num}-UAT.md"
```

Present summary:

```
## UAT Complete: Phase {phase}

| Result  | Count |
| ------- | ----- |
| Passed  | {N}   |
| Issues  | {N}   |
| Skipped | {N}   |

[If issues > 0:]
### Issues Found
[List from Issues section]
```

**If issues > 0:** Proceed to `diagnose_issues`

**If issues == 0:**

```
All tests passed. Ready to continue.

- `/gsd-plan-phase {next}` - Plan next phase
- `/gsd-execute-phase {next}` - Execute next phase
- `/gsd-ui-review {phase}` - Visual quality audit (if frontend files were modified)
```

**Diagnose root causes before planning fixes:**

```
---

{N} issues found. Diagnosing root causes...

Spawning parallel debug agents to investigate each issue.
```

- Load the diagnose-issues workflow
- Follow @.pi/gsd/workflows/diagnose-issues.md
- Spawn parallel debug agents for each issue
- Collect root causes
- Update UAT.md with root causes
- Proceed to `plan_gap_closure`

Diagnosis runs automatically - no user prompt. Parallel agents investigate simultaneously, so overhead is minimal and fixes are more accurate.

**Auto-plan fixes from diagnosed gaps:**

Display:

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 GSD ► PLANNING FIXES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

◆ Spawning planner for gap closure...
```

Spawn gsd-planner in --gaps mode:

```
Task(
  prompt="""
  **Phase:** {phase_number}
  **Mode:** gap_closure

  - {phase_dir}/{phase_num}-UAT.md (UAT with diagnoses)
  - .planning/STATE.md (Project State)
  - .planning/ROADMAP.md (Roadmap)

  ${AGENT_SKILLS_PLANNER}

  Output consumed by /gsd-execute-phase. Plans must be executable prompts.
  """,
  subagent_type="gsd-planner",
  model="{planner_model}",
  description="Plan gap fixes for Phase {phase}"
)
```

On return:

- **PLANNING COMPLETE:** Proceed to `verify_gap_plans`
- **PLANNING INCONCLUSIVE:** Report and offer manual intervention

**Verify fix plans with checker:**

Display:

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 GSD ► VERIFYING FIX PLANS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

◆ Spawning plan checker...
```
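This step and the revision loop below form one bounded cycle. As a rough sketch of the control flow they describe, in illustrative shell pseudologic (`spawn_checker` and `spawn_planner_revision` are hypothetical stand-ins for the Task() spawns, not real commands):

```bash
# Sketch of the planner ↔ checker revision loop (max 3 iterations).
iteration_count=1
while true; do
  verdict=$(spawn_checker)                             # verify current plans
  [[ "$verdict" == "VERIFICATION PASSED" ]] && break   # -> present_ready
  if (( iteration_count >= 3 )); then
    echo "Max iterations reached."                     # -> offer user options
    break
  fi
  spawn_planner_revision "$verdict"                    # targeted plan updates
  (( ++iteration_count ))
done
```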
Initialize: `iteration_count = 1`

Spawn gsd-plan-checker:

```
Task(
  prompt="""
  **Phase:** {phase_number}
  **Phase Goal:** Close diagnosed gaps from UAT

  - {phase_dir}/*-PLAN.md (Plans to verify)

  ${AGENT_SKILLS_CHECKER}

  Return one of:
  - ## VERIFICATION PASSED - all checks pass
  - ## ISSUES FOUND - structured issue list
  """,
  subagent_type="gsd-plan-checker",
  model="{checker_model}",
  description="Verify Phase {phase} fix plans"
)
```

On return:

- **VERIFICATION PASSED:** Proceed to `present_ready`
- **ISSUES FOUND:** Proceed to `revision_loop`

**Iterate planner ↔ checker until plans pass (max 3):**

**If iteration_count < 3:**

Display: `Sending back to planner for revision... (iteration {N}/3)`

Spawn gsd-planner with revision context:

```
Task(
  prompt="""
  **Phase:** {phase_number}
  **Mode:** revision

  - {phase_dir}/*-PLAN.md (Existing plans)

  ${AGENT_SKILLS_PLANNER}

  **Checker issues:**
  {structured_issues_from_checker}

  Read existing PLAN.md files. Make targeted updates to address checker issues.
  Do NOT replan from scratch unless issues are fundamental.
  """,
  subagent_type="gsd-planner",
  model="{planner_model}",
  description="Revise Phase {phase} plans"
)
```

After the planner returns → spawn the checker again (verify_gap_plans logic). Increment iteration_count.

**If iteration_count >= 3:**

Display: `Max iterations reached. {N} issues remain.`

Offer options:

1. Force proceed (execute despite issues)
2. Provide guidance (user gives direction, retry)
3. Abandon (exit, user runs /gsd-plan-phase manually)

Wait for user response.

**Present completion and next steps:**

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
 GSD ► FIXES READY ✓
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

**Phase {X}: {Name}** - {N} gap(s) diagnosed, {M} fix plan(s) created

| Gap       | Root Cause   | Fix Plan   |
| --------- | ------------ | ---------- |
| {truth 1} | {root_cause} | {phase}-04 |
| {truth 2} | {root_cause} | {phase}-04 |

Plans verified and ready for execution.

───────────────────────────────────────────────────────────────

## ▶ Next Up

**Execute fixes** - run fix plans
`/new` then `/gsd-execute-phase {phase} --gaps-only`

───────────────────────────────────────────────────────────────
```

**Batched writes for efficiency:**

Keep results in memory. Write to file only when:

1. **Issue found** - Preserve the problem immediately
2. **Session complete** - Final write before commit
3. **Checkpoint** - Every 5 passed tests (safety net)

(A sketch of this cadence appears just before the checklist.)

| Section             | Rule      | When Written      |
| ------------------- | --------- | ----------------- |
| Frontmatter.status  | OVERWRITE | Start, complete   |
| Frontmatter.updated | OVERWRITE | On any file write |
| Current Test        | OVERWRITE | On any file write |
| Tests.{N}.result    | OVERWRITE | On any file write |
| Summary             | OVERWRITE | On any file write |
| Gaps                | APPEND    | When issue found  |

On context reset: the file shows the last checkpoint. Resume from there.

**Infer severity from the user's natural language:**

| User says                                            | Infer    |
| ---------------------------------------------------- | -------- |
| "crashes", "error", "exception", "fails completely"  | blocker  |
| "doesn't work", "nothing happens", "wrong behavior"  | major    |
| "works but...", "slow", "weird", "minor issue"       | minor    |
| "color", "spacing", "alignment", "looks off"         | cosmetic |

Default to **major** if unclear. The user can correct if needed.

**Never ask "how severe is this?"** - just infer and move on.
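Returning to the batched-write strategy above, here is a rough sketch of the cadence (illustrative shell pseudologic; `flush_to_file` is a hypothetical stand-in for the actual UAT-file update, not a pi-gsd-tools command):

```bash
# Hypothetical sketch of the write cadence described under
# "Batched writes for efficiency". flush_to_file is a stand-in.
passes_since_write=0

record_result() {
  local result="$1"
  case "$result" in
    issue)
      flush_to_file                 # preserve the problem immediately
      passes_since_write=0 ;;
    pass)
      (( ++passes_since_write ))
      if (( passes_since_write >= 5 )); then
        flush_to_file               # checkpoint safety net
        passes_since_write=0
      fi ;;
  esac
}

# flush_to_file also runs once at session completion, before the commit.
```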
**Success criteria:**

- [ ] UAT file created with all tests from SUMMARY.md
- [ ] Tests presented one at a time with expected behavior
- [ ] User responses processed as pass/issue/skip
- [ ] Severity inferred from description (never asked)
- [ ] Batched writes: on issue, every 5 passes, or completion
- [ ] Committed on completion
- [ ] If issues: parallel debug agents diagnose root causes
- [ ] If issues: gsd-planner creates fix plans (gap_closure mode)
- [ ] If issues: gsd-plan-checker verifies fix plans
- [ ] If issues: revision loop until plans pass (max 3 iterations)
- [ ] Ready for `/gsd-execute-phase --gaps-only` when complete