<gsd-version v="1.12.4" />

<gsd-arguments>
  <settings>
    <keep-extra-args />
  </settings>
  <arg name="phase" type="number" optional />
  <arg name="plan" type="flag" flag="--plan" optional />
</gsd-arguments>

<gsd-execute>
  <if>
    <condition>
      <equals>
        <left name="phase" />
        <right type="string" value="" />
      </equals>
    </condition>
    <else>
      <shell command="pi-gsd-tools">
        <args>
          <arg string="init" />
          <arg string="verify-work" />
          <arg name="phase" wrap='"' />
        </args>
        <outs>
          <out type="string" name="init" />
        </outs>
      </shell>
      <if>
        <condition>
          <starts-with>
            <left name="init" />
            <right type="string" value="@file:" />
          </starts-with>
        </condition>
        <then>
          <string-op op="split">
            <args>
              <arg name="init" />
              <arg type="string" value="@file:" />
            </args>
            <outs>
              <out type="string" name="init-file" />
            </outs>
          </string-op>
          <shell command="cat">
            <args>
              <arg name="init-file" wrap='"' />
            </args>
            <outs>
              <out type="string" name="init" />
            </outs>
          </shell>
        </then>
      </if>
      <shell command="pi-gsd-tools">
        <args>
          <arg string="agent-skills" />
          <arg string="gsd-planner" />
        </args>
        <outs>
          <suppress-errors />
          <out type="string" name="agent-skills-planner" />
        </outs>
      </shell>
      <shell command="pi-gsd-tools">
        <args>
          <arg string="agent-skills" />
          <arg string="gsd-plan-checker" />
        </args>
        <outs>
          <suppress-errors />
          <out type="string" name="agent-skills-checker" />
        </outs>
      </shell>
    </else>
  </if>
</gsd-execute>

## Initialization Context (pre-injected by WXP)

**Phase:** <gsd-paste name="phase" />

**Phase Init Data:**
<gsd-paste name="init" />

<process>

<step name="initialize" priority="first">
If $ARGUMENTS contains a phase number, load context:

<!-- Context pre-injected above via WXP - variables available via <gsd-paste name="..."> -->

Parse JSON for: `planner_model`, `checker_model`, `commit_docs`, `phase_found`, `phase_dir`, `phase_number`, `phase_name`, `has_verification`, `uat_path`.
</step>

<step name="check_active_session">
**First: Check for active UAT sessions**

```bash
(find .planning/phases -name "*-UAT.md" -type f 2>/dev/null || true) | head -5
```

**If active sessions exist AND no $ARGUMENTS provided:**

Read each file's frontmatter (status, phase) and Current Test section.
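
One way to read those frontmatter fields from the shell is sketched below. It is illustrative only: `frontmatter_field` and `list_uat_sessions` are hypothetical helper names, not part of `pi-gsd-tools`, and the sketch assumes each UAT file opens with a `---` delimited block of `key: value` lines, as in the template written by `create_uat_file`.

```bash
# Hypothetical helper: print the value of one frontmatter key from a UAT file.
# Assumes the file starts with a "---" line and the block ends at the next "---".
frontmatter_field() {
  # $1 = file path, $2 = key name (for example: status, phase)
  awk -v key="$2" '
    NR == 1 && $0 == "---" { in_fm = 1; next }   # opening delimiter
    in_fm && $0 == "---"   { exit }              # closing delimiter: stop reading
    in_fm {
      split_at = index($0, ":")
      if (split_at > 0 && substr($0, 1, split_at - 1) == key) {
        value = substr($0, split_at + 1)
        sub(/^[ \t]+/, "", value)                # trim leading whitespace
        print value
        exit
      }
    }
  ' "$1"
}

# Hypothetical helper: print path, phase and status for each UAT file found.
list_uat_sessions() {
  find .planning/phases -name "*-UAT.md" -type f 2>/dev/null | head -5 |
  while IFS= read -r uat_file; do
    printf '%s\t%s\t%s\n' \
      "$uat_file" \
      "$(frontmatter_field "$uat_file" phase)" \
      "$(frontmatter_field "$uat_file" status)"
  done
}
```

The Current Test section still has to be read separately; this sketch covers only the frontmatter.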

Display inline:

```
## Active UAT Sessions

| #   | Phase       | Status  | Current Test        | Progress |
| --- | ----------- | ------- | ------------------- | -------- |
| 1   | 04-comments | testing | 3. Reply to Comment | 2/6      |
| 2   | 05-auth     | testing | 1. Login Form       | 0/4      |

Reply with a number to resume, or provide a phase number to start new.
```

Wait for user response.

- If user replies with a number (1, 2) → Load that file, go to `resume_from_file`
- If user replies with a phase number → Treat as new session, go to `create_uat_file`

**If active sessions exist AND $ARGUMENTS provided:**

Check if a session exists for that phase. If yes, offer to resume or restart.
If no, continue to `create_uat_file`.

**If no active sessions AND no $ARGUMENTS:**

```
No active UAT sessions.

Provide a phase number to start testing (e.g., /gsd-verify-work 4)
```

**If no active sessions AND $ARGUMENTS provided:**

Continue to `create_uat_file`.
</step>

<step name="find_summaries">
**Find what to test:**

Use `phase_dir` from init (or run init if not already done).

```bash
ls "$phase_dir"/*-SUMMARY.md 2>/dev/null || true
```

Read each SUMMARY.md to extract testable deliverables.
</step>

<step name="extract_tests">
**Extract testable deliverables from SUMMARY.md:**

Parse for:
1. **Accomplishments** - Features/functionality added
2. **User-facing changes** - UI, workflows, interactions

Focus on USER-OBSERVABLE outcomes, not implementation details.

For each deliverable, create a test:
- name: Brief test name
- expected: What the user should see/experience (specific, observable)

Examples:
- Accomplishment: "Added comment threading with infinite nesting"
  → Test: "Reply to a Comment"
  → Expected: "Clicking Reply opens inline composer below comment. Submitting shows reply nested under parent with visual indentation."

Skip internal/non-observable items (refactors, type changes, etc.).

**Cold-start smoke test injection:**

After extracting tests from SUMMARYs, scan the SUMMARY files for modified/created file paths. If ANY path matches these patterns:

`server.ts`, `server.js`, `app.ts`, `app.js`, `index.ts`, `index.js`, `main.ts`, `main.js`, `database/*`, `db/*`, `seed/*`, `seeds/*`, `migrations/*`, `startup*`, `docker-compose*`, `Dockerfile*`

Then **prepend** this test to the test list:

- name: "Cold Start Smoke Test"
- expected: "Kill any running server/service. Clear ephemeral state (temp DBs, caches, lock files). Start the application from scratch. Server boots without errors, any seed/migration completes, and a primary query (health check, homepage load, or basic API call) returns live data."

This catches bugs that only manifest on fresh start - race conditions in startup sequences, silent seed failures, missing environment setup - which pass against warm state but break in production.
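
The pattern check for the cold-start injection could be sketched as a small shell function. The interpretation is an assumption of this sketch: file-name patterns such as `server.ts` or `Dockerfile*` are matched against the last path component, and directory patterns such as `db/*` are matched against any directory segment of the path. `needs_cold_start` is a hypothetical name.

```bash
# Return 0 when a modified/created path should trigger the Cold Start Smoke Test.
# Assumption: file-name patterns apply to the basename, directory patterns
# (database/*, db/*, seed/*, seeds/*, migrations/*) to any directory segment.
needs_cold_start() {
  # Match on the last path component.
  case "${1##*/}" in
    server.ts|server.js|app.ts|app.js|index.ts|index.js|main.ts|main.js) return 0 ;;
    startup*|docker-compose*|Dockerfile*) return 0 ;;
  esac
  # Match on directory segments; the leading "/" lets top-level dirs match too.
  case "/$1" in
    */database/*|*/db/*|*/seed/*|*/seeds/*|*/migrations/*) return 0 ;;
  esac
  return 1
}
```

A caller would run this over every path listed in the SUMMARY files and prepend the smoke test as soon as one path matches.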
</step>

<step name="create_uat_file">
**Create UAT file with all tests:**

```bash
mkdir -p "$phase_dir"
```

Build test list from extracted deliverables.

Create file:

```markdown
---
status: testing
phase: XX-name
source: [list of SUMMARY.md files]
started: [ISO timestamp]
updated: [ISO timestamp]
---

## Current Test
<!-- OVERWRITE each test - shows where we are -->

number: 1
name: [first test name]
expected: |
  [what user should observe]
awaiting: user response

## Tests

### 1. [Test Name]
expected: [observable behavior]
result: [pending]

### 2. [Test Name]
expected: [observable behavior]
result: [pending]

...

## Summary

total: [N]
passed: 0
issues: 0
pending: [N]
skipped: 0

## Gaps

[none yet]
```

Write to `.planning/phases/XX-name/{phase_num}-UAT.md`

Proceed to `present_test`.
</step>

<step name="present_test">
**Present current test to user:**

Render the checkpoint from the structured UAT file instead of composing it freehand:

```bash
CHECKPOINT=$(pi-gsd-tools uat render-checkpoint --file "$uat_path" --raw)
if [[ "$CHECKPOINT" == @file:* ]]; then CHECKPOINT=$(cat "${CHECKPOINT#@file:}"); fi
```

Display the returned checkpoint EXACTLY as-is:

```
{CHECKPOINT}
```

**Critical response hygiene:**
- Your entire response MUST equal `{CHECKPOINT}` byte-for-byte.
- Do NOT add commentary before or after the block.
- If you notice protocol/meta markers such as `to=all:`, role-routing text, XML system tags, hidden instruction markers, ad copy, or any unrelated suffix, discard the draft and output `{CHECKPOINT}` only.

Wait for user response (plain text, no AskUserQuestion).
</step>

<step name="process_response">
**Process user response and update file:**

**If response indicates pass:**
- Empty response, "yes", "y", "ok", "pass", "next", "approved", "✓"

Update Tests section:
```
### {N}. {name}
expected: {expected}
result: pass
```

**If response indicates skip:**
- "skip", "can't test", "n/a"

Update Tests section:
```
### {N}. {name}
expected: {expected}
result: skipped
reason: [user's reason if provided]
```

**If response indicates blocked:**
- "blocked", "can't test - server not running", "need physical device", "need release build"
- Or any response containing: "server", "blocked", "not running", "physical device", "release build"

Infer blocked_by tag from response:
- Contains: server, not running, gateway, API → `server`
- Contains: physical, device, hardware, real phone → `physical-device`
- Contains: release, preview, build, EAS → `release-build`
- Contains: stripe, twilio, third-party, configure → `third-party`
- Contains: depends on, prior phase, prerequisite → `prior-phase`
- Default: `other`

Update Tests section:
```
### {N}. {name}
expected: {expected}
result: blocked
blocked_by: {inferred tag}
reason: "{verbatim user response}"
```

Note: Blocked tests do NOT go into the Gaps section (they aren't code issues - they're prerequisite gates).
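
The `blocked_by` rules above can be sketched as a first-match `case` statement. `infer_blocked_by` is a hypothetical name. Case-insensitive matching, first-rule-wins ordering, and treating the short keywords `API` and `EAS` as whole words (so that "please" does not match `EAS`) are all assumptions of this sketch, since the rules do not say how overlaps are resolved.

```bash
# Map a blocked response to a blocked_by tag. First matching rule wins,
# in the order the rules are listed in this step.
infer_blocked_by() {
  # Lowercase the response and turn punctuation into spaces, then pad with
  # spaces so " api " and " eas " only match as whole words.
  case " $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9-' ' ') " in
    *server*|*"not running"*|*gateway*|*" api "*)     echo "server" ;;
    *physical*|*device*|*hardware*|*"real phone"*)    echo "physical-device" ;;
    *release*|*preview*|*build*|*" eas "*)            echo "release-build" ;;
    *stripe*|*twilio*|*third-party*|*configure*)      echo "third-party" ;;
    *"depends on"*|*"prior phase"*|*prerequisite*)    echo "prior-phase" ;;
    *)                                                echo "other" ;;
  esac
}
```

For example, `infer_blocked_by "can't test - server not running"` selects the first rule, while a response that mentions none of the keywords falls through to `other`.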

**If response is anything else:**
- Treat as issue description

Infer severity from description:
- Contains: crash, error, exception, fails, broken, unusable → blocker
- Contains: doesn't work, wrong, missing, can't → major
- Contains: slow, weird, off, minor, small → minor
- Contains: color, font, spacing, alignment, visual → cosmetic
- Default if unclear: major
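
As a rough illustration, the keyword rules above map onto a shell `case` statement. `infer_severity` is a hypothetical name, and two behaviours are assumptions of this sketch: rules are tried in the listed order with the first match winning, and matching is a plain case-insensitive substring test, so a short keyword like `off` also matches inside `offline`.

```bash
# Infer issue severity from the user's description.
# Rules are checked top to bottom; anything unmatched defaults to "major".
infer_severity() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *crash*|*error*|*exception*|*fails*|*broken*|*unusable*) echo "blocker" ;;
    *"doesn't work"*|*wrong*|*missing*|*"can't"*)            echo "major" ;;
    *slow*|*weird*|*off*|*minor*|*small*)                    echo "minor" ;;
    *color*|*font*|*spacing*|*alignment*|*visual*)           echo "cosmetic" ;;
    *)                                                       echo "major" ;;
  esac
}
```

Because earlier rules win, a report such as "the button color is wrong" is classified as major rather than cosmetic under this ordering.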

Update Tests section:
```
### {N}. {name}
expected: {expected}
result: issue
reported: "{verbatim user response}"
severity: {inferred}
```

Append to Gaps section (structured YAML for plan-phase --gaps):
```yaml
- truth: "{expected behavior from test}"
  status: failed
  reason: "User reported: {verbatim user response}"
  severity: {inferred}
  test: {N}
  artifacts: [] # Filled by diagnosis
  missing: [] # Filled by diagnosis
```

**After any response:**

Update Summary counts.
Update frontmatter.updated timestamp.

If more tests remain → Update Current Test, go to `present_test`
If no more tests → Go to `complete_session`
</step>

<step name="resume_from_file">
**Resume testing from UAT file:**

Read the full UAT file.

Find first test with `result: [pending]`.

Announce:
```
Resuming: Phase {phase} UAT
Progress: {passed + issues + skipped}/{total}
Issues found so far: {issues count}

Continuing from Test {N}...
```

Update Current Test section with the pending test.
Proceed to `present_test`.
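
Locating the resume point and computing the progress line could look roughly like this. The helper names (`first_pending_test`, `count_results`, `resume_progress`) are hypothetical, and the sketch assumes the Tests section uses `### N. Name` headings followed by a `result:` line at the start of a line, as in the UAT template.

```bash
# Print the number of the first test whose result is still "[pending]".
first_pending_test() {
  awk '
    /^### [0-9]+\. / { current = $2; sub(/\.$/, "", current) }  # remember test number
    /^result: \[pending\]/ && current != "" { print current; exit }
  ' "$1"
}

# Count tests with a plain result value (pass, issue, skipped, blocked).
# grep -c exits non-zero on a zero count, hence the "|| true".
count_results() {
  grep -c "^result: $2\$" "$1" || true
}

# Build the "Progress: done/total" line used in the resume announcement.
resume_progress() {
  passed=$(count_results "$1" pass)
  issues=$(count_results "$1" issue)
  skipped=$(count_results "$1" skipped)
  total=$(grep -c '^### [0-9][0-9]*\. ' "$1" || true)
  echo "Progress: $((passed + issues + skipped))/$total"
}
```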
</step>

<step name="complete_session">
**Complete testing and commit:**

**Determine final status:**

Count results:
- `pending_count`: tests with `result: [pending]`
- `blocked_count`: tests with `result: blocked`
- `skipped_no_reason`: tests with `result: skipped` and no `reason` field

```
if pending_count > 0 OR blocked_count > 0 OR skipped_no_reason > 0:
  status: partial
  # Session ended but not all tests resolved
else:
  status: complete
  # All tests have a definitive result (pass, issue, or skipped-with-reason)
```
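
The same decision, written as runnable shell, might look like the sketch below. `count_skipped_no_reason` and `final_status` are hypothetical helper names; the counting helper assumes each test starts at a `###` heading and that a skipped test without a reason simply has no `reason:` line.

```bash
# Count tests with "result: skipped" that have no "reason:" line.
count_skipped_no_reason() {
  awk '
    # Close the test being read: count it if it was skipped without a reason.
    function close_test() {
      if (skipped && !has_reason) n++
      skipped = 0; has_reason = 0
    }
    /^##/              { close_test() }   # any heading starts a new test or section
    /^result: skipped/ { skipped = 1 }
    /^reason:/         { has_reason = 1 }
    END                { close_test(); print n + 0 }
  ' "$1"
}

# Compute the final frontmatter status.
# $1 = pending_count, $2 = blocked_count, $3 = skipped_no_reason
final_status() {
  if [ "$1" -gt 0 ] || [ "$2" -gt 0 ] || [ "$3" -gt 0 ]; then
    echo "partial"
  else
    echo "complete"
  fi
}
```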

Update frontmatter:
- status: {computed status}
- updated: [now]

Clear Current Test section:
```
## Current Test

[testing complete]
```

Commit the UAT file:
```bash
pi-gsd-tools commit "test({phase_num}): complete UAT - {passed} passed, {issues} issues" --files ".planning/phases/XX-name/{phase_num}-UAT.md"
```

Present summary:
```
## UAT Complete: Phase {phase}

| Result  | Count |
| ------- | ----- |
| Passed  | {N}   |
| Issues  | {N}   |
| Skipped | {N}   |

[If issues > 0:]
### Issues Found

[List from Gaps section]
```

**If issues > 0:** Proceed to `diagnose_issues`

**If issues == 0:**
```
All tests passed. Ready to continue.

- `/gsd-plan-phase {next}` - Plan next phase
- `/gsd-execute-phase {next}` - Execute next phase
- `/gsd-ui-review {phase}` - Visual quality audit (if frontend files were modified)
```
</step>

<step name="diagnose_issues">
**Diagnose root causes before planning fixes:**

```
---

{N} issues found. Diagnosing root causes...

Spawning parallel debug agents to investigate each issue.
```

- Load diagnose-issues workflow
- Follow @.pi/gsd/workflows/diagnose-issues.md
- Spawn parallel debug agents for each issue
- Collect root causes
- Update UAT.md with root causes
- Proceed to `plan_gap_closure`

Diagnosis runs automatically - no user prompt. Parallel agents investigate simultaneously, so overhead is minimal and fixes are more accurate.
</step>

<step name="plan_gap_closure">
**Auto-plan fixes from diagnosed gaps:**

Display:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GSD ► PLANNING FIXES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

◆ Spawning planner for gap closure...
```

Spawn gsd-planner in --gaps mode:

```
Task(
  prompt="""
<planning_context>

**Phase:** {phase_number}
**Mode:** gap_closure

<files_to_read>
- {phase_dir}/{phase_num}-UAT.md (UAT with diagnoses)
- .planning/STATE.md (Project State)
- .planning/ROADMAP.md (Roadmap)
</files_to_read>

${AGENT_SKILLS_PLANNER}

</planning_context>

<downstream_consumer>
Output consumed by /gsd-execute-phase
Plans must be executable prompts.
</downstream_consumer>
""",
  subagent_type="gsd-planner",
  model="{planner_model}",
  description="Plan gap fixes for Phase {phase}"
)
```

On return:
- **PLANNING COMPLETE:** Proceed to `verify_gap_plans`
- **PLANNING INCONCLUSIVE:** Report and offer manual intervention
</step>

<step name="verify_gap_plans">
**Verify fix plans with checker:**

Display:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GSD ► VERIFYING FIX PLANS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

◆ Spawning plan checker...
```

Initialize: `iteration_count = 1`

Spawn gsd-plan-checker:

```
Task(
  prompt="""
<verification_context>

**Phase:** {phase_number}
**Phase Goal:** Close diagnosed gaps from UAT

<files_to_read>
- {phase_dir}/*-PLAN.md (Plans to verify)
</files_to_read>

${AGENT_SKILLS_CHECKER}

</verification_context>

<expected_output>
Return one of:
- ## VERIFICATION PASSED - all checks pass
- ## ISSUES FOUND - structured issue list
</expected_output>
""",
  subagent_type="gsd-plan-checker",
  model="{checker_model}",
  description="Verify Phase {phase} fix plans"
)
```

On return:
- **VERIFICATION PASSED:** Proceed to `present_ready`
- **ISSUES FOUND:** Proceed to `revision_loop`
</step>

<step name="revision_loop">
**Iterate planner ↔ checker until plans pass (max 3):**

**If iteration_count < 3:**

Display: `Sending back to planner for revision... (iteration {N}/3)`

Spawn gsd-planner with revision context:

```
Task(
  prompt="""
<revision_context>

**Phase:** {phase_number}
**Mode:** revision

<files_to_read>
- {phase_dir}/*-PLAN.md (Existing plans)
</files_to_read>

${AGENT_SKILLS_PLANNER}

**Checker issues:**
{structured_issues_from_checker}

</revision_context>

<instructions>
Read existing PLAN.md files. Make targeted updates to address checker issues.
Do NOT replan from scratch unless issues are fundamental.
</instructions>
""",
  subagent_type="gsd-planner",
  model="{planner_model}",
  description="Revise Phase {phase} plans"
)
```

After planner returns → spawn checker again (verify_gap_plans logic)
Increment iteration_count

**If iteration_count >= 3:**

Display: `Max iterations reached. {N} issues remain.`

Offer options:
1. Force proceed (execute despite issues)
2. Provide guidance (user gives direction, retry)
3. Abandon (exit, user runs /gsd-plan-phase manually)

Wait for user response.
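
The bound on the planner ↔ checker loop can be sketched in shell terms. This is only a control-flow illustration: in the workflow the planner and checker are `Task(...)` spawns, so `run_planner_revision` and `run_checker` below are hypothetical stand-ins, and `FAKE_REMAINING_FAILURES` exists purely to make the sketch runnable.

```bash
MAX_ITERATIONS=3

# Hypothetical stand-ins for the Task(...) spawns. Each fake revision fixes
# one round of issues; the fake checker passes when none remain.
FAKE_REMAINING_FAILURES=${FAKE_REMAINING_FAILURES:-1}
run_planner_revision() { FAKE_REMAINING_FAILURES=$((FAKE_REMAINING_FAILURES - 1)); }
run_checker() { [ "$FAKE_REMAINING_FAILURES" -le 0 ]; }

# Entered after the first checker run (iteration_count = 1) reported issues.
revision_loop() {
  iteration_count=1
  while [ "$iteration_count" -lt "$MAX_ITERATIONS" ]; do
    echo "Sending back to planner for revision... (iteration $iteration_count/$MAX_ITERATIONS)"
    run_planner_revision
    if run_checker; then
      echo "VERIFICATION PASSED"
      return 0
    fi
    iteration_count=$((iteration_count + 1))
  done
  echo "Max iterations reached."
  return 1   # caller then offers: force proceed, provide guidance, or abandon
}
```

Read this way, a counter that starts at 1 with a limit of 3 allows at most two revision rounds after the initial check before the user is asked how to proceed.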
</step>

<step name="present_ready">
**Present completion and next steps:**

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GSD ► FIXES READY ✓
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

**Phase {X}: {Name}** - {N} gap(s) diagnosed, {M} fix plan(s) created

| Gap       | Root Cause   | Fix Plan   |
| --------- | ------------ | ---------- |
| {truth 1} | {root_cause} | {phase}-04 |
| {truth 2} | {root_cause} | {phase}-04 |

Plans verified and ready for execution.

───────────────────────────────────────────────────────────────

## ▶ Next Up

**Execute fixes** - run fix plans

`/new` then `/gsd-execute-phase {phase} --gaps-only`

───────────────────────────────────────────────────────────────
```
</step>

</process>

<update_rules>
**Batched writes for efficiency:**

Keep results in memory. Write to file only when:
1. **Issue found** - Preserve the problem immediately
2. **Session complete** - Final write before commit
3. **Checkpoint** - Every 5 passed tests (safety net)

| Section             | Rule      | When Written      |
| ------------------- | --------- | ----------------- |
| Frontmatter.status  | OVERWRITE | Start, complete   |
| Frontmatter.updated | OVERWRITE | On any file write |
| Current Test        | OVERWRITE | On any file write |
| Tests.{N}.result    | OVERWRITE | On any file write |
| Summary             | OVERWRITE | On any file write |
| Gaps                | APPEND    | When issue found  |

On context reset: File shows last checkpoint. Resume from there.
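
The three write triggers can be expressed as one small predicate. `should_write` is a hypothetical name, and counting "every 5 passed tests" as passes since the last file write is an assumption of this sketch.

```bash
# Decide whether in-memory results should be flushed to the UAT file.
# $1 = result of the latest test (pass, issue, skipped, blocked)
# $2 = passed tests since the last file write
# $3 = tests still remaining in the session
should_write() {
  [ "$1" = "issue" ] && return 0   # 1. issue found: preserve it immediately
  [ "$3" -eq 0 ] && return 0       # 2. session complete: final write before commit
  [ "$2" -ge 5 ] && return 0       # 3. checkpoint: every 5 passed tests
  return 1
}
```

A caller would reset its pass counter to zero after each write so that the checkpoint rule fires again five passes later.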
</update_rules>

<severity_inference>
**Infer severity from user's natural language:**

| User says                                           | Infer    |
| --------------------------------------------------- | -------- |
| "crashes", "error", "exception", "fails completely" | blocker  |
| "doesn't work", "nothing happens", "wrong behavior" | major    |
| "works but...", "slow", "weird", "minor issue"      | minor    |
| "color", "spacing", "alignment", "looks off"        | cosmetic |

Default to **major** if unclear. User can correct if needed.

**Never ask "how severe is this?"** - just infer and move on.
</severity_inference>

<success_criteria>
- [ ] UAT file created with all tests from SUMMARY.md
- [ ] Tests presented one at a time with expected behavior
- [ ] User responses processed as pass/issue/skip/blocked
- [ ] Severity inferred from description (never asked)
- [ ] Batched writes: on issue, every 5 passes, or completion
- [ ] Committed on completion
- [ ] If issues: parallel debug agents diagnose root causes
- [ ] If issues: gsd-planner creates fix plans (gap_closure mode)
- [ ] If issues: gsd-plan-checker verifies fix plans
- [ ] If issues: revision loop until plans pass (max 3 iterations)
- [ ] Ready for `/gsd-execute-phase {phase} --gaps-only` when complete
</success_criteria>