<gsd-version v="1.12.4" />

<gsd-arguments>
<settings><keep-extra-args /></settings>
<arg name="phase" type="number" />
</gsd-arguments>

<gsd-execute>
<shell command="pi-gsd-tools">
<args>
<arg string="init" />
<arg string="phase-op" />
</args>
<outs>
<out type="string" name="init" />
</outs>
</shell>
<if>
<condition>
<starts-with>
<left name="init" />
<right type="string" value="@file:" />
</starts-with>
</condition>
<then>
<string-op op="split">
<args>
<arg name="init" />
<arg type="string" value="@file:" />
</args>
<outs>
<out type="string" name="init-file" />
</outs>
</string-op>
<shell command="cat">
<args>
<arg name="init-file" wrap='"' />
</args>
<outs>
<out type="string" name="init" />
</outs>
</shell>
</then>
</if>
<shell command="pi-gsd-tools">
<args>
<arg string="roadmap" />
<arg string="get-phase" />
</args>
<outs>
<out type="string" name="roadmap-phase" />
</outs>
</shell>
<shell command="pi-gsd-tools">
<args>
<arg string="state" />
<arg string="json" />
<arg string="--raw" />
</args>
<outs>
<out type="string" name="state" />
</outs>
</shell>
</gsd-execute>

## Context (pre-injected)

**Phase:** <gsd-paste name="phase" />

**Phase Data:**
<gsd-paste name="init" />

**Roadmap:**
<gsd-paste name="roadmap-phase" />

<purpose>
Verify phase goal achievement through goal-backward analysis. Check that the codebase delivers what the phase promised, not merely that its tasks were marked complete.

Executed by a verification subagent spawned from execute-phase.md.
</purpose>

<core_principle>
**Task completion ≠ Goal achievement**

A task "create chat component" can be marked complete even when the component is only a placeholder. The task was done, but the goal "working chat interface" was not achieved.

Goal-backward verification:
1. What must be TRUE for the goal to be achieved?
2. What must EXIST for those truths to hold?
3. What must be WIRED for those artifacts to function?

Then verify each level against the actual codebase.
</core_principle>

<required_reading>
@.pi/gsd/references/verification-patterns.md
@.pi/gsd/templates/verification-report.md
</required_reading>

<process>

<step name="load_context" priority="first">
Load phase operation context:

<!-- Context pre-injected above via WXP - variables available via <gsd-paste name="..."> -->

Extract from init JSON: `phase_dir`, `phase_number`, `phase_name`, `has_plans`, `plan_count`.

Then load phase details and list plans/summaries:
```bash
pi-gsd-tools roadmap get-phase "${phase_number}"
grep -E "^| ${phase_number}" .planning/REQUIREMENTS.md 2>/dev/null || true
ls "$phase_dir"/*-SUMMARY.md "$phase_dir"/*-PLAN.md 2>/dev/null || true
```

Extract **phase goal** from ROADMAP.md (the outcome to verify, not tasks) and **requirements** from REQUIREMENTS.md if it exists.
</step>

<step name="establish_must_haves">
**Option A: Must-haves in PLAN frontmatter**

Use `pi-gsd-tools` to extract must_haves from each PLAN:

```bash
for plan in "$PHASE_DIR"/*-PLAN.md; do
  MUST_HAVES=$(pi-gsd-tools frontmatter get "$plan" --field must_haves)
  echo "=== $plan ===" && echo "$MUST_HAVES"
done
```

Returns JSON: `{ truths: [...], artifacts: [...], key_links: [...] }`

Aggregate all must_haves across plans for phase-level verification.

**Option B: Use Success Criteria from ROADMAP.md**

If there are no must_haves in frontmatter (the command fails or `MUST_HAVES` is empty), check for Success Criteria:

```bash
PHASE_DATA=$(pi-gsd-tools roadmap get-phase "${phase_number}" --raw)
```

Parse the `success_criteria` array from the JSON output. If non-empty:
1. Use each Success Criterion directly as a **truth** (they are already written as observable, testable behaviors)
2. Derive **artifacts** (concrete file paths for each truth)
3. Derive **key links** (critical wiring where stubs hide)
4. Document the must-haves before proceeding

Success Criteria from ROADMAP.md are the contract - they override PLAN-level must_haves when both exist.

**Option C: Derive from phase goal (fallback)**

If there are no must_haves in frontmatter AND no Success Criteria in ROADMAP:
1. State the goal from ROADMAP.md
2. Derive **truths** (3-7 observable behaviors, each testable)
3. Derive **artifacts** (concrete file paths for each truth)
4. Derive **key links** (critical wiring where stubs hide)
5. Document derived must-haves before proceeding
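
The choice between the three options can be sketched as a small selector. This is only an illustration: `is_blank_json` and `must_have_source` are hypothetical helper names, the two arguments are assumed to hold the raw JSON strings captured above, and the precedence follows the statement that ROADMAP Success Criteria override PLAN-level must_haves when both exist.

```bash
# Hypothetical helper: succeeds when a JSON fragment carries no data.
is_blank_json() {
  case "$1" in
    ""|"[]"|"{}"|"null") return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: pick where the must-haves come from.
#   $1 = must_haves JSON from PLAN frontmatter (may be empty)
#   $2 = success_criteria JSON array from the roadmap (may be empty)
must_have_source() {
  local plan_must_haves="$1" success_criteria="$2"
  if ! is_blank_json "$success_criteria"; then
    echo "roadmap_success_criteria"   # Option B: the roadmap is the contract
  elif ! is_blank_json "$plan_must_haves"; then
    echo "plan_frontmatter"           # Option A
  else
    echo "derived_from_goal"          # Option C: fallback
  fi
}
```

For example, `must_have_source "$MUST_HAVES" "[]"` would select `plan_frontmatter` whenever `MUST_HAVES` is non-empty.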
</step>

<step name="verify_truths">
For each observable truth, determine whether the codebase enables it.

**Status:** ✓ VERIFIED (all supporting artifacts pass) | ✗ FAILED (artifact missing/stub/unwired) | ? UNCERTAIN (needs human)

For each truth: identify supporting artifacts → check artifact status → check wiring → determine truth status.

**Example:** Truth "User can see existing messages" depends on Chat.tsx (renders), /api/chat GET (provides), Message model (schema). If Chat.tsx is a stub or the API returns hardcoded [] → FAILED. If all exist, are substantive, and are connected → VERIFIED.
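
The roll-up from supporting artifacts to a truth status can be sketched as follows. `truth_status` is a hypothetical helper, not part of `pi-gsd-tools`; it assumes the caller passes one status word per supporting artifact or link, and treating any unrecognized word (for instance PARTIAL) as UNCERTAIN is an assumption of this sketch.

```bash
# Hypothetical helper: derive a truth's status from its dependencies.
# Arguments: one status word per supporting artifact or key link.
truth_status() {
  local status result="VERIFIED"
  if [ "$#" -eq 0 ]; then
    echo "UNCERTAIN"            # nothing to check: defer to a human
    return 0
  fi
  for status in "$@"; do
    case "$status" in
      VERIFIED|WIRED) ;;        # passes, keep checking the rest
      MISSING|STUB|ORPHANED|NOT_WIRED)
        echo "FAILED"           # one broken dependency fails the truth
        return 0 ;;
      *) result="UNCERTAIN" ;;  # assumption: unknown words need a human
    esac
  done
  echo "$result"
}

# Chat.tsx verified, API route is a stub, Message model verified:
truth_status VERIFIED STUB VERIFIED
```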
</step>

<step name="verify_artifacts">
Use `pi-gsd-tools` for artifact verification against must_haves in each PLAN:

```bash
for plan in "$PHASE_DIR"/*-PLAN.md; do
  ARTIFACT_RESULT=$(pi-gsd-tools verify artifacts "$plan")
  echo "=== $plan ===" && echo "$ARTIFACT_RESULT"
done
```

Parse JSON result: `{ all_passed, passed, total, artifacts: [{path, exists, issues, passed}] }`

**Artifact status from result:**
- `exists=false` → MISSING
- `issues` not empty → STUB (check issues for "Only N lines" or "Missing pattern")
- `passed=true` → VERIFIED (Levels 1-2, exists and substantive, pass)

**Level 3 - Wired (manual check for artifacts that pass Levels 1-2):**
```bash
grep -r "import.*$artifact_name" src/ --include="*.ts" --include="*.tsx"  # IMPORTED
grep -r "$artifact_name" src/ --include="*.ts" --include="*.tsx" | grep -v "import"  # USED
```
WIRED = imported AND used. ORPHANED = exists but not imported/used.
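
The two greps above can be wrapped into a single check. This is a rough sketch: `wiring_status` is a hypothetical name, and the search directory argument (defaulting to `src`) is an assumption about project layout.

```bash
# Hypothetical helper: classify an artifact as WIRED or ORPHANED.
#   $1 = artifact name (e.g. Chat)   $2 = directory to search (default: src)
wiring_status() {
  local name="$1" dir="${2:-src}"

  # IMPORTED: at least one import line mentions the artifact.
  if ! grep -rq --include="*.ts" --include="*.tsx" "import.*$name" "$dir" 2>/dev/null; then
    echo "ORPHANED"
    return 0
  fi

  # USED: at least one non-import line mentions the artifact.
  if grep -r --include="*.ts" --include="*.tsx" "$name" "$dir" 2>/dev/null |
       grep -qv "import"; then
    echo "WIRED"
  else
    echo "ORPHANED"
  fi
}

# Usage: wiring_status Chat src
```

Like the original greps, this counts the defining file's own declaration as a use, so a WIRED result is a hint to confirm rather than proof.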

| Exists | Substantive | Wired | Status |
| ------ | ----------- | ----- | ---------- |
| ✓ | ✓ | ✓ | ✓ VERIFIED |
| ✓ | ✓ | ✗ | ⚠️ ORPHANED |
| ✓ | ✗ | - | ✗ STUB |
| ✗ | - | - | ✗ MISSING |
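
Read top to bottom, the table is a first-failure-wins decision. A minimal sketch, assuming each level has already been reduced to the string `true` or `false` (`artifact_status` is a hypothetical helper name):

```bash
# Hypothetical helper: map the three verification levels to a status.
#   $1 = exists, $2 = substantive, $3 = wired (each "true" or "false")
artifact_status() {
  local exists="$1" substantive="$2" wired="$3"
  if [ "$exists" != "true" ]; then
    echo "MISSING"      # Level 1 failed; later levels are not evaluated
  elif [ "$substantive" != "true" ]; then
    echo "STUB"         # Level 2 failed
  elif [ "$wired" != "true" ]; then
    echo "ORPHANED"     # Level 3 failed
  else
    echo "VERIFIED"
  fi
}

artifact_status true true false   # prints ORPHANED
```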

**Export-level spot check (WARNING severity):**

For artifacts that pass Level 3, spot-check individual exports:
- Extract key exported symbols (functions, constants, classes - skip types/interfaces)
- For each, grep for usage outside the defining file
- Flag exports with zero external call sites as "exported but unused"

This catches dead code such as `setPlan()`, which exists in a wired file but is
never actually called. Report as WARNING - it may indicate incomplete cross-plan
wiring or leftover code from plan revisions.
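
One way to approximate this spot check with grep alone is sketched below. `unused_exports` is a hypothetical helper; it only recognizes named `export function`, `export const`, and `export class` declarations (default exports and re-exports are not covered), and it assumes the file path is given in the same form grep prints when searching the directory.

```bash
# Hypothetical helper: list exports that nothing outside the file references.
#   $1 = path to the defining file   $2 = directory to search
unused_exports() {
  local file="$1" dir="$2" symbol
  # Pull out exported function/const/class names; types and interfaces are skipped.
  grep -oE 'export (async )?(function|const|class) [A-Za-z_$][A-Za-z0-9_$]*' "$file" |
    awk '{ print $NF }' |
    while read -r symbol; do
      # List files mentioning the symbol, drop the defining file, see what is left.
      if ! grep -rlwF --include="*.ts" --include="*.tsx" -- "$symbol" "$dir" |
           grep -vxF -- "$file" | grep -q .; then
        echo "WARNING: $symbol exported but unused"
      fi
    done
}

# Usage: unused_exports src/store/plan.ts src
```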
</step>

<step name="verify_wiring">
Use `pi-gsd-tools` for key link verification against must_haves in each PLAN:

```bash
for plan in "$PHASE_DIR"/*-PLAN.md; do
  LINKS_RESULT=$(pi-gsd-tools verify key-links "$plan")
  echo "=== $plan ===" && echo "$LINKS_RESULT"
done
```

Parse JSON result: `{ all_verified, verified, total, links: [{from, to, via, verified, detail}] }`

**Link status from result:**
- `verified=true` → WIRED
- `verified=false` with "Pattern not found" → PARTIAL (check this first, since the message also contains "not found")
- `verified=false` with any other "not found" → NOT_WIRED
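
Because "Pattern not found" also contains the text "not found", the order of the checks matters. A minimal sketch, assuming the message arrives in the `detail` field (`link_status` is a hypothetical helper, and treating any other unverified link as NOT_WIRED is a conservative assumption):

```bash
# Hypothetical helper: map one link result to a status.
#   $1 = verified ("true" or "false")   $2 = detail message
link_status() {
  local verified="$1" detail="$2"
  if [ "$verified" = "true" ]; then
    echo "WIRED"
    return 0
  fi
  case "$detail" in
    *"Pattern not found"*) echo "PARTIAL" ;;   # must be tested before the generic case
    *)                     echo "NOT_WIRED" ;; # "not found" and, by assumption, anything else
  esac
}

link_status false "Pattern not found in src/components/Chat.tsx"   # prints PARTIAL
```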

**Fallback patterns (if key_links not in must_haves):**

| Pattern | Check | Status |
| --------------- | -------------------------------------------------------------------------------------- | ------------------------------------------------------ |
| Component → API | fetch/axios call to API path, response used (await/.then/setState) | WIRED / PARTIAL (call but unused response) / NOT_WIRED |
| API → Database | Prisma/DB query on model, result returned via res.json() | WIRED / PARTIAL (query but not returned) / NOT_WIRED |
| Form → Handler | onSubmit with real implementation (fetch/axios/mutate/dispatch), not console.log/empty | WIRED / STUB (log-only/empty) / NOT_WIRED |
| State → Render | useState variable appears in JSX (`{stateVar}` or `{stateVar.property}`) | WIRED / NOT_WIRED |

Record status and evidence for each key link.
</step>

<step name="verify_requirements">
If REQUIREMENTS.md exists:
```bash
grep -E "Phase ${PHASE_NUM}" .planning/REQUIREMENTS.md 2>/dev/null || true
```

For each requirement: parse description → identify supporting truths/artifacts → status: ✓ SATISFIED / ✗ BLOCKED / ? NEEDS HUMAN.
</step>

<step name="scan_antipatterns">
Extract files modified in this phase from SUMMARY.md, scan each:

| Pattern | Search | Severity |
| ------------------- | ------------------------------------------------------------- | --------- |
| TODO/FIXME/XXX/HACK | `grep -n -E "TODO\|FIXME\|XXX\|HACK"` | ⚠️ Warning |
| Placeholder content | `grep -n -iE "placeholder\|coming soon\|will be here"` | 🛑 Blocker |
| Empty returns | `grep -n -E "return null\|return \{\}\|return \[\]\|=> \{\}"` | ⚠️ Warning |
| Log-only functions | Functions containing only console.log | ⚠️ Warning |

Categorize: 🛑 Blocker (prevents goal) | ⚠️ Warning (incomplete) | ℹ️ Info (notable).
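
The three grep-based rows can be run in one pass over the modified files. This is a sketch only: `scan_antipatterns` and `report` are hypothetical helper names, the output format is invented for illustration, and the log-only-function row is left out because it needs more than a line-level grep.

```bash
# Hypothetical helper: prefix each grep hit on stdin with severity and file.
report() {
  local hit
  while IFS= read -r hit; do
    printf '%s %s:%s\n' "$1" "$2" "$hit"
  done
}

# Hypothetical helper: scan the given files with the table's patterns.
scan_antipatterns() {
  local file
  for file in "$@"; do
    [ -f "$file" ] || continue
    grep -n -E 'TODO|FIXME|XXX|HACK' "$file" | report WARNING "$file"
    grep -n -iE 'placeholder|coming soon|will be here' "$file" | report BLOCKER "$file"
    grep -n -E 'return null|return \{\}|return \[\]|=> \{\}' "$file" | report WARNING "$file"
  done
}

# Usage: scan_antipatterns src/components/Chat.tsx src/api/chat.ts
```

Each output line has the form `SEVERITY path:line:text`, so blockers can be counted with `grep -c '^BLOCKER'`.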
</step>

<step name="identify_human_verification">
**Always needs human:** Visual appearance, user flow completion, real-time behavior (WebSocket/SSE), external service integration, performance feel, error message clarity.

**Needs human if uncertain:** Complex wiring grep can't trace, dynamic state-dependent behavior, edge cases.

Format each as: Test Name → What to do → Expected result → Why it can't be verified programmatically.
</step>

<step name="determine_status">
**passed:** All truths VERIFIED, all artifacts pass levels 1-3, all key links WIRED, no blocker anti-patterns.

**gaps_found:** Any truth FAILED, artifact MISSING/STUB, key link NOT_WIRED, or blocker found.

**human_needed:** All automated checks pass but human verification items remain.

**Score:** `verified_truths / total_truths`
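
The three outcomes form a priority order: any automated gap wins, then outstanding human items, then `passed`. A minimal sketch, assuming the counters below were tallied during the earlier steps (`determine_status`, `score`, and the argument order are hypothetical):

```bash
# Hypothetical helper: derive the overall status from five counters.
#   $1 failed truths, $2 MISSING/STUB artifacts, $3 NOT_WIRED links,
#   $4 blocker anti-patterns, $5 items needing human verification
determine_status() {
  local failed_truths="$1" artifact_gaps="$2" unwired_links="$3"
  local blockers="$4" human_items="$5"
  local gaps=$(( failed_truths + artifact_gaps + unwired_links + blockers ))

  if [ "$gaps" -gt 0 ]; then
    echo "gaps_found"       # any automated failure takes precedence
  elif [ "$human_items" -gt 0 ]; then
    echo "human_needed"     # automated checks pass, manual items remain
  else
    echo "passed"
  fi
}

# Score as verified_truths / total_truths.
score() {
  echo "$1/$2"
}

echo "$(determine_status 1 0 0 0 2) $(score 4 5)"
```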
</step>

<step name="generate_fix_plans">
If gaps_found:

1. **Cluster related gaps:** API stub + component unwired → "Wire frontend to backend". Multiple missing → "Complete core implementation". Wiring only → "Connect existing components".

2. **Generate plan per cluster:** Objective, 2-3 tasks (files/action/verify each), re-verify step. Keep focused: single concern per plan.

3. **Order by dependency:** Fix missing → fix stubs → fix wiring → verify.
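
The dependency ordering can be sketched as a stable sort over gap records. The `KIND description` line format and the `order_gaps` name are assumptions made for this example, not something the tooling emits.

```bash
# Hypothetical helper: read "KIND description" lines on stdin and print them
# in fix order: MISSING first, then STUB, then NOT_WIRED, then anything else.
order_gaps() {
  local kind rest rank
  while read -r kind rest; do
    case "$kind" in
      MISSING)   rank=1 ;;
      STUB)      rank=2 ;;
      NOT_WIRED) rank=3 ;;
      *)         rank=4 ;;
    esac
    printf '%s %s %s\n' "$rank" "$kind" "$rest"
  done |
    sort -s -n -k1,1 |   # stable: gaps of the same kind keep their input order
    cut -d' ' -f2-       # drop the temporary rank column
}

printf '%s\n' \
  "NOT_WIRED Chat.tsx to /api/chat" \
  "STUB src/api/chat.ts" \
  "MISSING src/models/Message.ts" | order_gaps
```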
</step>

<step name="create_report">
```bash
REPORT_PATH="$PHASE_DIR/${PHASE_NUM}-VERIFICATION.md"
```

Fill template sections: frontmatter (phase/timestamp/status/score), goal achievement, artifact table, wiring table, requirements coverage, anti-patterns, human verification, gaps summary, fix plans (if gaps_found), metadata.

See .pi/gsd/templates/verification-report.md for the complete template.
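
As a rough illustration of the frontmatter portion only, the sketch below writes the four fields named above. The key spellings and the `write_report_header` helper are assumptions; the template file remains the authority on the real layout.

```bash
# Hypothetical helper: start a report file with YAML frontmatter.
#   $1 = report path, $2 = phase, $3 = status, $4 = score (e.g. 4/5)
write_report_header() {
  local report_path="$1" phase="$2" status="$3" score="$4"
  # Key names below are assumed, not taken from the template.
  cat > "$report_path" <<EOF
---
phase: $phase
timestamp: $(date -u +%Y-%m-%dT%H:%M:%SZ)
status: $status
score: $score
---
EOF
}

# Usage: write_report_header "$REPORT_PATH" "$PHASE_NUM" gaps_found 4/5
```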
</step>

<step name="return_to_orchestrator">
Return status (`passed` | `gaps_found` | `human_needed`), score (N/M must-haves), report path.

If gaps_found: list gaps + recommended fix plan names.
If human_needed: list items requiring human testing.

Orchestrator routes: `passed` → update_roadmap | `gaps_found` → create/execute fixes, re-verify | `human_needed` → present to user.
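
On the orchestrator side, the routing amounts to a three-way dispatch. In this sketch only `update_roadmap` comes from the text above; `route_status`, `create_fix_plans`, and `present_to_user` are placeholder names for illustration.

```bash
# Hypothetical helper: map a verification status to the next action.
route_status() {
  case "$1" in
    passed)       echo "update_roadmap" ;;
    gaps_found)   echo "create_fix_plans" ;;   # then execute fixes and re-verify
    human_needed) echo "present_to_user" ;;
    *)
      echo "unknown status: $1" >&2            # reject anything outside the contract
      return 1 ;;
  esac
}

route_status gaps_found   # prints create_fix_plans
```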
</step>

</process>

<success_criteria>
- [ ] Must-haves established (from PLAN frontmatter, ROADMAP Success Criteria, or derived from the phase goal)
- [ ] All truths verified with status and evidence
- [ ] All artifacts checked at all three levels
- [ ] All key links verified
- [ ] Requirements coverage assessed (if applicable)
- [ ] Anti-patterns scanned and categorized
- [ ] Human verification items identified
- [ ] Overall status determined
- [ ] Fix plans generated (if gaps_found)
- [ ] VERIFICATION.md created with complete report
- [ ] Results returned to orchestrator
</success_criteria>