Files
AGENTS/.pi/gsd/workflows/autonomous.md

698 lines
23 KiB
Markdown
Raw Permalink Normal View History

2026-04-24 20:00:33 +02:00
<gsd-version v="1.12.4" />
<gsd-arguments>
<settings><keep-extra-args /></settings>
<arg name="from" type="number" optional />
<arg name="to" type="number" optional />
</gsd-arguments>
<gsd-execute>
<shell command="pi-gsd-tools">
<args>
<arg string="roadmap" />
<arg string="analyze" />
<arg string="--raw" />
</args>
<outs>
<out type="string" name="roadmap" />
</outs>
</shell>
<shell command="pi-gsd-tools">
<args>
<arg string="state" />
<arg string="json" />
<arg string="--raw" />
</args>
<outs>
<out type="string" name="state" />
</outs>
</shell>
<shell command="pi-gsd-tools">
<args>
<arg string="init" />
<arg string="milestone-op" />
</args>
<outs>
<out type="string" name="milestone-data" />
</outs>
</shell>
<if>
<condition>
<starts-with>
<left name="milestone-data" />
<right type="string" value="@file:" />
</starts-with>
</condition>
<then>
<string-op op="split">
<args>
<arg name="milestone-data" />
<arg type="string" value="@file:" />
</args>
<outs>
<out type="string" name="milestone-data-file" />
</outs>
</string-op>
<shell command="cat">
<args>
<arg name="milestone-data-file" wrap='"' />
</args>
<outs>
<out type="string" name="milestone-data" />
</outs>
</shell>
</then>
</if>
<!-- SET _auto_chain_active when chain starts - cleared by chain completion/pause, not by absence of --auto -->
<shell command="pi-gsd-tools">
<args>
<arg string="config-set" />
<arg string="workflow._auto_chain_active" />
<arg string="true" />
</args>
<outs><suppress-errors /></outs>
</shell>
</gsd-execute>
## Context (pre-injected)
**Roadmap:**
<gsd-paste name="roadmap" />
**State:**
<gsd-paste name="state" />
**Milestone Data:**
<gsd-paste name="milestone-data" />
<purpose>
Drive all remaining milestone phases autonomously. For each incomplete phase: discuss → plan → execute using Skill() flat invocations. Pauses only for explicit user decisions (grey area acceptance, blockers, validation requests). Re-reads ROADMAP.md after each phase to catch dynamically inserted phases.
</purpose>
<required_reading>
Read all files referenced by the invoking prompt's execution_context before starting.
</required_reading>
<process>
<step name="initialize" priority="first">
## 1. Initialize
Parse `$ARGUMENTS` for `--from N` flag:
<!-- Context pre-injected above via WXP - variables available via <gsd-paste name="..."> -->
Parse JSON for: `milestone_version`, `milestone_name`, `phase_count`, `completed_phases`, `roadmap_exists`, `state_exists`, `commit_docs`.
**If `roadmap_exists` is false:** Error - "No ROADMAP.md found. Run `/gsd-new-milestone` first."
**If `state_exists` is false:** Error - "No STATE.md found. Run `/gsd-new-milestone` first."
Display startup banner:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GSD ► AUTONOMOUS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Milestone: {milestone_version} - {milestone_name}
Phases: {phase_count} total, {completed_phases} complete
```
If `FROM_PHASE` is set, display: `Starting from phase ${FROM_PHASE}`
</step>
<step name="discover_phases">
## 2. Discover Phases
Run phase discovery:
<!-- Context pre-injected above via WXP - variables available via <gsd-paste name="..."> -->
Parse `has_context` from JSON.
**If has_context is true:** Skip discuss - context already gathered. Display:
```
Phase ${PHASE_NUM}: Context exists - skipping discuss.
```
Proceed to 3b.
**If has_context is false:** Check if discuss is disabled via settings:
<!-- Context pre-injected above via WXP - variables available via <gsd-paste name="..."> -->
Check `has_context`. If false → go to handle_blocker: "Smart discuss for phase ${PHASE_NUM} did not produce CONTEXT.md."
**3a.5. UI Design Contract (Frontend Phases)**
Check if this phase has frontend indicators and whether a UI-SPEC already exists:
<!-- Context pre-injected above via WXP - variables available via <gsd-paste name="..."> -->
Parse `phase_dir` from the JSON.
**If VERIFY_STATUS is empty** (no VERIFICATION.md or no status field):
Go to handle_blocker: "Execute phase ${PHASE_NUM} did not produce verification results."
**If `passed`:**
Display:
```
Phase ${PHASE_NUM} ✅ ${PHASE_NAME} - Verification passed
```
Proceed to iterate step.
**If `human_needed`:**
Read the human_verification section from VERIFICATION.md to get the count and items requiring manual testing.
Display the items, then ask user via AskUserQuestion:
- **question:** "Phase ${PHASE_NUM} has items needing manual verification. Validate now or continue to next phase?"
- **options:** "Validate now" / "Continue without validation"
On **"Validate now"**: Present the specific items from VERIFICATION.md's human_verification section. After user reviews, ask:
- **question:** "Validation result?"
- **options:** "All good - continue" / "Found issues"
On "All good - continue": Display `Phase ${PHASE_NUM} ✅ Human validation passed` and proceed to iterate step.
On "Found issues": Go to handle_blocker with the user's reported issues as the description.
On **"Continue without validation"**: Display `Phase ${PHASE_NUM} ⏭ Human validation deferred` and proceed to iterate step.
**If `gaps_found`:**
Read gap summary from VERIFICATION.md (score and missing items). Display:
```
⚠ Phase ${PHASE_NUM}: ${PHASE_NAME} - Gaps Found
Score: {N}/{M} must-haves verified
```
Ask user via AskUserQuestion:
- **question:** "Gaps found in phase ${PHASE_NUM}. How to proceed?"
- **options:** "Run gap closure" / "Continue without fixing" / "Stop autonomous mode"
On **"Run gap closure"**: Execute gap closure cycle (limit: 1 attempt):
```
Skill(skill="gsd-plan-phase", args="${PHASE_NUM} --gaps")
```
Verify gap plans were created - re-run `init phase-op ${PHASE_NUM}` and check `has_plans`. If no new gap plans → go to handle_blocker: "Gap closure planning for phase ${PHASE_NUM} did not produce plans."
Re-execute:
```
Skill(skill="gsd-execute-phase", args="${PHASE_NUM} --no-transition")
```
Re-read verification status:
<!-- Context pre-injected above via WXP - variables available via <gsd-paste name="..."> -->
Parse from JSON: `phase_dir`, `phase_slug`, `padded_phase`, `phase_name`.
---
### Sub-step 1: Load prior context
Read project-level and prior phase context to avoid re-asking decided questions.
**Read project files:**
```bash
cat .planning/PROJECT.md 2>/dev/null || true
cat .planning/REQUIREMENTS.md 2>/dev/null || true
cat .planning/STATE.md 2>/dev/null || true
```
Extract from these:
- **PROJECT.md** - Vision, principles, non-negotiables, user preferences
- **REQUIREMENTS.md** - Acceptance criteria, constraints, must-haves vs nice-to-haves
- **STATE.md** - Current progress, decisions logged so far
**Read all prior CONTEXT.md files:**
```bash
(find .planning/phases -name "*-CONTEXT.md" 2>/dev/null || true) | sort
```
For each CONTEXT.md where phase number < current phase:
- Read the `<decisions>` section - these are locked preferences
- Read `<specifics>` - particular references or "I want it like X" moments
- Note patterns (e.g., "user consistently prefers minimal UI", "user rejected verbose output")
**Build internal prior_decisions context** (do not write to file):
```
<prior_decisions>
## Project-Level
- [Key principle or constraint from PROJECT.md]
- [Requirement affecting this phase from REQUIREMENTS.md]
## From Prior Phases
### Phase N: [Name]
- [Decision relevant to current phase]
- [Preference that establishes a pattern]
</prior_decisions>
```
If no prior context exists, continue without - expected for early phases.
---
### Sub-step 2: Scout Codebase
Lightweight codebase scan to inform grey area identification and proposals. Keep under ~5% context.
**Check for existing codebase maps:**
```bash
ls .planning/codebase/*.md 2>/dev/null || true
```
**If codebase maps exist:** Read the most relevant ones (CONVENTIONS.md, STRUCTURE.md, STACK.md based on phase type). Extract reusable components, established patterns, integration points. Skip to building context below.
**If no codebase maps, do targeted grep:**
Extract key terms from the phase goal. Search for related files:
```bash
grep -rl "{term1}\|{term2}" src/ app/ --include="*.ts" --include="*.tsx" --include="*.js" --include="*.jsx" 2>/dev/null | head -10 || true
ls src/components/ src/hooks/ src/lib/ src/utils/ 2>/dev/null || true
```
Read the 3-5 most relevant files to understand existing patterns.
**Build internal codebase_context** (do not write to file):
- **Reusable assets** - existing components, hooks, utilities usable in this phase
- **Established patterns** - how the codebase does state management, styling, data fetching
- **Integration points** - where new code connects (routes, nav, providers)
---
### Sub-step 3: Analyze Phase and Generate Proposals
**Get phase details:**
```bash
DETAIL=$(pi-gsd-tools roadmap get-phase ${PHASE_NUM})
```
Extract `goal`, `requirements`, `success_criteria` from the JSON response.
**Infrastructure detection - check FIRST before generating grey areas:**
A phase is pure infrastructure when ALL of these are true:
1. Goal keywords match: "scaffolding", "plumbing", "setup", "configuration", "migration", "refactor", "rename", "restructure", "upgrade", "infrastructure"
2. AND success criteria are all technical: "file exists", "test passes", "config valid", "command runs"
3. AND no user-facing behavior is described (no "users can", "displays", "shows", "presents")
**If infrastructure-only:** Skip Sub-step 4. Jump directly to Sub-step 5 with minimal CONTEXT.md. Display:
```
Phase ${PHASE_NUM}: Infrastructure phase - skipping discuss, writing minimal context.
```
Use these defaults for the CONTEXT.md:
- `<domain>`: Phase boundary from ROADMAP goal
- `<decisions>`: Single "### the agent's Discretion" subsection - "All implementation choices are at the agent's discretion - pure infrastructure phase"
- `<code_context>`: Whatever the codebase scout found
- `<specifics>`: "No specific requirements - infrastructure phase"
- `<deferred>`: "None"
**If NOT infrastructure - generate grey area proposals:**
Determine domain type from the phase goal:
- Something users **SEE** → visual: layout, interactions, states, density
- Something users **CALL** → interface: contracts, responses, errors, auth
- Something users **RUN** → execution: invocation, output, behavior modes, flags
- Something users **READ** → content: structure, tone, depth, flow
- Something being **ORGANIZED** → organization: criteria, grouping, exceptions, naming
Check prior_decisions - skip grey areas already decided in prior phases.
Generate **3-4 grey areas** with **~4 questions each**. For each question:
- **Pre-select a recommended answer** based on: prior decisions (consistency), codebase patterns (reuse), domain conventions (standard approaches), ROADMAP success criteria
- Generate **1-2 alternatives** per question
- **Annotate** with prior decision context ("You decided X in Phase N") and code context ("Component Y exists with Z variants") where relevant
---
### Sub-step 4: Present Proposals Per Area
Present grey areas **one at a time**. For each area (M of N):
Display a table:
```
### Grey Area {M}/{N}: {Area Name}
| # | Question | ✅ Recommended | Alternative(s) |
| --- | ---------- | ---------------------- | -------------- |
| 1 | {question} | {answer} - {rationale} | {alt1}; {alt2} |
| 2 | {question} | {answer} - {rationale} | {alt1} |
| 3 | {question} | {answer} - {rationale} | {alt1}; {alt2} |
| 4 | {question} | {answer} - {rationale} | {alt1} |
```
Then prompt the user via **AskUserQuestion**:
- **header:** "Area {M}/{N}"
- **question:** "Accept these answers for {Area Name}?"
- **options:** Build dynamically - always "Accept all" first, then "Change Q1" through "Change QN" for each question (up to 4), then "Discuss deeper" last. Cap at 6 explicit options max (AskUserQuestion adds "Other" automatically).
**On "Accept all":** Record all recommended answers for this area. Move to next area.
**On "Change QN":** Use AskUserQuestion with the alternatives for that specific question:
- **header:** "{Area Name}"
- **question:** "Q{N}: {question text}"
- **options:** List the 1-2 alternatives plus "You decide" (maps to the agent's Discretion)
Record the user's choice. Re-display the updated table with the change reflected. Re-present the full acceptance prompt so the user can make additional changes or accept.
**On "Discuss deeper":** Switch to interactive mode for this area only - ask questions one at a time using AskUserQuestion with 2-3 concrete options per question plus "You decide". After 4 questions, prompt:
- **header:** "{Area Name}"
- **question:** "More questions about {area name}, or move to next?"
- **options:** "More questions" / "Next area"
If "More questions", ask 4 more. If "Next area", display final summary table of captured answers for this area and move on.
**On "Other" (free text):** Interpret as either a specific change request or general feedback. Incorporate into the area's decisions, re-display updated table, re-present acceptance prompt.
**Scope creep handling:** If user mentions something outside the phase domain:
```
"{Feature} sounds like a new capability - that belongs in its own phase.
I'll note it as a deferred idea.
Back to {current area}: {return to current question}"
```
Track deferred ideas internally for inclusion in CONTEXT.md.
---
### Sub-step 5: Write CONTEXT.md
After all areas are resolved (or infrastructure skip), write the CONTEXT.md file.
**File path:** `${phase_dir}/${padded_phase}-CONTEXT.md`
Use **exactly** this structure (identical to discuss-phase output):
```markdown
# Phase {PHASE_NUM}: {Phase Name} - Context
**Gathered:** {date}
**Status:** Ready for planning
<domain>
## Phase Boundary
{Domain boundary statement from analysis - what this phase delivers}
</domain>
<decisions>
## Implementation Decisions
### {Area 1 Name}
- {Accepted/chosen answer for Q1}
- {Accepted/chosen answer for Q2}
- {Accepted/chosen answer for Q3}
- {Accepted/chosen answer for Q4}
### {Area 2 Name}
- {Accepted/chosen answer for Q1}
- {Accepted/chosen answer for Q2}
...
### the agent's Discretion
{Any "You decide" answers collected - note the agent has flexibility here}
</decisions>
<code_context>
## Existing Code Insights
### Reusable Assets
- {From codebase scout - components, hooks, utilities}
### Established Patterns
- {From codebase scout - state management, styling, data fetching}
### Integration Points
- {From codebase scout - where new code connects}
</code_context>
<specifics>
## Specific Ideas
{Any specific references or "I want it like X" from discussion}
{If none: "No specific requirements - open to standard approaches"}
</specifics>
<deferred>
## Deferred Ideas
{Ideas captured but out of scope for this phase}
{If none: "None - discussion stayed within phase scope"}
</deferred>
```
Write the file.
**Commit:**
```bash
pi-gsd-tools commit "docs(${PADDED_PHASE}): smart discuss context" --files "${phase_dir}/${padded_phase}-CONTEXT.md"
```
Display confirmation:
```
Created: {path}
Decisions captured: {count} across {area_count} areas
```
</step>
<step name="iterate">
## 4. Iterate
After each phase completes, re-read ROADMAP.md to catch phases inserted mid-execution (decimal phases like 5.1):
```bash
ROADMAP=$(pi-gsd-tools roadmap analyze)
```
Re-filter incomplete phases using the same logic as discover_phases:
- Keep phases where `disk_status !== "complete"` OR `roadmap_complete === false`
- Apply `--from N` filter if originally provided
- Sort by number ascending
Read STATE.md fresh:
```bash
cat .planning/STATE.md
```
Check for blockers in the Blockers/Concerns section. If blockers are found, go to handle_blocker with the blocker description.
If incomplete phases remain: proceed to next phase, loop back to execute_phase.
If all phases complete, proceed to lifecycle step.
</step>
<step name="lifecycle">
## 5. Lifecycle
After all phases complete, run the milestone lifecycle sequence: audit → complete → cleanup.
Display lifecycle transition banner:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GSD ► AUTONOMOUS ▸ LIFECYCLE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
All phases complete → Starting lifecycle: audit → complete → cleanup
Milestone: {milestone_version} - {milestone_name}
```
**5a. Audit**
```
Skill(skill="gsd-audit-milestone")
```
After audit completes, detect the result:
```bash
AUDIT_FILE=".planning/v${milestone_version}-MILESTONE-AUDIT.md"
AUDIT_STATUS=$(grep "^status:" "${AUDIT_FILE}" 2>/dev/null | head -1 | cut -d: -f2 | tr -d ' ')
```
**If AUDIT_STATUS is empty** (no audit file or no status field):
Go to handle_blocker: "Audit did not produce results - audit file missing or malformed."
**If `passed`:**
Display:
```
Audit ✅ passed - proceeding to complete milestone
```
Proceed to 5b (no user pause - per CTRL-01).
**If `gaps_found`:**
Read the gaps summary from the audit file. Display:
```
⚠ Audit: Gaps Found
```
Ask user via AskUserQuestion:
- **question:** "Milestone audit found gaps. How to proceed?"
- **options:** "Continue anyway - accept gaps" / "Stop - fix gaps manually"
On **"Continue anyway"**: Display `Audit ⏭ Gaps accepted - proceeding to complete milestone` and proceed to 5b.
On **"Stop"**: Go to handle_blocker with "User stopped - audit gaps remain. Run /gsd-audit-milestone to review, then /gsd-complete-milestone when ready."
**If `tech_debt`:**
Read the tech debt summary from the audit file. Display:
```
⚠ Audit: Tech Debt Identified
```
Show the summary, then ask user via AskUserQuestion:
- **question:** "Milestone audit found tech debt. How to proceed?"
- **options:** "Continue with tech debt" / "Stop - address debt first"
On **"Continue with tech debt"**: Display `Audit ⏭ Tech debt acknowledged - proceeding to complete milestone` and proceed to 5b.
On **"Stop"**: Go to handle_blocker with "User stopped - tech debt to address. Run /gsd-audit-milestone to review details."
**5b. Complete Milestone**
```
Skill(skill="gsd-complete-milestone", args="${milestone_version}")
```
After complete-milestone returns, verify it produced output:
```bash
ls .planning/milestones/v${milestone_version}-ROADMAP.md 2>/dev/null || true
```
If the archive file does not exist, go to handle_blocker: "Complete milestone did not produce expected archive files."
**5c. Cleanup**
```
Skill(skill="gsd-cleanup")
```
Cleanup shows its own dry-run and asks user for approval internally - this is an acceptable pause per CTRL-01 since it's an explicit decision about file deletion.
**5d. Final Completion**
Display final completion banner:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GSD ► AUTONOMOUS ▸ COMPLETE 🎉
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Milestone: {milestone_version} - {milestone_name}
Status: Complete ✅
Lifecycle: audit ✅ → complete ✅ → cleanup ✅
Ship it! 🚀
```
</step>
<step name="handle_blocker">
## 6. Handle Blocker
When any phase operation fails or a blocker is detected, present 3 options via AskUserQuestion:
**Prompt:** "Phase {N} ({Name}) encountered an issue: {description}"
**Options:**
1. **"Fix and retry"** - Re-run the failed step (discuss, plan, or execute) for this phase
2. **"Skip this phase"** - Mark phase as skipped, continue to the next incomplete phase
3. **"Stop autonomous mode"** - Display summary of progress so far and exit cleanly
**On "Fix and retry":** Loop back to the failed step within execute_phase. If the same step fails again after retry, re-present these options.
**On "Skip this phase":** Log `Phase {N} ⏭ {Name} - Skipped by user` and proceed to iterate.
**On "Stop autonomous mode":** Display progress summary:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GSD ► AUTONOMOUS ▸ STOPPED
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Completed: {list of completed phases}
Skipped: {list of skipped phases}
Remaining: {list of remaining phases}
Resume with: /gsd-autonomous --from {next_phase}
```
</step>
</process>
<success_criteria>
- [ ] All incomplete phases executed in order (smart discuss → ui-phase → plan → execute → ui-review each)
- [ ] Smart discuss proposes grey area answers in tables, user accepts or overrides per area
- [ ] Progress banners displayed between phases
- [ ] Execute-phase invoked with --no-transition (autonomous manages transitions)
- [ ] Post-execution verification reads VERIFICATION.md and routes on status
- [ ] Passed verification → automatic continue to next phase
- [ ] Human-needed verification → user prompted to validate or skip
- [ ] Gaps-found → user offered gap closure, continue, or stop
- [ ] Gap closure limited to 1 retry (prevents infinite loops)
- [ ] Plan-phase and execute-phase failures route to handle_blocker
- [ ] ROADMAP.md re-read after each phase (catches inserted phases)
- [ ] STATE.md checked for blockers before each phase
- [ ] Blockers handled via user choice (retry / skip / stop)
- [ ] Final completion or stop summary displayed
- [ ] After all phases complete, lifecycle step is invoked (not manual suggestion)
- [ ] Lifecycle transition banner displayed before audit
- [ ] Audit invoked via Skill(skill="gsd-audit-milestone")
- [ ] Audit result routing: passed → auto-continue, gaps_found → user decides, tech_debt → user decides
- [ ] Audit technical failure (no file/no status) routes to handle_blocker
- [ ] Complete-milestone invoked via Skill() with ${milestone_version} arg
- [ ] Cleanup invoked via Skill() - internal confirmation is acceptable (CTRL-01)
- [ ] Final completion banner displayed after lifecycle
- [ ] Progress bar uses phase number / total milestone phases (not position among incomplete)
- [ ] Smart discuss documents relationship to discuss-phase with CTRL-03 note
- [ ] Frontend phases get UI-SPEC generated before planning (step 3a.5) if not already present
- [ ] Frontend phases get UI review audit after successful execution (step 3d.5) if UI-SPEC exists
- [ ] UI phase and UI review respect workflow.ui_phase and workflow.ui_review config toggles
- [ ] UI review is advisory (non-blocking) - phase proceeds to iterate regardless of score
</success_criteria>