AGENTS/.pi/gsd/workflows/execute-plan.md
2026-04-24 20:00:33 +02:00

Execution Context (pre-injected by WXP)

Phase:

Phase Init Data:

Load execution context (paths only to minimize orchestrator context):

Extract from init JSON: executor_model, commit_docs, sub_repos, phase_dir, phase_number, plans, summaries, incomplete_plans, state_path, config_path.

If .planning/ missing: error.

```bash
# Use plans/summaries from INIT JSON, or list files
(ls .planning/phases/XX-name/*-PLAN.md 2>/dev/null || true) | sort
(ls .planning/phases/XX-name/*-SUMMARY.md 2>/dev/null || true) | sort
```

Find first PLAN without matching SUMMARY. Decimal phases supported (01.1-hotfix/):

PHASE=$(echo "$PLAN_PATH" | grep -oE '[0-9]+(\.[0-9]+)?-[0-9]+')
# config settings can be fetched via gsd-tools config-get if needed
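A minimal sketch of the "first PLAN without matching SUMMARY" lookup. It assumes a plan and its summary share a filename prefix and differ only in the `-PLAN.md` / `-SUMMARY.md` suffix, as the globs above suggest. `first_incomplete_plan` is a hypothetical helper name, not a gsd-tools command.

```bash
# Hypothetical helper (not part of gsd-tools): print the first *-PLAN.md in a
# phase directory that has no sibling *-SUMMARY.md.
first_incomplete_plan() {
  local phase_dir="$1" plan
  # Glob results come back sorted, so the lowest-numbered plan is checked first.
  for plan in "$phase_dir"/*-PLAN.md; do
    [ -e "$plan" ] || continue   # glob matched nothing
    if [ ! -e "${plan%-PLAN.md}-SUMMARY.md" ]; then
      printf '%s\n' "$plan"
      return 0
    fi
  done
  return 1   # every plan already has a summary
}
```

Called as `PLAN_PATH=$(first_incomplete_plan .planning/phases/XX-name)`, the result can feed the `PHASE=` extraction shown above.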
Auto-approve: `Execute {phase}-{plan}-PLAN.md [Plan X of Y for Phase Z]` → parse_segments. Otherwise: present plan identification, wait for confirmation.

Record the start time:

```bash
PLAN_START_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
PLAN_START_EPOCH=$(date +%s)
```

Locate checkpoints in the plan:

```bash
grep -n "type=\"checkpoint" .planning/phases/XX-name/{phase}-{plan}-PLAN.md
```

Routing by checkpoint type:

| Checkpoints | Pattern | Execution |
|---|---|---|
| None | A (autonomous) | Single subagent: full plan + SUMMARY + commit |
| Verify-only | B (segmented) | Segments between checkpoints. After none/human-verify → SUBAGENT. After decision/human-action → MAIN |
| Decision | C (main) | Execute entirely in main context |

Pattern A: init_agent_tracking → spawn Task(subagent_type="gsd-executor", model=executor_model, isolation="worktree") with prompt: execute plan at [path], autonomous, all tasks + SUMMARY + commit, follow deviation/auth rules, report: plan name, tasks, SUMMARY path, commit hash → track agent_id → wait → update tracking → report.

Pattern B: Execute segment-by-segment. Autonomous segments: spawn subagent for assigned tasks only (no SUMMARY/commit). Checkpoints: main context. After all segments: aggregate, create SUMMARY, commit. See segment_execution.

Pattern C: Execute in main using standard flow (step name="execute").

Fresh context per subagent preserves peak quality. Main context stays lean.
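The routing above can be sketched as a small classifier over the checkpoint types found in a plan. This is an illustration under stated assumptions: `route_pattern` is a hypothetical name, checkpoints are assumed to appear as `type="checkpoint:<kind>"` attributes, and any kind other than `human-verify` is assumed to force Pattern C.

```bash
# Hypothetical helper: map a plan file to execution pattern A, B, or C.
# Assumption: any checkpoint type other than human-verify forces Pattern C.
route_pattern() {
  local plan="$1" types
  # Collect the distinct checkpoint types declared in the plan.
  types=$(grep -oE 'type="checkpoint:[a-z-]+"' "$plan" | sort -u || true)
  if [ -z "$types" ]; then
    echo A   # no checkpoints: one autonomous subagent
  elif [ "$types" = 'type="checkpoint:human-verify"' ]; then
    echo B   # verify-only: segmented execution
  else
    echo C   # decision (or other) checkpoints: main context
  fi
}
```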

```bash
# Create the history file on first use
if [ ! -f .planning/agent-history.json ]; then
  echo '{"version":"1.0","max_entries":50,"entries":[]}' > .planning/agent-history.json
fi

# Detect an interrupted agent BEFORE clearing the stale marker;
# removing the file first would make this check unreachable
if [ -f .planning/current-agent-id.txt ]; then
  INTERRUPTED_ID=$(cat .planning/current-agent-id.txt)
  echo "Found interrupted agent: $INTERRUPTED_ID"
fi
rm -f .planning/current-agent-id.txt
```

If interrupted: ask user to resume (Task resume parameter) or start fresh.

Tracking protocol:

  • On spawn: write agent_id to current-agent-id.txt, append to agent-history.json: {"agent_id":"[id]","task_description":"[desc]","phase":"[phase]","plan":"[plan]","segment":[num|null],"timestamp":"[ISO]","status":"spawned","completion_timestamp":null}.
  • On completion: status → "completed", set completion_timestamp, delete current-agent-id.txt.
  • Prune: if entries > max_entries, remove oldest "completed" (never "spawned").

Run for Pattern A/B before spawning. Pattern C: skip.

Pattern B only (verify-only checkpoints). Skip for A/C.
  1. Parse segment map: checkpoint locations and types

  2. Per segment:

    • Subagent route: spawn gsd-executor for assigned tasks only. Prompt: task range, plan path, read full plan for context, execute assigned tasks, track deviations, NO SUMMARY/commit. Track via agent protocol.
    • Main route: execute tasks using standard flow (step name="execute")
  3. After ALL segments: aggregate files/deviations/decisions → create SUMMARY.md → commit → self-check:

    • Verify key-files.created exist on disk with [ -f ]
    • Check git log --oneline --all --grep="{phase}-{plan}" returns ≥1 commit
    • Append ## Self-Check: PASSED or ## Self-Check: FAILED to SUMMARY

    Known Claude Code bug (classifyHandoffIfNeeded): If any segment agent reports "failed" with classifyHandoffIfNeeded is not defined, this is a Claude Code runtime bug - not a real failure. Run spot-checks; if they pass, treat as successful.
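The self-check in step 3 can be sketched as one function. `self_check` and its argument order are assumptions made for illustration; the only commands used are the `[ -f ]` test and the `git log --grep` query already named in the bullets.

```bash
# Hypothetical helper: self_check SUMMARY_PATH PLAN_ID FILE...
# Appends a Self-Check line to the summary and returns non-zero on failure.
self_check() {
  local summary="$1" plan_id="$2" status=PASSED f
  shift 2
  # Every file listed under key-files.created must exist on disk.
  for f in "$@"; do
    [ -f "$f" ] || status=FAILED
  done
  # At least one commit message must mention the {phase}-{plan} id.
  if [ -z "$(git log --oneline --all --grep="$plan_id" 2>/dev/null)" ]; then
    status=FAILED
  fi
  printf '\n## Self-Check: %s\n' "$status" >> "$summary"
  [ "$status" = PASSED ]
}
```

The same function doubles as the spot-check for the `classifyHandoffIfNeeded` case: if it passes, the segment can be treated as successful.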

```bash
cat .planning/phases/XX-name/{phase}-{plan}-PLAN.md
```

The plan IS the execution instructions. Follow it exactly. If the plan references CONTEXT.md: honor the user's vision throughout.

If plan contains <interfaces> block: These are pre-extracted type definitions and contracts. Use them directly - do NOT re-read the source files to discover types. The planner already extracted what you need.

```bash
pi-gsd-tools phases list --type summaries --raw
# Extract the second-to-last summary from the JSON result
```

If previous SUMMARY has unresolved "Issues Encountered" or "Next Phase Readiness" blockers: AskUserQuestion(header="Previous Issues", options: "Proceed anyway" | "Address first" | "Review previous").

Deviations are normal - handle via rules below.
  1. Read @context files from prompt
  2. MCP tools: If GEMINI.md or project instructions reference MCP tools (e.g. jCodeMunch for code navigation), prefer them over Grep/Glob when available. Fall back to Grep/Glob if MCP tools are not accessible.
  3. Per task:
    • MANDATORY read_first gate: If the task has a <read_first> field, you MUST read every listed file BEFORE making any edits. This is not optional. Do not skip files because you "already know" what's in them - read them. The read_first files establish ground truth for the task.
    • type="auto": if tdd="true" → TDD execution. Implement with deviation rules + auth gates. Verify done criteria. Commit (see task_commit). Track hash for Summary.
    • type="checkpoint:*": STOP → checkpoint_protocol → wait for user → continue only after confirmation.
    • MANDATORY acceptance_criteria check: After completing each task, if it has <acceptance_criteria>, verify EVERY criterion before moving to the next task. Use grep, file reads, or CLI commands to confirm each criterion. If any criterion fails, fix the implementation before proceeding. Do not skip criteria or mark them as "will verify later".
  4. Run <verification> checks
  5. Confirm <success_criteria> met
  6. Document deviations in Summary

<authentication_gates>

Authentication Gates

Auth errors during execution are NOT failures - they're expected interaction points.

Indicators: "Not authenticated", "Unauthorized", 401/403, "Please run {tool} login", "Set {ENV_VAR}"

Protocol:

  1. Recognize auth gate (not a bug)
  2. STOP task execution
  3. Create dynamic checkpoint:human-action with exact auth steps
  4. Wait for user to authenticate
  5. Verify credentials work
  6. Retry original task
  7. Continue normally

Example: vercel --yes → "Not authenticated" → checkpoint asking user to vercel login → verify with vercel whoami → retry deploy → continue

In Summary: Document as normal flow under "## Authentication Gates", not as deviations.
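A rough sketch of step 1 (recognizing an auth gate) as a pattern match over captured command output. `is_auth_gate` is a hypothetical helper, and the patterns cover only the indicators listed above; real tools word their errors differently, so treat the regexes as a starting point.

```bash
# Hypothetical helper: succeed when command output looks like an auth gate.
is_auth_gate() {
  local output="$1"
  # Case-insensitive phrases plus standalone 401/403 status codes.
  if printf '%s\n' "$output" | grep -qiE 'not authenticated|unauthorized|(^|[^[:alnum:]])(401|403)([^[:alnum:]]|$)|please run .+ login'; then
    return 0
  fi
  # "Set SOME_ENV_VAR" hints, matched case-sensitively to skip ordinary prose.
  printf '%s\n' "$output" | grep -qE 'Set [A-Z][A-Z0-9_]+([^a-z]|$)'
}
```

For example, `is_auth_gate "$(vercel --yes 2>&1)"` returning success would trigger the checkpoint:human-action in step 3 rather than a failure report.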

</authentication_gates>

<deviation_rules>

Deviation Rules

You WILL discover unplanned work. Apply automatically, track all for Summary.

| Rule | Trigger | Action | Permission |
|---|---|---|---|
| 1: Bug | Broken behavior, errors, wrong queries, type errors, security vulns, race conditions, leaks | Fix → test → verify → track [Rule 1 - Bug] | Auto |
| 2: Missing Critical | Missing essentials: error handling, validation, auth, CSRF/CORS, rate limiting, indexes, logging | Add → test → verify → track [Rule 2 - Missing Critical] | Auto |
| 3: Blocking | Prevents completion: missing deps, wrong types, broken imports, missing env/config/files, circular deps | Fix blocker → verify proceeds → track [Rule 3 - Blocking] | Auto |
| 4: Architectural | Structural change: new DB table, schema change, new service, switching libs, breaking API, new infra | STOP → present decision (below) → track [Rule 4 - Architectural] | Ask user |

Rule 4 format:

⚠️ Architectural Decision Needed

Current task: [task name]
Discovery: [what prompted this]
Proposed change: [modification]
Why needed: [rationale]
Impact: [what this affects]
Alternatives: [other approaches]

Proceed with proposed change? (yes / different approach / defer)

Priority: Rule 4 (STOP) > Rules 1-3 (auto) > unsure → Rule 4

Edge cases: missing validation → R2 | null crash → R1 | new table → R4 | new column → R1/2

Heuristic: Affects correctness/security/completion? → R1-3. Maybe? → R4.

</deviation_rules>

<deviation_documentation>

Documenting Deviations

Summary MUST include deviations section. None? → ## Deviations from Plan\n\nNone - plan executed exactly as written.

Per deviation: [Rule N - Category] Title - Found during: Task X | Issue | Fix | Files modified | Verification | Commit hash

End with: Total deviations: N auto-fixed (breakdown). Impact: assessment.

</deviation_documentation>

<tdd_plan_execution>

TDD Execution

For type: tdd plans - RED-GREEN-REFACTOR:

  1. Infrastructure (first TDD plan only): detect project, install framework, config, verify empty suite
  2. RED: Read <behavior> → failing test(s) → run (MUST fail) → commit: test({phase}-{plan}): add failing test for [feature]
  3. GREEN: Read <implementation> → minimal code → run (MUST pass) → commit: feat({phase}-{plan}): implement [feature]
  4. REFACTOR: Clean up → tests MUST pass → commit: refactor({phase}-{plan}): clean up [feature]

Errors: RED doesn't fail → investigate test/existing feature. GREEN doesn't pass → debug, iterate. REFACTOR breaks → undo.
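The "MUST fail" and "MUST pass" gates can be enforced with two small wrappers around the project's test command. Both function names are hypothetical, and the test command itself is whatever the project uses; nothing here is prescribed by the workflow.

```bash
# Hypothetical gates for the RED and GREEN steps.
# Pass the project's test command as the arguments.
assert_red() {
  # RED: the new test must fail before any implementation exists.
  if "$@" >/dev/null 2>&1; then
    echo "RED violated: tests passed before implementation" >&2
    return 1
  fi
}

assert_green() {
  # GREEN (and REFACTOR): the suite must pass.
  if ! "$@" >/dev/null 2>&1; then
    echo "GREEN violated: tests still failing" >&2
    return 1
  fi
}
```

A RED commit would then only happen after `assert_red <test command>` succeeds, and the GREEN and REFACTOR commits only after `assert_green <test command>` succeeds.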

See .pi/gsd/references/tdd.md for structure. </tdd_plan_execution>

<precommit_failure_handling>

Pre-commit Hook Failure Handling

Your commits may trigger pre-commit hooks. Auto-fix hooks handle themselves transparently - files get fixed and re-staged automatically.

If running as a parallel executor agent (spawned by execute-phase): Use --no-verify on all commits. Pre-commit hooks cause build lock contention when multiple agents commit simultaneously (e.g., cargo lock fights in Rust projects). The orchestrator validates once after all agents complete.

If running as the sole executor (sequential mode): If a commit is BLOCKED by a hook:

  1. The git commit command fails with hook error output
  2. Read the error - it tells you exactly which hook and what failed
  3. Fix the issue (type error, lint violation, secret leak, etc.)
  4. git add the fixed files
  5. Retry the commit
  6. Budget 1-2 retry cycles per commit </precommit_failure_handling>

<task_commit>

Task Commit Protocol

After each task (verification passed, done criteria met), commit immediately.

1. Check: git status --short

2. Stage individually (NEVER git add . or git add -A):

git add src/api/auth.ts
git add src/types/user.ts

3. Commit type:

| Type | When | Example |
|---|---|---|
| feat | New functionality | feat(08-02): create user registration endpoint |
| fix | Bug fix | fix(08-02): correct email validation regex |
| test | Test-only (TDD RED) | test(08-02): add failing test for password hashing |
| refactor | No behavior change (TDD REFACTOR) | refactor(08-02): extract validation to helper |
| perf | Performance | perf(08-02): add database index |
| docs | Documentation | docs(08-02): add API docs |
| style | Formatting | style(08-02): format auth module |
| chore | Config/deps | chore(08-02): add bcrypt dependency |

4. Format: {type}({phase}-{plan}): {description} with bullet points for key changes.

<sub_repos_commit_flow> Sub-repos mode: If sub_repos is configured (non-empty array from init context), use commit-to-subrepo instead of standard git commit. This routes files to their correct sub-repo based on path prefix.

pi-gsd-tools commit-to-subrepo "{type}({phase}-{plan}): {description}" --files file1 file2 ...

The command groups files by sub-repo prefix and commits atomically to each. Returns JSON: { committed: true, repos: { "backend": { hash: "abc", files: [...] }, ... } }.

Record hashes from each repo in the response for SUMMARY tracking.

If sub_repos is empty or not set: Use standard git commit flow below. </sub_repos_commit_flow>

5. Record hash:

TASK_COMMIT=$(git rev-parse --short HEAD)
TASK_COMMITS+=("Task ${TASK_NUM}: ${TASK_COMMIT}")

6. Check for untracked generated files:

git status --short | grep '^??'

If new untracked files appeared after running scripts or tools, decide for each:

  • Commit it - if it's a source file, config, or intentional artifact
  • Add to .gitignore - if it's a generated/runtime output (build artifacts, .env files, cache files, compiled output)
  • Do NOT leave generated files untracked
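The subject format from step 4 can be checked mechanically before committing. The sketch below assumes the eight types in the table are the complete set and that the scope is always `{phase}-{plan}` with an optional decimal phase; `valid_commit_subject` is a hypothetical name.

```bash
# Hypothetical helper: succeed when a commit subject matches
# {type}({phase}-{plan}): {description}
valid_commit_subject() {
  printf '%s\n' "$1" | grep -qE '^(feat|fix|test|refactor|perf|docs|style|chore)\([0-9]+(\.[0-9]+)?-[0-9]+\): .+'
}
```

A guard such as `valid_commit_subject "$MSG" || echo "bad subject: $MSG" >&2` keeps malformed subjects out of the history that the later `git log --grep` lookups depend on.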

</task_commit>

On `type="checkpoint:*"`: automate everything possible first. Checkpoints are for verification/decisions only.

Display: CHECKPOINT: [Type] box → Progress {X}/{Y} → Task name → type-specific content → YOUR ACTION: [signal]

| Type | Content | Resume signal |
|---|---|---|
| human-verify (90%) | What was built + verification steps (commands/URLs) | "approved" or describe issues |
| decision (9%) | Decision needed + context + options with pros/cons | "Select: option-id" |
| human-action (1%) | What was automated + ONE manual step + verification plan | "done" |

After response: verify if specified. Pass → continue. Fail → inform, wait. WAIT for user - do NOT hallucinate completion.

See .pi/gsd/references/checkpoints.md for details.

When spawned via Task and hitting checkpoint: return structured state (cannot interact with user directly).

Required return: 1) Completed Tasks table (hashes + files) 2) Current Task (what's blocking) 3) Checkpoint Details (user-facing content) 4) Awaiting (what's needed from user)

Orchestrator parses → presents to user → spawns fresh continuation with your completed tasks state. You will NOT be resumed. In main context: use checkpoint_protocol above.

If verification fails:

Check if node repair is enabled (default: on):

NODE_REPAIR=$(pi-gsd-tools config-get workflow.node_repair 2>/dev/null || echo "true")

If NODE_REPAIR is true: invoke @./.pi/gsd/workflows/node-repair.md with:

  • FAILED_TASK: task number, name, done-criteria
  • ERROR: expected vs actual result
  • PLAN_CONTEXT: adjacent task names + phase goal
  • REPAIR_BUDGET: workflow.node_repair_budget from config (default: 2)

Node repair will attempt RETRY, DECOMPOSE, or PRUNE autonomously. Only reaches this gate again if repair budget is exhausted (ESCALATE).

If NODE_REPAIR is false OR repair returns ESCALATE: STOP. Present: "Verification failed for Task [X]: [name]. Expected: [criteria]. Actual: [result]. Repair attempted: [summary of what was tried]." Options: Retry | Skip (mark incomplete) | Stop (investigate). If skipped → SUMMARY "Issues Encountered".

```bash
PLAN_END_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
PLAN_END_EPOCH=$(date +%s)

DURATION_SEC=$(( PLAN_END_EPOCH - PLAN_START_EPOCH ))
DURATION_MIN=$(( DURATION_SEC / 60 ))

# Format as "Xh Ym" for runs of an hour or more, otherwise "N min"
if [ "$DURATION_MIN" -ge 60 ]; then
  HRS=$(( DURATION_MIN / 60 ))
  MIN=$(( DURATION_MIN % 60 ))
  DURATION="${HRS}h ${MIN}m"
else
  DURATION="${DURATION_MIN} min"
fi
```

</step>

<step name="generate_user_setup">
```bash
grep -A 50 "^user_setup:" .planning/phases/XX-name/{phase}-{plan}-PLAN.md | head -50
```

If user_setup exists: create {phase}-USER-SETUP.md using template .pi/gsd/templates/user-setup.md. Per service: env vars table, account setup checklist, dashboard config, local dev notes, verification commands. Status "Incomplete". Set USER_SETUP_CREATED=true. If empty/missing: skip.

Create `{phase}-{plan}-SUMMARY.md` at `.planning/phases/XX-name/`. Use `.pi/gsd/templates/summary.md`.

Frontmatter: phase, plan, subsystem, tags | requires/provides/affects | tech-stack.added/patterns | key-files.created/modified | key-decisions | requirements-completed (MUST copy requirements array from PLAN.md frontmatter verbatim) | duration ($DURATION), completed ($PLAN_END_TIME date).

Title: # Phase [X] Plan [Y]: [Name] Summary

One-liner SUBSTANTIVE: "JWT auth with refresh rotation using jose library" not "Authentication implemented"

Include: duration, start/end times, task count, file count.

Next: more plans → "Ready for {next-plan}" | last → "Phase complete, ready for next step".

Update STATE.md using gsd-tools:

# Advance plan counter (handles last-plan edge case)
pi-gsd-tools state advance-plan

# Recalculate progress bar from disk state
pi-gsd-tools state update-progress

# Record execution metrics
pi-gsd-tools state record-metric \
  --phase "${PHASE}" --plan "${PLAN}" --duration "${DURATION}" \
  --tasks "${TASK_COUNT}" --files "${FILE_COUNT}"

From SUMMARY: Extract decisions and add to STATE.md:

# Add each decision from SUMMARY key-decisions
# Prefer file inputs for shell-safe text (preserves `$`, `*`, etc. exactly)
pi-gsd-tools state add-decision \
  --phase "${PHASE}" --summary-file "${DECISION_TEXT_FILE}" --rationale-file "${RATIONALE_FILE}"

# Add blockers if any found
pi-gsd-tools state add-blocker --text-file "${BLOCKER_TEXT_FILE}"

Update session info using gsd-tools:

pi-gsd-tools state record-session \
  --stopped-at "Completed ${PHASE}-${PLAN}-PLAN.md" \
  --resume-file "None"

Keep STATE.md under 150 lines.

If SUMMARY "Issues Encountered" ≠ "None": yolo → log and continue. Interactive → present issues, wait for acknowledgment.

```bash
pi-gsd-tools roadmap update-plan-progress "${PHASE}"
```

Counts PLAN vs SUMMARY files on disk. Updates progress table row with correct count and status (`In Progress` or `Complete` with date).

Mark completed requirements from the PLAN.md frontmatter `requirements:` field:

pi-gsd-tools requirements mark-complete ${REQ_IDS}

Extract requirement IDs from the plan's frontmatter (e.g., requirements: [AUTH-01, AUTH-02]). If no requirements field, skip.
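One way to populate `REQ_IDS` is a small sed pipeline. This sketch handles only the single-line inline list form shown in the example (`requirements: [AUTH-01, AUTH-02]`); multi-line YAML lists would need a real YAML parser. `extract_req_ids` is a hypothetical name.

```bash
# Hypothetical helper: print space-separated requirement IDs from a plan's
# frontmatter. Assumes the inline form  requirements: [ID-1, ID-2]
extract_req_ids() {
  sed -n 's/^requirements:[[:space:]]*\[\(.*\)][[:space:]]*$/\1/p' "$1" \
    | head -1 \
    | tr -d ' ' \
    | tr ',' ' '
}
```

With this in place, `REQ_IDS=$(extract_req_ids "$PLAN_PATH")` yields an empty string when the field is absent, which maps onto the "skip" case.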

Task code already committed per-task. Commit plan metadata:

pi-gsd-tools commit "docs({phase}-{plan}): complete [plan-name] plan" --files .planning/phases/XX-name/{phase}-{plan}-SUMMARY.md .planning/STATE.md .planning/ROADMAP.md .planning/REQUIREMENTS.md

If .planning/codebase/ doesn't exist: skip.

FIRST_TASK=$(git log --oneline --grep="feat({phase}-{plan}):" --grep="fix({phase}-{plan}):" --grep="test({phase}-{plan}):" --reverse | head -1 | cut -d' ' -f1)
git diff --name-only ${FIRST_TASK}^..HEAD 2>/dev/null || true

Update only structural changes: new src/ dir → STRUCTURE.md | deps → STACK.md | file pattern → CONVENTIONS.md | API client → INTEGRATIONS.md | config → STACK.md | renamed → update paths. Skip code-only/bugfix/content changes.

pi-gsd-tools commit "" --files .planning/codebase/*.md --amend

If `USER_SETUP_CREATED=true`: display `⚠️ USER SETUP REQUIRED` with path + env/config tasks at TOP.

(ls -1 .planning/phases/[current-phase-dir]/*-PLAN.md 2>/dev/null || true) | wc -l
(ls -1 .planning/phases/[current-phase-dir]/*-SUMMARY.md 2>/dev/null || true) | wc -l

| Condition | Route | Action |
|---|---|---|
| summaries < plans | A: More plans | Find next PLAN without SUMMARY. Yolo: auto-continue. Interactive: show next plan, suggest /gsd-execute-phase {phase} + /gsd-verify-work. STOP here. |
| summaries = plans, current < highest phase | B: Phase done | Show completion, suggest /gsd-plan-phase {Z+1} + /gsd-verify-work {Z} + /gsd-discuss-phase {Z+1} |
| summaries = plans, current = highest phase | C: Milestone done | Show banner, suggest /gsd-complete-milestone + /gsd-verify-work + /gsd-add-phase |

All routes: /new first for fresh context.
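The route table reduces to two comparisons. The sketch below takes the two `wc -l` counts plus the current and highest phase numbers; `next_route` is a hypothetical name, and it assumes integer phase numbers (decimal phases such as 01.1 would need a different comparison).

```bash
# Hypothetical helper: next_route PLANS SUMMARIES CURRENT_PHASE HIGHEST_PHASE
# Prints A, B, or C. Assumes integer phase numbers.
next_route() {
  local plans="$1" summaries="$2" current="$3" highest="$4"
  if [ "$summaries" -lt "$plans" ]; then
    echo A   # more plans remain in this phase
  elif [ "$current" -lt "$highest" ]; then
    echo B   # phase done, later phases exist
  else
    echo C   # last phase done: milestone complete
  fi
}
```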

<success_criteria>

  • All tasks from PLAN.md completed
  • All verifications pass
  • USER-SETUP.md generated if user_setup in frontmatter
  • SUMMARY.md created with substantive content
  • STATE.md updated (position, decisions, issues, session)
  • ROADMAP.md updated
  • If codebase map exists: map updated with execution changes (or skipped if no significant changes)
  • If USER-SETUP.md created: prominently surfaced in completion output </success_criteria>