⚠ Coverage limit — The archon stack adapter exposes no cutter:seed-apply seed() implementation, so seeded workflow_runs are not replayable via the pipeline. The before-side will see an empty dashboard (no cards) and correctly mark these steps as absent — that asymmetry is itself evidence the after-side surfaces the new UI.
Summary
- `archon serve` startup unconditionally flipped all `running` workflow rows to `failed` via `failOrphanedRuns()` at `packages/server/src/index.ts:213`. This killed CLI workflows actively executing in another process.
- Reproducer: start a workflow in one terminal, then start the server in another while the workflow is still running. The workflow's status flips to `failed` mid-execution and the CLI exits non-zero even though every node completed successfully. Filed as #1216, discovered during PR #1217 smoke testing.
- Removes the `failOrphanedRuns()` call from server startup (matches the CLI precedent at `packages/cli/src/cli.ts:256-258`). The UI gets a numeric count badge on the Dashboard nav (replacing a binary pulse dot) and AlertDialog confirmations for destructive workflow-run actions (replacing 5 `window.confirm()` call sites).
- The `failOrphanedRuns()` function itself in `packages/core/src/db/workflows.ts:911` is preserved; it is still used by `archon workflow cleanup` (the explicit user-driven path). Codex provider behavior unchanged. No DB migration. No new dependencies. No timer-based heuristic introduced anywhere, per the new CLAUDE.md principle.

UX Journey
Before
After
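A minimal sketch of how the Dashboard nav badge could derive its label from the running-run count. The response shape and the function name `runningBadgeLabel` are hypothetical; the only detail taken from this PR is that the nav consumes `counts.running`. Hiding the badge at zero is an assumption made for illustration.

```typescript
// Hypothetical response shape: only counts.running is read by the nav.
interface DashboardRunsResponse {
  counts: { running: number };
}

// Returns the text for the numeric badge, or null when no badge should
// be rendered (assumed behavior when nothing is running).
function runningBadgeLabel(response: DashboardRunsResponse): string | null {
  const running = response.counts.running;
  if (!Number.isFinite(running) || running <= 0) {
    return null;
  }
  return String(Math.floor(running));
}
```

Unlike a binary pulse dot, the label distinguishes one running row from several, which matters now that orphaned `running` rows can persist until someone cleans them up.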
Architecture Diagram
Before
After
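The after-state can be sketched roughly as follows, using an in-memory stand-in for the workflow runs table. Everything here is a simplified model for illustration: the real `failOrphanedRuns()` operates on the database and its signature may differ, and `startServer` and `runCleanupCommand` are hypothetical names.

```typescript
type RunStatus = 'running' | 'completed' | 'failed';

interface WorkflowRun {
  id: string;
  status: RunStatus;
}

// Stand-in for failOrphanedRuns(): marks every running row as failed
// and returns the number of rows it changed.
function failOrphanedRuns(runs: WorkflowRun[]): number {
  let count = 0;
  for (const run of runs) {
    if (run.status === 'running') {
      run.status = 'failed';
      count += 1;
    }
  }
  return count;
}

// Server startup after this PR: no lifecycle mutation of existing rows.
function startServer(_runs: WorkflowRun[]): void {
  // Intentionally does not call failOrphanedRuns(); a row that is
  // `running` may belong to a CLI process that is still alive.
}

// Explicit, user-driven path (modeled on `archon workflow cleanup`).
function runCleanupCommand(runs: WorkflowRun[]): number {
  return failOrphanedRuns(runs);
}
```

In this model a server start leaves a `running` row untouched, and only the explicit cleanup call flips it to `failed`.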
Connection inventory:
- `archon workflow cleanup` retains the link
- `sidebar/ProjectSelector.tsx:142–165` pattern

Label Snapshot
- `risk: low` (removal of an autonomous mutation; UI changes are additive replacements of an existing pattern)
- `size: M` (6 files, +197 / −72; bulk in WorkflowRunCard's 4 dialog conversions)
- `core`, `server`, `web`
- `server:index`, `web:dashboard`, `web:layout`

Change Metadata
- `bug` (primary: fixes #1216) + `refactor` (secondary: dialog UX)
- `server`

Linked Issue
Validation Evidence (required)
End-to-end reproducer (the bug fix verification):
Result with this PR applied:
- `Workflow completed successfully.` (exit 0) ✓
- No `orphan` / `fail_orphans` / `orphaned_workflow_runs_failed` events ✓
- `status='completed'`, not `failed` ✓

Without this PR (verified before the fix): Terminal A exits 1 with "Workflow failed", and the server log emits `db.orphaned_workflow_runs_failed { count: 1 }`, which is exactly the run that was in flight.

Regression sweep:
Security Impact (required)
Compatibility / Migration
- (`failOrphanedRuns()` retained for explicit cleanup)

Behavioral change for operators: Server restarts no longer auto-mark `running` workflow rows as `failed`. Truly orphaned rows from a crashed server now persist as `running` until cleaned up via `archon workflow cleanup` or per-row Cancel/Abandon in the dashboard. The Dashboard nav count badge surfaces the count.

Human Verification (required)
- `bun run dev:server` starts cleanly with no orphan-related log events
- `failOrphanedRuns()` function is preserved and still callable by the explicit `archon workflow cleanup` path
- `createWorkflowStore` import in `server/index.ts` was also removed (caught by TS `noUnusedLocals`)
- `ConfirmRunActionDialog` does NOT swallow promise rejections from `onConfirm`; errors propagate to the parent's `runAction` helper, which already displays them via `actionError` state
- (`bun test` only covers `src/lib/` and `src/stores/`); adding `@testing-library/react` would be significant scope creep matching no existing pattern. Type-check + lint + manual UI verification + the backend reproducer are the verification levels in this PR.

Side Effects / Blast Radius (required)
- `running` rows from crashed servers will accumulate in the DB until explicit cleanup. The count badge surfaces them; users can click into the dashboard and Cancel per row. This is the intended trade-off per CLAUDE.md "No Autonomous Lifecycle Mutation Across Process Boundaries".
- `listDashboardRuns({ status: 'running', limit: 1 })` in TopNav adds one query per 10s where `listWorkflowRuns` was previously called. Same frequency, slightly heavier endpoint (returns enriched run + counts vs raw run array). The `limit: 1` keeps the runs payload trivially small; we only consume `counts.running`.
- `archon workflow status` CLI command continues to work and lists running rows
- `db.orphaned_workflow_runs_failed` log event is now only emitted by the explicit cleanup path, so its presence post-merge is a useful signal that someone ran cleanup intentionally

Rollback Plan (required)
- `git revert 7a00e047` on `dev`. One commit, atomic. No DB changes to reverse.
- `window.confirm` (worse UX but functional)

Risks and Mitigations
- `archon workflow cleanup` or the dashboard explicitly. Some may not realize the behavior changed.
- `db.orphaned_workflow_runs_failed` event, removing a false-positive signal.
- `window.confirm` elsewhere: ProjectSelector pattern, this PR's pattern, and any I missed).
- `window.confirm` in the touched files (`WorkflowRunCard.tsx`, `WorkflowHistoryTable.tsx`). Other components in `packages/web/` may still use `window.confirm` and should be reviewed in a follow-up sweep; that is out of scope here.

Summary by CodeRabbit
New Features
Changed
Documentation