Working project status as of today. Separate from the architectural overview; this is the practical "where are we, what works, what's broken, what's next" snapshot updated as each phase ships.
Single channel input, single provider, three-layer Push Gate, audit log. Everything in this diagram is shipped and tested.
What the same request looks like when every phase has shipped. Same trigger column flows through more layers; new always-on processes can inject their own triggers; memory persists across all sessions; multiple worker providers run in parallel; reviewers gate quality before persist.
The delta is the roadmap. Every box in the blue diagram that isn't in the green one is something the self-iteration loop will build, in the order listed below.
Ordered smallest-and-safest first so the self-iteration loop validates itself on easy wins before touching anything load-bearing. The total is roughly 30-35 hours of subagent execution to ship every box in the "full vision" diagram above.
Operator note:
The self-iteration loop also needs three guardrails before it can run unattended:
(a) a pnpm dev:safe script that runs the moderator without tsx watch (so worker edits don't trigger mid-task restarts),
(b) prompt guardrail "if tests fail, STOP and report — do not try to fix forward",
(c) per-task git commit + audit log entry so any single iteration is rollbackable.
Run separately from this list once they're in place.
One command, moderator on :3000 + dashboard on :5173, concurrently labeled.
cd ~/projects/agent-orchestrator
pnpm dev
Visit http://localhost:5173. You should see the mac-window chrome and a green "live" indicator in the titlebar.
Click the + button in the tab bar. A new agent tab appears. It is empty (no worker spawned yet).
Type in the right-side chat panel. cmd+enter sends. This spawns a worker with your message as the prompt. The Moderator prompt template is prepended automatically; it tells the worker to ask for clarification if needed and otherwise go straight to work.
Worker output streams into the chat panel. If it spins up a localhost dev server, the middle pane iframes it (live, interactive). If it tries something gated (git push, edit ~/.env, etc.), the bottom bar lights up amber with the matched rule and layer chip.
Click approve or deny on the bottom bar. Decision is logged to audit_log and broadcast back to the worker via WS so it can continue (or handle the denial).
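The log-then-broadcast step above can be sketched roughly as follows. This is a minimal illustration, not the moderator's actual code: the `GateDecisionEntry` shape, its field names, and `recordDecision` are all hypothetical, and the audit log is stood in for by a plain array. Only the `gate_decision` kind is taken from the audit queries used elsewhere in this document.

```typescript
// Hypothetical shapes: the real audit_log schema is not documented here.
type Verdict = "approve" | "deny";

interface GateDecisionEntry {
  kind: "gate_decision"; // matches the ?kind=gate_decision audit filter
  rule: string;          // the matched rule shown in the bottom bar
  layer: string;         // the layer chip shown next to it
  verdict: Verdict;
  at: string;            // ISO-8601 timestamp
}

// Append the decision to the log, then hand it to a broadcast callback
// (for example, a function that writes to the worker's WebSocket).
function recordDecision(
  auditLog: GateDecisionEntry[],
  broadcast: (message: string) => void,
  rule: string,
  layer: string,
  verdict: Verdict,
): GateDecisionEntry {
  const entry: GateDecisionEntry = {
    kind: "gate_decision",
    rule,
    layer,
    verdict,
    at: new Date().toISOString(),
  };
  auditLog.push(entry);              // log first, so the decision survives a failed broadcast
  broadcast(JSON.stringify(entry));  // then notify the worker
  return entry;
}
```

Logging before broadcasting is a design choice in this sketch: if the send fails, the decision is still on record.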
Click the gear icon. Edit the blacklist tab. Add patterns. Save. The loader hot-reloads, the matcher uses new patterns on the next decision.
# ~/.orchestrator/blacklist.yml
bash:
  - "git push *"
  - "fly deploy *"
paths:
  - "**/.env*"
  - "~/personal-notes/**"
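To get a feel for how a `bash` pattern such as `git push *` might be checked against a command, here is a minimal sketch. It assumes `*` means "any run of characters" and that the whole command must match; the real matcher's semantics may differ. `globToRegExp` and `matchesBlacklist` are hypothetical names, and the `paths` patterns (with `**` and `~`) would need path-aware handling that this sketch does not attempt.

```typescript
// Hypothetical helper: turn a simple glob into an anchored RegExp.
// Only "*" is treated as a wildcard; every other character is literal.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Return the first pattern that matches the command, or undefined if none do.
function matchesBlacklist(command: string, patterns: string[]): string | undefined {
  return patterns.find((pattern) => globToRegExp(pattern).test(command));
}

// Patterns copied from the bash section of the example blacklist.
const bashPatterns = ["git push *", "fly deploy *"];

const hit = matchesBlacklist("git push origin main", bashPatterns);
const miss = matchesBlacklist("git status", bashPatterns);
console.log(hit, miss);
```

Under these assumptions a bare `git push` with no arguments would not match `git push *`, because the pattern requires the trailing space. Whether the real matcher behaves that way is worth checking before relying on it.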
API-only for now. UI tab coming.
curl http://localhost:3000/audit | jq
curl "http://localhost:3000/audit?kind=gate_decision&limit=20" | jq
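The same queries can be issued from a script. The sketch below builds the URL with the standard `URL` API, so the `&` between parameters never reaches a shell. `buildAuditUrl`, `fetchAudit`, and the `AuditQuery` type are hypothetical names. The `kind` and `limit` parameters are the ones used in the curl examples above. The response shape is not documented here, so it is left as `unknown`.

```typescript
// Hypothetical query type: only the parameters seen in the curl examples.
interface AuditQuery {
  kind?: string;
  limit?: number;
}

// Build an /audit URL against the moderator's base address.
function buildAuditUrl(base: string, query: AuditQuery = {}): string {
  const url = new URL("/audit", base);
  if (query.kind !== undefined) url.searchParams.set("kind", query.kind);
  if (query.limit !== undefined) url.searchParams.set("limit", String(query.limit));
  return url.toString();
}

// Fetch and parse the audit log. Requires a runtime with global fetch
// (for example, Node 18 or later) and a moderator listening on the base URL.
async function fetchAudit(base: string, query: AuditQuery = {}): Promise<unknown> {
  const response = await fetch(buildAuditUrl(base, query));
  if (!response.ok) {
    throw new Error(`audit request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(buildAuditUrl("http://localhost:3000", { kind: "gate_decision", limit: 20 }));
```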
Phases 1-2C give you a tool you can actually point at FBM Sniper for 30 nights. That data tells us which Phase 6 memory features matter and which Part B safety features are needed first. Building Phase 6 before the trial is guessing.
The win isn't the longest feature list. It's being the only tool where you genuinely close the laptop and trust it. Earned by Push Gate + audit log + Pending Learnings reliability, not by shipping every roadmap item.
An orchestrator modifying its own gate logic is the single feature most likely to break the trust pitch catastrophically. Phase 7 stays planned but only for non-safety code, and is not part of the public release.
Pending Learnings, skill provenance, token budgets, and the circuit breaker are all great features. They are also all worth more to a tool that has lived through real incidents than to one that hasn't. Ship the trial, learn, then build them with that data.