Local-first. Bring your own Claude or Codex. Push Gate-protected. Built for solo developers and small teams who need an always-on engineer that doesn't go rogue, doesn't burn credits unsupervised, and doesn't ask permission for things it should just handle.
The first user is a solo operator running a Facebook Marketplace bot (FBM Sniper) across multiple repos, build pipelines, and customer machines. Deploy errors, CI failures, and runtime crashes happen on no particular schedule and at no particular hour. This orchestrator is the on-call engineer that handles the boring 80%, escalates the interesting 20%, and never accidentally force-pushes to main.
Default Claude Code stops every 30 seconds to confirm trivial decisions. Productivity dies to interruption tax.
Bypass-permissions mode is faster, but it will eventually rm -rf your repo, force-push to main, or burn a hundred dollars in tokens before you notice.
Single agent, single thread, single line of attention. No critic, no reviewer, no second pair of eyes catching dumb mistakes.
Reads operator intent, asks any clarifying questions upfront (especially for UI scope or ambiguous goals), then goes silent and runs autonomously to completion. Front-loads ambiguity instead of interrupting mid-task.
The agent can edit files, run commands, fetch the web. Your blacklist.yml flags specific paths and commands that need approval. A hardcoded floor blocks true secrets regardless of config.
Independent tasks run in parallel as isolated worker subprocesses. Each tab in the dashboard is a live agent. Output streams in real time.
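Running workers in parallel usually needs a ceiling so a large task list does not spawn an unbounded number of subprocesses. The sketch below shows one way to enforce such a cap. The `WorkerPool` class, its method names, and the `start` callback are hypothetical and only illustrate the scheduling idea, not the project's actual Worker Manager.

```typescript
// Hypothetical sketch: a pool that never runs more than `cap` workers at once.
class WorkerPool {
  private readonly cap: number;
  private readonly start: (taskId: string) => void;
  private readonly running = new Set<string>();
  private readonly waiting: string[] = [];

  constructor(cap: number, start: (taskId: string) => void) {
    if (cap < 1) throw new Error("cap must be at least 1");
    this.cap = cap;
    this.start = start;
  }

  // Queue a task; it starts immediately if a slot is free.
  submit(taskId: string): void {
    this.waiting.push(taskId);
    this.fill();
  }

  // Call when a worker subprocess exits, freeing its slot for the next task.
  finish(taskId: string): void {
    if (this.running.delete(taskId)) this.fill();
  }

  get runningCount(): number {
    return this.running.size;
  }

  get waitingCount(): number {
    return this.waiting.length;
  }

  // Start waiting tasks in submission order until the cap is reached.
  private fill(): void {
    while (this.running.size < this.cap && this.waiting.length > 0) {
      const taskId = this.waiting.shift() as string;
      this.running.add(taskId);
      this.start(taskId); // e.g. spawn the subprocess for this task
    }
  }
}
```

With a cap of 2, submitting three tasks starts the first two at once and holds the third until one of them calls `finish`.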
Nous Research's Hermes Agent shipped the best agent-memory design we have seen: bounded, frozen mid-session, prefix-cache friendly, and curated by the agent itself. It works beautifully for single-developer, single-machine, single-project use. We kept the parts that work, fixed the layers that break the moment you have more than one repo or more than one machine, and added two layers specific to coding agents.
MEMORY.md (2,200 char cap) and USER.md (1,375 char cap) load into the system prompt at session start and stay immutable until the next session. Total persistent overhead under 1,300 tokens. This is what makes prefix caching work and what forces curation instead of context-stuffing.
Hermes stores memory globally at ~/.hermes/, so two projects compete for the same 2,200 char budget and entries from one repo leak into another. We add PROJECT.md, scoped per-repo (it interoperates with an existing CLAUDE.md). Multi-machine sync ships in Pro Cloud; team namespacing ships in the Team tier.
memory(action, scope, content) with actions add / replace / remove and scopes core / user / project. Substring matching for entries, same as Hermes. No read action because core memory is auto-injected.
session_search(query) for full-text search (FTS) over past sessions. Returns ranked snippets, not full transcripts. Cheap to call.
skill(action, name) for procedural memory. save captures a successful sequence; invoke replays it; list shows what is available.
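To make the memory tool concrete, here is a minimal sketch of how add / replace / remove with substring matching and a character cap could behave. Several details are assumptions rather than documented behavior: the `MemoryStore` class, the optional fourth `match` argument used by replace, the cap for the project scope (the text only gives caps for MEMORY.md and USER.md), and the rule that a substring must match exactly one entry.

```typescript
type Action = "add" | "replace" | "remove";
type Scope = "core" | "user" | "project";

interface MemoryResult {
  ok: boolean;
  message: string;
}

// core and user caps come from the text; the project cap is an assumed placeholder.
const CAPS: Record<Scope, number> = { core: 2200, user: 1375, project: 2200 };

// Characters used when entries are rendered one per line.
function used(entries: string[]): number {
  const separators = Math.max(0, entries.length - 1);
  return entries.reduce((total, entry) => total + entry.length, 0) + separators;
}

class MemoryStore {
  entries: Record<Scope, string[]> = { core: [], user: [], project: [] };

  // `match` is a hypothetical extra argument: the substring that selects
  // the entry to replace. For remove, `content` itself is the substring.
  memory(action: Action, scope: Scope, content: string, match?: string): MemoryResult {
    const list = this.entries[scope];
    if (action === "add") return this.commit(scope, [...list, content]);

    const needle = action === "remove" ? content : match ?? "";
    if (needle === "") return { ok: false, message: "a match substring is required" };
    const hits = list.filter((entry) => entry.includes(needle));
    if (hits.length !== 1) {
      return { ok: false, message: `expected 1 matching entry, found ${hits.length}` };
    }
    const index = list.indexOf(hits[0]);
    const next = [...list];
    if (action === "remove") next.splice(index, 1);
    else next[index] = content;
    return this.commit(scope, next);
  }

  // Reject any write that would push the scope past its cap.
  private commit(scope: Scope, next: string[]): MemoryResult {
    const size = used(next);
    if (size > CAPS[scope]) {
      return { ok: false, message: `${scope} would use ${size}/${CAPS[scope]} chars; consolidate first` };
    }
    this.entries[scope] = next;
    return { ok: true, message: `${scope} now uses ${size}/${CAPS[scope]} chars` };
  }
}
```

Rejecting an over-cap write, instead of truncating, is what pushes the agent to consolidate entries. In the frozen-snapshot design these writes would only reach the system prompt at the next session start.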
Capacity warnings fire at 80% fullness on any bounded layer, prompting consolidation. Decay archives entries unused after 30 sessions. Per-repo memory archives with the repo when the project is removed, so old context never haunts new work.
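The two lifecycle rules above reduce to a threshold check and a filter. This is a sketch under assumptions: the `MemoryEntry` shape and function names are hypothetical, and "unused after 30 sessions" is read here as 30 or more sessions since the entry was last used.

```typescript
// Hypothetical entry shape: text plus the session number that last used it.
interface MemoryEntry {
  text: string;
  lastUsedSession: number;
}

const WARN_RATIO = 0.8; // from the text: warn at 80% fullness
const DECAY_SESSIONS = 30; // from the text: archive after 30 unused sessions

// Returns a warning once a bounded layer reaches the threshold, else null.
function capacityWarning(usedChars: number, capChars: number): string | null {
  const ratio = usedChars / capChars;
  if (ratio < WARN_RATIO) return null;
  return `memory at ${Math.round(ratio * 100)}% of cap; consolidate entries`;
}

// Splits entries into those kept live and those moved to the archive.
function decay(
  entries: MemoryEntry[],
  currentSession: number,
): { kept: MemoryEntry[]; archived: MemoryEntry[] } {
  const kept: MemoryEntry[] = [];
  const archived: MemoryEntry[] = [];
  for (const entry of entries) {
    const idleSessions = currentSession - entry.lastUsedSession;
    // Assumed reading: idle for 30 sessions or more means archive.
    (idleSessions >= DECAY_SESSIONS ? archived : kept).push(entry);
  }
  return { kept, archived };
}
```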
A Hono HTTP+WebSocket server (the Moderator) owns a SQLite database and a pool of spawned Claude Code subprocesses (the Workers). Workers receive a permission MCP config that round-trips every sensitive tool call back to the Moderator for the Push Gate to evaluate. The operator watches everything live through a WebSocket-backed dashboard.
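Because each worker reports over stdout as a stream of JSON lines, the Moderator has to cope with chunks that split a line in the middle. The following is a minimal sketch of that buffering. `JsonLineParser` and `WorkerEvent` are hypothetical names, and skipping non-JSON lines is an assumed policy, not documented behavior.

```typescript
// Hypothetical event shape; a real stream carries typed fields.
type WorkerEvent = Record<string, unknown>;

class JsonLineParser {
  private buffer = "";

  // Feed one stdout chunk; returns every JSON object completed by it.
  push(chunk: string): WorkerEvent[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line (or ""), so keep it buffered.
    this.buffer = lines.pop() ?? "";

    const events: WorkerEvent[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed === "") continue;
      try {
        const parsed: unknown = JSON.parse(trimmed);
        if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
          events.push(parsed as WorkerEvent);
        }
      } catch {
        // Assumed policy: skip non-JSON noise rather than fail the stream.
      }
    }
    return events;
  }
}
```

In use, each `data` chunk from the subprocess would be passed to `push`, and the returned events forwarded to the dashboard.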
What a single request looks like when every phase is shipped. Left to right: a trigger fires (from any channel), the router classifies it, the Moderator clarifies and decomposes against live memory, workers execute in parallel across providers, reviewers verify, and the Memory Keeper persists everything that should outlive the session. Always-on processes (Watchdog, Scheduler, Error Ingest) inject new triggers back into the flow without operator action.
Vague natural-language request from dashboard chat, Discord DM, or HTTP API. No need to specify files, commands, or steps.
Reads the intent against repo context (CLAUDE.md, recent commits, memory layers). Asks any clarifying questions upfront, especially around UI scope, ambiguous goals, or unfamiliar repos. Once aligned, goes silent: produces a task list and decides one worker or several. No further interruptions until done or genuinely blocked.
Worker Manager spawns one Claude Code subprocess per task with a per-worker MCP config baking in the session and agent IDs. Output streams as JSON over stdout, parsed line-by-line.
Worker wants to Bash, Write, Edit, WebFetch, or Read a sensitive path. Permission MCP intercepts and POSTs to /internal/can-use-tool on the Moderator.
Matcher checks the operation against your blacklist.yml and the hardcoded floor. Most things pass through instantly. Only paths or commands that match a blacklist pattern queue to push_gate_queue and block the worker. WebSocket event fires to your dashboard or phone.
Dashboard shows a toast notification with the exact command and rule that matched. Approve or deny with one tap. Decision is written to DB, signal sent back to the waiting worker.
Approved tool call returns "allow" and the worker proceeds. When the worker finishes, a Critic agent reviews the diff (and for UI work, a Vision Critic screenshots the result and iterates).
Session status flips to succeeded. Dashboard updates live. Watchdog continues monitoring for runtime errors, CI failures, or drift signals that warrant a new task.
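The gate steps in this walkthrough amount to parking a matched tool call, showing it to the operator, and resuming the worker once a decision arrives. The sketch below models that with an in-memory queue. `PushGateQueue`, its fields, and the `resume` callback are hypothetical stand-ins: the real system persists to the push_gate_queue table and answers the worker over HTTP.

```typescript
type Decision = "allow" | "deny";

interface GateRequest {
  id: number;
  tool: string; // tool name, e.g. "Bash"
  input: string; // exact command or path shown to the operator
  rule: string; // blacklist pattern that matched
}

interface PendingEntry {
  request: GateRequest;
  resume: (decision: Decision) => void;
}

// In-memory stand-in for the push_gate_queue table (hypothetical shape).
class PushGateQueue {
  private nextId = 1;
  private readonly pending = new Map<number, PendingEntry>();
  readonly decided: Array<{ id: number; decision: Decision }> = [];

  // Park a matched tool call. `resume` is how the blocked worker gets its answer.
  enqueue(
    tool: string,
    input: string,
    rule: string,
    resume: (decision: Decision) => void,
  ): GateRequest {
    const request: GateRequest = { id: this.nextId++, tool, input, rule };
    this.pending.set(request.id, { request, resume });
    return request; // what a dashboard toast would display
  }

  // Record the operator's decision, then unblock the waiting worker.
  decide(id: number, decision: Decision): boolean {
    const entry = this.pending.get(id);
    if (entry === undefined) return false; // unknown or already decided
    this.pending.delete(id);
    this.decided.push({ id, decision });
    entry.resume(decision);
    return true;
  }

  // Requests still waiting for an operator decision.
  list(): GateRequest[] {
    return Array.from(this.pending.values(), (entry) => entry.request);
  }
}
```

Removing the entry before calling `resume` makes a second tap on the same toast a harmless no-op.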
The Push Gate is not a deny-by-default trap door. The default is permissive: the agent can edit files, run shell commands, fetch the web, and use the system the way you would. You maintain a small blacklist of paths and command patterns that always require your approval, and there is a hardcoded floor of true-secret paths that nothing can override regardless of your config. Everything else runs at full speed without round-tripping for approval.
No round trip, no config, no questions. These cannot modify state or exfiltrate secrets.
Run by default. Round trips only when the target matches a pattern in your blacklist.yml.
Always gated. Cannot be overridden by config. Protects you from your own typos.
Lives at ~/.orchestrator/blacklist.yml globally, with per-repo overrides at <repo>/.orchestrator/blacklist.yml. Pattern syntax is glob for paths, shell-style for bash. Anything not listed runs without prompting.

    # ~/.orchestrator/blacklist.yml
    # Tier 2 operations are free by default. List patterns here that
    # should ALWAYS require your approval, even though the file or
    # command would otherwise be permitted.
    paths:  # File writes / edits that need approval
      - "**/.env*"
      - "**/.git/config"
      - "**/package.json"        # prevent silent dep bumps
      - "~/personal-notes/**"    # private journal
      - "~/projects/fbm-sniper/pro/src/license/**"
    bash:  # Commands that need approval
      - "git push *"             # memory:no_push_default
      - "gh release create *"    # memory:no_public_release
      - "gh pr create *"
      - "fly deploy *"
      - "npm publish *"
      - "pnpm publish *"
      - "docker push *"
    mcp:  # Specific MCP tool calls that need approval
      - "mcp__*__delete_*"
      - "mcp__github__merge_pull_request"

Tier 3 is enforced regardless. Even if you accidentally blank your blacklist.yml, the agent still cannot read ~/.ssh/id_rsa, run sudo, or pipe untrusted scripts into a shell. The hardcoded floor exists so a misconfiguration cannot become a security incident.
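The ordering matters: the floor is checked before the user's blacklist, and anything matching neither runs freely. The sketch below illustrates that order with a simplified pattern translator. Most of it is assumption: the `FLOOR` entries are illustrative only (the real hardcoded list is not given here), targets are assumed to be normalized to `~/...` form, and the glob semantics (`**/` spans directories, `*` stays within one path segment, `*` in bash patterns matches anything) are one plausible reading of "glob for paths, shell-style for bash".

```typescript
type Verdict = "allow" | "ask" | "floor";

interface Operation {
  kind: "path" | "bash";
  target: string; // a file path or a full command line
}

interface Blacklist {
  paths: string[];
  bash: string[];
}

// Illustrative floor only; not the project's actual hardcoded list.
const FLOOR: Blacklist = { paths: ["~/.ssh/**"], bash: ["sudo *"] };

// Translate a pattern into an anchored regular expression (simplified).
function globToRegExp(pattern: string, kind: Operation["kind"]): RegExp {
  let source = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern[i];
    if (ch !== "*") {
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // escape regex syntax
      continue;
    }
    const doubled = pattern[i + 1] === "*";
    if (kind === "bash") {
      source += ".*"; // in commands, "*" matches any text
      if (doubled) i++;
    } else if (doubled && pattern[i + 2] === "/") {
      source += "(?:.*/)?"; // "**/" spans zero or more directories
      i += 2;
    } else if (doubled) {
      source += ".*"; // "**" spans everything below
      i++;
    } else {
      source += "[^/]*"; // "*" stays within one path segment
    }
  }
  return new RegExp(`^${source}$`);
}

// First pattern in the list that matches the operation, if any.
function firstMatch(list: Blacklist, op: Operation): string | undefined {
  const patterns = op.kind === "path" ? list.paths : list.bash;
  return patterns.find((pattern) => globToRegExp(pattern, op.kind).test(op.target));
}

// Floor first, then the user's blacklist, otherwise run without prompting.
function classify(op: Operation, blacklist: Blacklist): { verdict: Verdict; rule?: string } {
  const floorRule = firstMatch(FLOOR, op);
  if (floorRule !== undefined) return { verdict: "floor", rule: floorRule };
  const userRule = firstMatch(blacklist, op);
  if (userRule !== undefined) return { verdict: "ask", rule: userRule };
  return { verdict: "allow" };
}
```

Because `classify` consults `FLOOR` before the user's list, an empty blacklist still yields a `floor` verdict for `~/.ssh/id_rsa`. One caveat of this simplified translation: `git push *` does not match a bare `git push` with no arguments, so a real matcher has to decide how to treat that case.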
The system is a small society of specialized agents. Each has a different model, effort level, and tool budget. Cheap models do repetitive work. Opus 4.7 does the thinking. Vision-capable models do the seeing.
The Push Gate itself is implemented as a stdio MCP server (one per worker, with session and agent IDs baked into each config). Additional MCPs plug in the same way. Anything stdio-compatible works without code changes.
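A per-worker config with baked-in IDs could be generated as shown below. The top-level `mcpServers` map with `command`, `args`, and `env` follows the MCP config file shape that Claude Code reads. Everything else is a hypothetical placeholder: the `push_gate` server name, the `permission-mcp.js` script, the `ORCH_*` variable names, and the Moderator URL.

```typescript
interface McpServer {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServer>;
}

// Build the config one worker receives. All names below are placeholders.
function buildWorkerMcpConfig(
  sessionId: string,
  agentId: string,
  extraServers: Record<string, McpServer> = {},
  moderatorUrl: string = "http://127.0.0.1:8787", // assumed local address
): McpConfig {
  return {
    mcpServers: {
      ...extraServers,
      // Listed last so an extra server cannot replace the gate by reusing its name.
      push_gate: {
        command: "node",
        args: ["permission-mcp.js"],
        env: {
          ORCH_SESSION_ID: sessionId,
          ORCH_AGENT_ID: agentId,
          ORCH_MODERATOR_URL: moderatorUrl,
        },
      },
    },
  };
}
```

Serializing the result with `JSON.stringify` gives the file a worker would be launched with. Any additional stdio MCP goes in through `extraServers` without touching the gate entry.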
Single worker, stream-JSON parsing, SQLite persistence, basic HTTP API and SSE. Foundation everything else builds on.
Permission MCP, matcher with sensitive-path detection, parallel workers with concurrency cap, WS hub with topic subscriptions, dev-server detection, per-worker MCP config files.
Vite + React + Tailwind + shadcn UI with Cursor-style chrome, multi-tab agents, collapsible chat, live dev-server iframe, toast notifications, model-per-role settings.
Webhook receiver for CI failures and runtime errors, cron-style scheduled sessions, repo-specific priming (auto-load CLAUDE.md from any project root). FBM Sniper SRE use case.
Reviewer agent for code diffs. Vision-capable critic screenshots dev-server output, evaluates against intent, asks worker to iterate.
Git worktrees per worker so parallel agents can edit independent branches. Steward role coordinates merges and resolves conflicts.
Cloudflare Tunnel for dashboard access from anywhere. Cloudflare Access for auth. Discord bot for phone-driven operator input.
Always-on monitoring layer. Detects stuck workers, triages incoming errors, maintains durable context across sessions. The "SRE for your bot" layer.
Marketplace for community-built agent recipes. Self-improvement loop where the orchestrator can propose and ship changes to its own code (through the same gates).
| Capability | This | Cursor | Devin | OpenHands | Aider | Hermes | OpenClaw |
|---|---|---|---|---|---|---|---|
| Multi-agent parallel | Yes | No | No | Limited | No | No | Sub-agents |
| Granular Push Gate | Yes | Per-tool | No | No | Per-edit | No | Plugin-level |
| Bounded frozen-snapshot memory | Yes | No | No | No | No | Yes (origin) | State store |
| Per-repo memory scoping | Yes | Project rules | No | Workspace | CONVENTIONS.md | Global only | Plugin state |
| Procedural skill capture | Yes | No | No | No | No | Hand-authored | Plugins |
| Scheduled / cron execution | Planned | No | No | No | No | Yes | Yes |
| Multi-channel input (Discord, email, webhook) | Planned | No | Slack | No | No | Limited | Yes (origin) |
| Always-on monitoring (Watchdog) | Planned | No | No | No | No | Crons only | Crons only |
| Phone / remote operator input | Planned | Cloud | Web | No | No | No | Telegram |
| BYO Claude Max / Pro (post-Apr 2026) | Yes (subprocess) | Yes | API only | Workarounds | Workarounds | API only | Blocked |
| Local-first (your machine) | Yes | Hybrid | Cloud | Yes | Yes | Yes | Yes |
| Source available | Yes (planned) | No | No | Yes | Yes | Yes | Yes |
| Per-role model selection | Yes | Per-mode | No | Per-session | Yes | Per-skill | Yes |
"Origin" marks where an idea was pioneered. We borrow Hermes' frozen-snapshot memory and Skills design, and OpenClaw's multi-channel input and cron pattern, then extend each for the developer use case (per-repo scoping, multi-machine sync, official-CLI subprocess auth, push-gate safety).
The codebase is planned for AGPL-3.0 release. Community Edition runs single-machine with every role, the full Push Gate, and the dashboard. Pro Cloud is a paid tier focused on infrastructure the user cannot easily run themselves.
Moderator, all roles, multi-agent, Push Gate, dashboard, model-per-role configuration, full five-layer memory stack, single-machine operation. AGPL-3.0 license, source-available, modify freely.
Hosted Cloud Tunnel for phone access, hosted Discord bot, multi-machine memory sync, encrypted cloud backup of agent history, auto-updater, priority support. The infrastructure layer the Community Edition cannot replicate trivially.
In April 2026 Anthropic blocked Pro and Max OAuth tokens from working in third-party tools, breaking BYO-subscription auth for OpenClaw, custom harnesses, and most "Devin-but-mine" attempts. Those tools called the Anthropic API directly with the user's token, which Anthropic now refuses.
We do not call the API. We spawn the official claude CLI as a subprocess. The CLI is Anthropic's own client and retains full Pro / Max access. Our orchestrator never sees a token, never sends a request, never violates ToS. We just listen on the CLI's stdout and route its tool calls through the Push Gate.
Practical result: Claude Pro and Max users save real money. Usage that would cost hundreds of dollars at API rates stays at a flat $20-200/mo subscription. This is the single largest cost advantage we have over any post-April-2026 competitor that takes the API-key route.