Most failures in agent-driven development come from brittle edit tools and brittle orchestration, not from the base LLMs. This project treats the harness as the product: it focuses on deterministic edits, scoped context, and category-driven model routing so that agents finish work instead of producing noisy diffs or stale-line edits.
What Sets It Apart
- Hash-anchored edit tooling: every line is tagged with a content hash and edits reference those tags, so an attempted change fails fast if the workspace has drifted since the model read it. So what: reduces merge corruption and the common 'stale-line' failure mode, where a model applies an edit to content that has since changed.
- Multi-agent discipline with model categories: an orchestrator (Sisyphus) delegates to specialists (Hephaestus, Prometheus, etc.) mapped to model categories rather than fixed providers. So what: you get parallel, role-specific agents (planner, deep worker, quick fixer) while keeping provider-agnostic fallbacks.
- LSP- and AST-aware rewrites and session tools: IDE-grade refactors, ast-grep rewriting, and tmux-backed interactive sessions for agents. So what: agent changes are surgical and verifiable, not blunt token-driven substitutions.
- Built-in MCPs and skill-embedded servers: scoped search/docs/GitHub connectors per skill to keep context windows small. So what: better retrieval, fewer hallucinations, and lighter context budgets during long-running tasks.
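To make the hash-anchored edit idea above concrete, here is a minimal Python sketch. It is not the project's implementation: the function names (`line_tag`, `tag_lines`, `apply_edit`), the `StaleEditError` exception, and the 8-character SHA-256 tag format are all assumptions chosen for illustration. It shows only the core check, that an edit is rejected when the target line no longer matches the hash the model saw.

```python
import hashlib


class StaleEditError(Exception):
    """Raised when an edit targets a line whose content has changed."""


def line_tag(text: str) -> str:
    # Hypothetical tag format: first 8 hex chars of the line's SHA-256.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]


def tag_lines(lines):
    # Pair each line with its 1-based number and content tag,
    # which is what a model would see and later reference.
    return [(n, line_tag(text), text) for n, text in enumerate(lines, start=1)]


def apply_edit(lines, line_no, expected_tag, new_text):
    # Refuse the edit unless the target line still hashes to the tag
    # the model saw when it planned the change.
    if not 1 <= line_no <= len(lines):
        raise StaleEditError(f"line {line_no} no longer exists")
    actual = line_tag(lines[line_no - 1])
    if actual != expected_tag:
        raise StaleEditError(
            f"line {line_no}: expected tag {expected_tag}, found {actual}"
        )
    # Return a new list so the caller's copy is never half-modified.
    return lines[: line_no - 1] + [new_text] + lines[line_no:]
```

In this sketch a caller would tag the file with `tag_lines`, hand the tagged view to the model, and pass the tag back with each edit. If another process rewrote the line in the meantime, `apply_edit` raises instead of overwriting the newer content.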
Who It's For
Great fit if you want to hand repetitive engineering work to autonomous agents while keeping control over correctness. That includes large repositories that benefit from atomic, hash-verified edits; teams adopting multi-model setups that mix hosted and local models; and developers who want an agent-driven workflow integrated with LSP and git. Look elsewhere if you need a lightweight single-model CLI with no external model subscriptions, or if you work in a strictly offline environment: some workflows assume access to higher-quality paid models and scoped MCPs.
Where It Fits
It acts as a middle layer between developer workflows and LLM providers: not a new model, but a harness that manages models, edit safety, and agent roles. It complements local-model toolchains and can be used alongside other harnesses that lack deterministic edit tooling.
Practical trade-offs
Adopting it reduces common agent integration failures but introduces operational choices: configuring model fallbacks, managing token costs for high-throughput runs (ultrawork can spin many parallel agents), and auditing agent permissions for destructive repo operations. The maintainers surface these trade-offs in tooling (doctor checks, config overrides) rather than hiding them.
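As a rough illustration of what configuring model fallbacks per category can involve, the Python sketch below resolves a category to the first usable model in an ordered chain. Everything in it is hypothetical: the category keys, the placeholder model names, `CATEGORY_FALLBACKS`, `resolve_model`, and `NoModelAvailable` are assumptions for this example and do not reflect the project's actual configuration format.

```python
class NoModelAvailable(Exception):
    """Raised when no model in a category's fallback chain is usable."""


# Hypothetical config: each category lists models in order of preference.
# The model names are placeholders, not real provider identifiers.
CATEGORY_FALLBACKS = {
    "planner": ["hosted-large", "hosted-medium"],
    "deep-worker": ["hosted-large", "local-large"],
    "quick-fixer": ["hosted-small", "local-small"],
}


def resolve_model(category, available, fallbacks=CATEGORY_FALLBACKS):
    # Walk the category's chain and return the first model that is
    # currently available (for example, one with credentials configured).
    if category not in fallbacks:
        raise KeyError(f"unknown category: {category}")
    for model in fallbacks[category]:
        if model in available:
            return model
    raise NoModelAvailable(f"no available model for category {category!r}")
```

Keeping agents bound to a category rather than a provider means that losing one model changes only which entry in the chain is picked, not the agent definitions themselves.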
