Reflecting on the first 750 runs of the Ministry of Everything

MoE Inbox Zero Dash

After a few weeks building out the Ministry of Everything we’ve completed 750 runs with it, mostly on self-improvement. Now we’ve begun using it to build other things. Here are some of the lessons we learned along the way:

Good context is a huge win, and it’s not hard to maintain (with MoE) - The “anti-social” aspect of MoE is that it’s not really designed for a team of humans. It’s technically possible to use it that way, but the core idea is that a single operator runs a team of agents. The nice thing about this shape is that all the conversations, docs, and other context are in the repo and available to the agents. There’s no hallway stuff they aren’t privy to - it’s all text they can read. And the agents like it! By incorporating some simple skills the agents are able to search to find why things were or were not done, spot recurring problems, and to use the digital-twin documentation about a project’s architecture and patterns. The feedback mechanisms are well-used by the agents too and we made them richer along the way. Agents can (and do) log follow-up tasks, make notes to improve the digitial-twin, and capture lore that could be useful for any project. We saw this compounding pay off over these 750 runs and the agents definitely developed a sense of taste that matched ours without a lot of tedious reminding.
Careful work is cheap - We were able to work to near burnout levels (for the human) with our $200/month Claude Max 20X plan. We eventually added a $100/month Codex plan to try it out, integrate support for Codex, and have a backup so we could keep going when Claude was often down. We never had to burn any credits past our plan limits. We attribute this efficiency to the up front stage with the human working through a design, and then handing that off for coding. The agents can also reference the concise markdown docs about past changes rather than having to go back and read tons of code. This saved a lot of churn, dead ends, retries, and wasted effort by the agents, and the agents were nearly always able to one-shot the implemenatation.
Deleting things will set you free - When coding is basically free it’s very easy to just add something and see how it feels, but as soon as you find something isn’t useful, delete it. We had several issues where we were working on a bug, realized it was related to something that we didn’t really use, and we just asked the agent to completely remove the thing right there. If it was cheap to create it’s cheap to put back later if you change your mind; don’t carry junk around. The one case where we did put something back (ultimately moe chain edit and the chain facility) we ended up with a way better concept and implementation than our earlier attempt at a work queue. Removing it freed us up to reimagine it and we took advantage of the cheap work of the agents to enable that.

Overall we are pretty happy with where MoE has landed, and we are using it for everything at this point. Our daily rhythm is:

Gather ideas as we work (moe idea new <slug>).
Interactively work through designs with the agent and get them ready to hand off to code (moe sdlc new --from-idea=<slug>).
When a batch of designs are ready, chain them together and kick them off (moe chain edit and then moe sdlc code <first slug in the chain>).
Grab a coffee, check back later, and profit!