The Log · Field Report · 13 min read

The Captain's Flywheel Explained

I didn't theorize this framework and apply it. I built 15 versions of an AI agent named Gary, and the methodology emerged from the practice. The flywheel was lived before it was named.

You're probably here because you saw the Captain's Flywheel methodology page and want the story behind it. Fair. Most operating frameworks are theory first, practice second. This one is the opposite.

I built fifteen versions of an AI agent named Gary before I had a name for what I was doing. Each Gary was a rotation. Each rotation taught me something the previous one couldn't. By the time I was on Gary v15, the pattern had crystallized so thoroughly that I could see the same shape in every venture I was running. That shape became the Captain's Flywheel.

The before state: what didn't work

Before the framework, I was doing what most operators do: try the latest AI tool, hope it sticks, get frustrated when the magic wears off after three days.

I tried OpenClaw. I tried Hermes. Copilot. Gemini. Claude. GPT. Each one promised something. Each one delivered some of it. None of them changed how I actually ran a business. I'd get a useful output, then go back to my old habits, and a week later the AI advantage was gone.

The pattern: tools came and went, but my operating model stayed the same. I was treating AI like a fancy autocomplete instead of like a fundamental change in how work could be organized. The tools were ready. The captain wasn't.

Gary v1: the wobbliest rotation

I built my first Gary as an experiment. OpenClaw was the framework. Gary was my customized agent, a research and writing assistant pointed at one of my ventures. I wrote a Gary.md (the OpenClaw equivalent of CLAUDE.md), turned him on, and watched what happened.

It was a mess. Gary did things I didn't ask. Skipped things I did ask. Made up facts. Used the wrong tone. Saved files in the wrong places. And the lesson wasn't "Gary is bad." The lesson was "Gary's context is bad."

Gary v2 had a better Gary.md. Tighter rules, clearer scope, a few examples of good output. He was less wrong, more often. Still not great.

By v5, the structure of Gary.md was settling: rules, examples, scope, and a "heartbeat file" Gary read at the start of every session. By v8, Gary was producing genuinely usable research drafts. By v12, he was running on a schedule and I trusted his output enough to skim instead of audit. By v15, Gary had become something close to what the open-source Command Kit is now: a fully articulated theory of his job, expressed as a file structure, that an AI could read and execute against.

The aha: the framework was never the goal

Somewhere around Gary v10, I stopped trying to make Gary better and started watching what I was doing to make Gary better. Three layers were always in motion:

  • I was constantly editing Gary.md. Adding rules, removing vague language, sharpening scope. (This was Command, though I didn't call it that yet.)
  • Gary was running. Producing output. Failing in interesting ways. (Compute.)
  • I was reviewing his output every morning. Spotting failure modes. Updating Gary.md based on what I saw. (Cadence.)

The pattern was the framework. I'd been running a Captain's Flywheel for ten months without noticing. Gary kept getting better not because I was getting better at AI, but because I was running a tight cadence on the relationship between us. Each rotation, my Command for Gary was sharper. Each rotation, Gary's Compute output was more useful. Each rotation, my Cadence review was faster.
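One rotation of that loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual Gary setup: Gary.md is modeled as a plain list of rules, and `run_agent`, `review`, `toy_agent`, and `toy_review` are hypothetical stand-ins for the agent run and the morning review.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Command:
    """The captain's written context (a simplified stand-in for Gary.md)."""
    rules: list[str] = field(default_factory=list)


def rotate(
    command: Command,
    run_agent: Callable[[Command], str],   # Compute: hypothetical agent call
    review: Callable[[str], list[str]],    # Cadence: returns lessons as new rules
) -> Command:
    """One rotation: run the agent, review the output, fold lessons into Command."""
    output = run_agent(command)
    lessons = review(output)
    # Skip rules we already have so the Command file stays tight.
    new_rules = [rule for rule in lessons if rule not in command.rules]
    return Command(rules=command.rules + new_rules)


# Toy stand-ins (assumptions) so the sketch runs end to end.
def toy_agent(command: Command) -> str:
    if "cite sources" in command.rules:
        return "draft"
    return "draft with made-up facts"


def toy_review(output: str) -> list[str]:
    return ["cite sources"] if "made-up" in output else []


command = Command(rules=["stay in scope"])
for _ in range(3):
    command = rotate(command, toy_agent, toy_review)
```

In this toy run the first rotation adds a rule and the later ones add nothing, which is the point of the loop: the review only changes Command when the output exposes a new failure mode.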

Once I named it, I started seeing the same shape in every other thing I was running. HBOT Finder's 3,000+ pages? Captain's Flywheel, scaled. World Wellness Guide's content engine? Same shape. The agency work, the board game ops, the hospitality back-end: each one was either spinning a flywheel or stuck because there wasn't one.

The Karpathy and Jensen connection

Around the same time, two things crystallized for me from outside.

Andrej Karpathy gave a talk in June 2025 at YC's AI Startup School called "Software Is Changing (Again)." His framing: Software 1.0 was code, Software 2.0 was neural network weights, Software 3.0 is prompting in English. LLMs are a new kind of computer, programmed in English. He named the shift and made it nameable for everyone else.

Then in March 2026, Jensen Huang at NVIDIA's GTC keynote said: "Every single company in the world today has to have an OpenClaw strategy. Just as we all needed a Linux strategy, an HTTP strategy, a mobile strategy, this is the new computing layer." He called OpenClaw "the operating system of agentic computers" and "the single most important release of software, probably ever."

Both of them were right. But they were describing what was happening at the tooling layer. The Captain's Flywheel describes what to do with the tooling at the operator layer. Karpathy and Jensen named the new computing primitives. The Captain's Flywheel is how a captain organizes themselves around those primitives so the work compounds instead of dispersing.

Software 3.0 says: programs are prompts now. The Captain's Flywheel says: prompts get great output only when they're written from a great Command, executed against well-chosen Compute, and reviewed in a Cadence that flows back into Command. Without the flywheel, Software 3.0 is just slightly better autocomplete. With the flywheel, it's a force multiplier.

The first time I ran the flywheel intentionally

HBOT Finder. Goal: scale to thousands of city pages in a niche where good content barely existed. Old approach would have been: hire a team, train them, set quality standards, manage SEO. Six-month build, expensive, brittle.

New approach (with the flywheel named):

  1. Command: Write a comprehensive research framework. What makes a city worth ranking for. What data we need. What page structure to produce. What SEO and AEO targets to hit. What past pages got wrong.
  2. Compute: Build a nightly Claude Code job that reads the framework, finds new cities, drafts pages. Cloudflare Workers handle deployment. Cron handles scheduling.
  3. Cadence: Each morning, review the new pages. Approve, request changes, decline. Update the research framework with what was wrong yesterday so today's job is sharper.
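The Cadence step above (approve, request changes, decline, then update the framework) can be sketched as a small data flow. This is an illustration only: the `Review` record, the `apply_reviews` helper, the city names, and the idea of storing the framework as a list of notes are all assumptions, not the actual HBOT Finder pipeline.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REQUEST_CHANGES = "request_changes"
    DECLINE = "decline"


@dataclass
class Review:
    city: str
    decision: Decision
    note: str = ""  # what was wrong with the draft, if anything


def apply_reviews(
    framework: list[str], reviews: list[Review]
) -> tuple[list[str], list[str]]:
    """Return (updated framework, cities approved for publishing)."""
    updated = list(framework)  # copy, so yesterday's framework is untouched
    approved: list[str] = []
    for review in reviews:
        if review.decision is Decision.APPROVE:
            approved.append(review.city)
        elif review.note and review.note not in updated:
            # A rejected draft teaches the framework something new.
            updated.append(review.note)
    return updated, approved


# Illustrative morning review (hypothetical cities and notes).
framework = ["Every page needs verified local data"]
reviews = [
    Review("Austin", Decision.APPROVE),
    Review("Boise", Decision.REQUEST_CHANGES, "Do not list unverified clinics"),
    Review("Tulsa", Decision.DECLINE, "Skip cities with no providers"),
]
framework, approved = apply_reviews(framework, reviews)
```

The design choice worth noting is that only non-approvals change the framework: approvals publish, everything else becomes input to the next night's job.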

Three months in: 3,000+ pages live. The system was producing more than I could review, so I taught Cadence to surface only the marginal ones. Six months in: the framework was so refined that 95% of pages passed first review. The flywheel was spinning.
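One way to surface only the marginal pages is a two-threshold triage. The sketch below assumes each draft carries a quality score between 0 and 1 from some automated check; the scores, the `low` and `high` thresholds, and the function names are all hypothetical, since the source does not say how the filtering was done.

```python
def triage(
    scores: dict[str, float], low: float = 0.4, high: float = 0.8
) -> tuple[list[str], list[str], list[str]]:
    """Split pages into (auto-approve, surface for review, auto-decline)."""
    auto_approve: list[str] = []
    surface: list[str] = []
    auto_decline: list[str] = []
    for page, score in scores.items():
        if score >= high:
            auto_approve.append(page)
        elif score < low:
            auto_decline.append(page)
        else:
            surface.append(page)  # marginal: needs the captain's eyes
    return auto_approve, surface, auto_decline


def first_pass_rate(passed: int, reviewed: int) -> float:
    """Share of reviewed pages that passed on first review."""
    if reviewed == 0:
        raise ValueError("no pages reviewed")
    return passed / reviewed


# Illustrative scores (assumed values, not real data).
approve, marginal, decline = triage({"page-a": 0.9, "page-b": 0.5, "page-c": 0.1})
```

Tracking `first_pass_rate` over time gives a rough signal of whether the framework is actually sharpening from one rotation to the next.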

None of it required hiring a team. All of it required me to keep refining Command, choose Compute deliberately, and never skip Cadence.

Three things this gets you that other operating systems don't

  1. Compounding instead of consumption. Most "AI for founders" frameworks treat AI as a tool you consume. The flywheel treats it as a system you compound. Year-five is dramatically different from year-one.
  2. Captain's edge stays sharp. The framework is built around the captain's ownership. You don't outsource your reasoning. You amplify it. (Per the Captain and Agents doctrine.)
  3. Tool-agnostic. Cancel Claude tomorrow, switch to Gemini, the framework still works. The Command Kit is yours. The Cadence is yours. Compute swaps. Most "AI ops" frameworks bind you to a vendor.
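The tool-agnostic point can be illustrated with a thin interface between Command and Compute. This is a sketch under assumptions: the `Compute` protocol and the two placeholder backends are hypothetical, and a real backend would call a vendor SDK where the placeholders just format a string.

```python
from typing import Protocol


class Compute(Protocol):
    """Anything that can execute a task against the captain's Command."""

    def run(self, command: str, task: str) -> str: ...


class BackendA:
    """Placeholder backend (assumption); stands in for one vendor."""

    def run(self, command: str, task: str) -> str:
        return f"[A] {task} (rules: {len(command.splitlines())})"


class BackendB:
    """Placeholder backend (assumption); stands in for another vendor."""

    def run(self, command: str, task: str) -> str:
        return f"[B] {task} (rules: {len(command.splitlines())})"


def draft(compute: Compute, command: str, task: str) -> str:
    """The Command text is passed in unchanged, whichever backend runs it."""
    return compute.run(command, task)


command_text = "stay in scope\ncite sources"
first = draft(BackendA(), command_text, "city page")
second = draft(BackendB(), command_text, "city page")
```

Swapping the backend changes one argument; the Command text and the review process around it stay where they are.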

Three valid critiques (that I'd raise myself)

  1. "It's just agile with AI." Partially true. The cadence structure mirrors agile sprints. But agile is about coordinating humans; the three Cs (Command, Compute, Cadence) are about coordinating a captain with their AI crew. Different unit of work. Different leverage profile.
  2. "It works for solo operators, but does it scale?" Honest answer: I haven't run a 100-person org with this framework, so I don't know. My intuition is that it scales with the number of captains, not the number of employees. A 100-captain network operating with 100 flywheels is different from a 100-person company with one flywheel. We'll find out.
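  3. "Naur was a programmer, not a business operator." True. The framework leans on Peter Naur's 1985 essay Programming as Theory Building, which argues that the real substance of a program is the theory its builders hold, not the text they write. Borrowing that "theory building" insight for business operations is a translation move, and translations leak. But the leak is small enough that the analogy holds, and the analogy is valuable enough that I'm willing to defend it.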

What v2 might look like

The framework's current form is for solo or small-team operators with technical comfort. v2 probably extends in three directions:

  • A non-technical operator's path (the Command Kit without git, using Obsidian directly).
  • A team-of-captains pattern for small organizations where multiple people maintain interlocking flywheels.
  • An agent-coordination layer for when one captain runs 50+ agents and needs better orchestration without becoming an orchestrator.

The framework will get critiqued, refined, and updated as more captains run their own. That's the point. We treat MaxShip itself like a Captain's Flywheel: Command (this writing), Compute (the site, the community, the Skool calls), Cadence (your feedback, the Captain's Log). The next rotation of MaxShip is sharper than this one because of what you, captain, send back.

What to do after reading this

Three options:

Build your own Gary. Iterate. Don't expect v1 to work. By v15, you'll wonder how you ever ran a business without one.

References

  1. Software Is Changing (Again), Software 3.0 · Andrej Karpathy, YC AI Startup School · June 2025
  2. Software 3.0 transcript · Latent Space · June 2025
  3. NVIDIA GTC 2026 keynote, OpenClaw strategy · Jensen Huang, NVIDIA · March 2026
  4. Programming as Theory Building · Peter Naur · 1985
  5. How the Flywheel Killed HubSpot's Funnel · HubSpot Marketing Blog · 2018
  6. The Captain's Flywheel, methodology · MaxShip · 2026