• This is a really good framing, honestly. Spec-first tools are useful, but milestone-first feels much better for actually learning and not getting lost after step 2.

    I like that you keep things verifiable at each step too; that's the part most AI coding tools skip, and then people think they're progressing when they're just generating code.

    How do you decide milestone size so it's not too tiny but also not overwhelming? Do you track failed attempts/retries as part of progress, or only completed milestones? Also, are you planning a mode where users can switch between learning path and spec mode in the same project? I think that could be very strong.

    • Thank you! Completely agree with you.

      Milestone size is basically tuned around one question: "can the user verify this in a short time without ambiguity?" If it is smaller than that, it becomes noise and hurts progress. If it is larger, people get stuck in a blob of work and lose the sense of forward motion. So the target is usually one meaningful capability change with one clear check - not a pile of subtasks and not a full feature.

      On progress, I think completed milestones should stay the primary unit because that keeps the system honest. Retries and failed attempts still matter as signal, though we don't track them yet. I would not count them as progress in the same sense, but I would record them as learning/debugging history: where people got stuck, how many tries something took, and whether a milestone needs to be split or clarified. That is useful both for the user and for improving the recipe. Great point!
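      As a rough sketch of what that separation could look like (all names here are hypothetical, not Primer's actual data model): completed milestones are the only progress metric, while every attempt, pass or fail, is logged as history.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    """One try at a milestone's verification check."""
    milestone_id: str
    passed: bool
    note: str = ""


@dataclass
class ProjectProgress:
    milestones: list[str]
    attempts: list[Attempt] = field(default_factory=list)

    def record(self, milestone_id: str, passed: bool, note: str = "") -> None:
        # Every attempt is kept as history, regardless of outcome.
        self.attempts.append(Attempt(milestone_id, passed, note))

    def completed(self) -> set[str]:
        # Progress is only the set of milestones with a passing check.
        return {a.milestone_id for a in self.attempts if a.passed}

    def retries(self, milestone_id: str) -> int:
        # Failed attempts count as signal, not as progress.
        return sum(
            1 for a in self.attempts
            if a.milestone_id == milestone_id and not a.passed
        )

    def needs_split(self, milestone_id: str, threshold: int = 3) -> bool:
        # Many retries hint the milestone should be split or clarified.
        return self.retries(milestone_id) >= threshold
```

      Retry counts per milestone then become a cheap signal for which milestones to split or reword, without ever inflating the completion number.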

      And yes, I think mixing learning-path mode and spec mode inside the same project is important. A lot of real work starts milestone-first while you are learning the domain, then becomes spec-first once you know what you are building. So the model I like is one project with two views. Right now, Primer supports learner and builder tracks for the whole project/recipe. I would like to incorporate your feedback and allow switching between tracks.

  • How do you think Primer will evolve as agents get better at long-horizon tasks? The milestone and verification loop makes sense now, since agents fail unpredictably on complex tasks. Do tasks just get bigger and milestones scale with them?
  • Planning and specs are what makes AI capable and reliable
    • Agree. Right now, AI models are kind of at a plateau, so harnesses are really important.
  • Have you seen openspec, or any of the others in this space?
    • Yes. They are trying to make AI-assisted development more structured. With Primer I am focusing more on the learning-path side: breaking things into small, verifiable milestones that one completes step by step, rather than defining a full spec upfront.
      • That said I think there is definitely overlap and I am interested in borrowing ideas where it makes sense.