182 points by snide 4 days ago | 40 comments
  • This "screenshot -> refine loop" is a great strategy and I have built it into my 3D Modeling product as well[0], but had to disable it because it would often quadruple the costs and the product is already expensive.

    I am on standby to enable it though, just need the price to drop a bit more!

    [0]: https://grandpacad.com
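
    A rough sketch of the arithmetic behind "quadruple the costs", not a measurement from the product. It assumes (hypothetically) that each refine pass costs about as much as the first draft, since the context is re-sent along with a screenshot; `loop_cost_multiplier` and its parameters are made-up names for illustration.

```python
def loop_cost_multiplier(refine_rounds: int, round_cost_ratio: float = 1.0) -> float:
    """Total cost relative to a single one-shot generation.

    refine_rounds: number of screenshot -> refine passes after the first draft.
    round_cost_ratio: assumed cost of one refine pass relative to the first
        draft (hypothetical; covers re-sent context plus the image).
    """
    if refine_rounds < 0:
        raise ValueError("refine_rounds must be non-negative")
    return 1.0 + refine_rounds * round_cost_ratio


# Three refine passes at roughly the cost of the first draft.
print(loop_cost_multiplier(3))  # 4.0
```

    Under that assumption, capping the number of passes is the most direct lever on cost.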

    • My late maternal grandfather was Slovenian, so I enjoyed your project's backstory. I've mucked around with ChatGPT and OpenSCAD so can identify with that also. Great concept and best of luck!
    • Cool button
  • Really good. I’ve struggled with the same thing.

    > Instead of expecting it to understand my requests, I almost always build tooling first to give us a shared language to discuss the project.

    This is probably the key. I’ve found this to be true in general. Building simple tools that the model can use helps frame the problem in a very useful way.

  • >I still occasionally hand write code in NeoVim on the bits I care the most about (CSS, design and early architecture like API patterns)

    I find it amazing how people's opinions differ here. This is the first stuff I'd trust to Claude and co. because it is very much in-distribution for training data. Now if I had sensitive backend code or a framework/language/library that is pretty new or updated frequently, I'd be much more cautious about trusting LLMs or at least I would want to understand every bit of the code.

    • I think OP nailed it with 'the bits I care the most about'—if you like those things a certain way, then you'll want to make sure they are that way, not accept whatever Claude does. If you don't care, you just want something done, then you'll have Claude do it while you work on what you do care more about.
      • That's true. Not everyone cares about shipping working software.
    • I think the main point is that LLMs are pretty good at following existing patterns and conventions.

      If you set up your skeleton in a way that is familiar to you, reviewing new features afterwards is easier.

      If you let the LLM start with the skeleton, they may use different patterns, and in the long run it's harder to keep track of them.

      • > they may use different patterns

        "Bad" is the word you're looking for, not "different".

    • > in-distribution for training data

      Engineers are an opinionated bunch, safe to say at least a small chunk of us will disagree with what goes into the training pile.

      For me, it's preferring Deno-style pinned imports vs traditional require() or even non-versioned ecmascript import syntax.

  • I've been using Codex (GPT-5.4 extra high) to code custom FeatureScript in Onshape (3D mechanical CAD software). It's challenging to get it to do TDD that involves any visual reasoning. At the moment I've got tooling through Google Chrome Devtools MCP and Playwright to extract things and control the browser and I use some custom features which help with formatting and controlling debugging outputs (text and visual overlays). Mostly the text debugging outputs are very helpful to Codex. It will often add debugging payloads when we're focused on a particular issue. I do occasionally take screenshots and paste them into Codex and explain the issue that I'm seeing. It seems to understand a certain amount, especially if the issue can be seen in orthogonal views.
  • Just yesterday I used Claude to great effect in FreeCAD to model a church tower. The tower has a square base and an octagonal top, but connecting the two by creating a loft using the GUI in FreeCAD results in a wrong and ugly abomination.

    Claude understood the problem and produced elegant Python code that worked perfectly the first time.

    So I continued and described the other features of the tower to Claude, who coded them.

    It's sometimes difficult to properly describe what you want in English, and Claude does a lot of thinking, and sometimes goes deep in a wrong direction that it won't easily come out of; but in the end the result is almost perfect.

    • I'm not sure what went wrong exactly, but a common issue people run into with lofts is caused by the fact that FreeCAD tries to match vertices to vertices. If you use the split edge tool on your square to create 8 edges out of the existing 4 (maintaining the same square shape), it should loft smoothly to the octagon.

      If the loft appears twisted it only means you need to change the order in which you created the geometry. FC in addition to matching vertex to vertex will also match them in the order they were created. This allows you to purposely create twisted shapes, but sometimes can happen inadvertently also.
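
      To illustrate the ordering point in plain Python (no FreeCAD calls, and the helper names are hypothetical): if vertex i of one profile is paired with vertex i of the other, two loops that start at different corners give a twisted result. A minimal sketch of finding the cyclic shift that untwists the pairing:

```python
import math


def ring(n, radius, start_deg=0.0):
    """n points on a circle, counter-clockwise, starting at start_deg."""
    return [
        (radius * math.cos(math.radians(start_deg + 360.0 * i / n)),
         radius * math.sin(math.radians(start_deg + 360.0 * i / n)))
        for i in range(n)
    ]


def pairing_cost(a, b, shift):
    """Sum of squared distances when a[i] is paired with b[i + shift]."""
    n = len(a)
    return sum(
        (a[i][0] - b[(i + shift) % n][0]) ** 2
        + (a[i][1] - b[(i + shift) % n][1]) ** 2
        for i in range(n)
    )


def best_shift(a, b):
    """Cyclic shift of b that minimizes twist relative to a."""
    return min(range(len(a)), key=lambda s: pairing_cost(a, b, s))


bottom = ring(8, 10.0, start_deg=0.0)
top = ring(8, 6.0, start_deg=90.0)  # same shape, drawn starting 90 degrees later

print(best_shift(bottom, top))  # 6
```

      A non-zero shift here corresponds to the two profiles having been created starting from different vertices.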

      • Here's a photo of the church tower in question:

        https://img.over-blog-kiwi.com/1/40/25/08/20170227/ob_76f390...

        You can see that the square base is connected to the octagonal tower using alternating trapezoids and triangles; FreeCAD doesn't seem capable of doing that using the GUI (but I'm no expert and would love to be proved wrong).

        • This is what the other person was trying to describe: https://imgur.com/a/J9lQBNK

          I chose 1mm for my corner chamfer on the base, but you could make it any dimension including something imperceptible.
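
          A plain-Python sketch of why the chamfer trick works, under the assumption that the loft pairs vertices one to one (pure geometry, no FreeCAD API; function names and dimensions are illustrative). Chamfering the corners turns the square into an 8-vertex loop, so each side pairs with an octagon flat and each chamfer edge pairs with an octagon corner flat:

```python
import math


def chamfered_square(half_side, chamfer):
    """8 vertices, counter-clockwise, of a square with its corners cut off."""
    h, c = half_side, chamfer
    return [(h, -h + c), (h, h - c), (h - c, h), (-h + c, h),
            (-h, h - c), (-h, -h + c), (-h + c, -h), (h - c, -h)]


def octagon(radius):
    """Regular octagon, counter-clockwise, with flats facing the square's sides."""
    return [(radius * math.cos(math.radians(-22.5 + 45 * i)),
             radius * math.sin(math.radians(-22.5 + 45 * i))) for i in range(8)]


def edge_lengths(loop):
    """Length of each edge loop[i] -> loop[i + 1], wrapping around."""
    n = len(loop)
    return [math.dist(loop[i], loop[(i + 1) % n]) for i in range(n)]


base = chamfered_square(half_side=10.0, chamfer=1.0)
top = octagon(radius=8.0)

for i, (b, t) in enumerate(zip(edge_lengths(base), edge_lengths(top))):
    # Even faces sit on a full square side; odd faces sit on a chamfer edge.
    kind = "trapezoid" if i % 2 == 0 else "near-triangle (chamfer edge)"
    print(f"face {i}: base edge {b:.2f}, top edge {t:.2f} -> {kind}")
```

          Shrinking `chamfer` toward zero makes the odd faces approach true triangles, which matches the alternating trapezoids and triangles on the real tower.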

          • Aah thank you!!!

            Yes that would have worked, I will try that next time. The last image is more or less what the Python code produced, but being able to do it in the GUI would be nice.

    • I've been struggling to get Claude or Gemini to make a project enclosure. I've got it down to a decent set of requirements and have been trying FreeCAD Python output, OpenSCAD, and Build123d. Most attempts fail to generate output (after several minutes of thinking and getting a good way there). When attempts don't fail they get aspects wrong. At this point I think my workflow needs to be breaking up the request into a series of functions for each feature then iterating on the overall model.
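
      One way the "function per feature" idea could look, as a minimal sketch: plain Python that emits OpenSCAD text, so each feature can be requested and reviewed on its own. The feature names, dimensions, and the cable hole are hypothetical, not taken from the actual enclosure requirements.

```python
def outer_shell(w, d, h):
    """Solid outer block of the enclosure."""
    return f"cube([{w}, {d}, {h}]);"


def inner_cavity(w, d, h, wall):
    """Open-topped cavity; it pokes through the top face so the cut is clean."""
    return (f"translate([{wall}, {wall}, {wall}]) "
            f"cube([{w - 2 * wall}, {d - 2 * wall}, {h}]);")


def cable_hole(d_total, z, diameter, wall):
    """Round hole through the x = 0 wall, centered across the depth."""
    return (f"translate([-1, {d_total / 2}, {z}]) rotate([0, 90, 0]) "
            f"cylinder(h = {wall + 2}, d = {diameter}, $fn = 48);")


def enclosure(w=60, d=40, h=25, wall=2):
    """Compose the features: the first child is the body, the rest are cuts."""
    cuts = [inner_cavity(w, d, h, wall), cable_hole(d, h / 2, 6, wall)]
    body = "\n".join("  " + line for line in [outer_shell(w, d, h), *cuts])
    return "difference() {\n" + body + "\n}\n"


print(enclosure())
```

      Each function is small enough to regenerate or fix in isolation, and the composed output can be pasted into OpenSCAD to check the overall model.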
    • Can you describe how you hooked Claude up to FreeCAD?

      I've messed around with FreeCAD a bit (I'm still a beginner) and was just saying today I'd like to play around with trying to use LLMs for 3D modelling.

      EDIT: I found this MCP[0] after searching, was it that? The docs mention Claude Desktop, but I assume it works fine with Claude Code too, right?

      [0] https://github.com/neka-nat/freecad-mcp

      • I didn't hook up anything to anything, I copied-and-pasted the Python output of Claude to the Python console of FreeCAD. Run, check, delete, repeat.
      • Hey, I'm the OP. I originally started with FreeCAD. There's not much to "hook up" to Claude. It can natively write for FreeCAD. You don't need to use the FreeCAD editor and can point to an external, local file with an import. At that point there's not much more than pointing your LLM to that file. You'll need to tell the FreeCAD desktop app to update on changes.

        Eventually I moved to JSCAD for the application mentioned in my blog post because I realized I wanted a more complex UI (which meant a web app) than what FreeCAD provided natively. If you're looking for something simple with some var statements though, FreeCAD might be enough.

        In my experience, the MCP isn't really needed. Claude at least already can write the code pretty well. The problems are more with getting it to understand the output, which the blog post covers.

        • Sorry, what format of local file can it work with?

          I can see it working with scad, and then having that generate some things. I'd imagine it'd struggle with an STL file. I don't know much about the format of FCStd files but I'd find it surprising if it worked fine. Obviously it would be fine with three.js code and the like.

          It might be my lack of knowledge, because I've mostly just used FreeCAD to create and edit things and then exported to STL (which doesn't feel like the thing Claude would be good at modifying).

  • This is where I'm at too and why I wound up building my own ticketing system similar to Beads. I got tired of my personal frustrations (and errors from Beads when switching branches). I think it's fine to build tools to help you be more productive with the model, it makes a lot of sense.
  • I've been having pretty good success with Unity as a 3D LLM tool. In addition to the iso views I've included a perspective mode that can focus on a list of game object ids with a custom camera origin. The agent is required to send instructions along with the VLM request each time in order to condition how the view is interpreted. E.g.: "How does ambient occlusion look in A vs B?".

    The VLM is invoked as a nested operation within a tool call, not as part of the same user-level context. This provides the ability to analyze a very large number of images without blowing token budgets.

    I've observed that GPT-5.4 can iteratively position the perspective camera and stop once it reaches subjectively interesting arrangements. I don't know how to quantify this, but it does seem to have some sense of world space.

    I think much of it comes down to conditioning the vision model to "see" correctly, and willingness to iterate many times.
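
    A minimal sketch of the nested-call idea, assuming a hypothetical `vlm` callable that takes image bytes plus an instruction and returns text. None of these names come from the actual Unity tooling. The point is that images stay inside the tool call and only a short text report returns to the agent's context:

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical signature: (image_bytes, instruction) -> short text answer.
VisionModel = Callable[[bytes, str], str]


@dataclass
class ViewRequest:
    image: bytes  # a rendered view, e.g. one iso or perspective shot
    label: str    # how the view is named in the report


def inspect_views(views: Sequence[ViewRequest], instruction: str,
                  vlm: VisionModel, max_chars: int = 200) -> str:
    """Tool entry point: images go to the VLM, only short text comes back."""
    if not instruction.strip():
        raise ValueError("an instruction is required to condition the VLM")
    lines = []
    for view in views:
        answer = vlm(view.image, instruction)
        lines.append(f"{view.label}: {answer[:max_chars]}")
    return "\n".join(lines)


# Stand-in VLM for demonstration only; a real one would call a vision model.
def fake_vlm(image: bytes, instruction: str) -> str:
    return f"{len(image)} bytes inspected for: {instruction}"


report = inspect_views(
    [ViewRequest(b"\x89PNG-a", "A"), ViewRequest(b"\x89PNG-bb", "B")],
    "How does ambient occlusion look in A vs B?",
    fake_vlm,
)
print(report)
```

    Truncating each answer with `max_chars` is one simple way to keep the returned text bounded no matter how many images are analyzed.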

  • Great article. I've been trying to achieve something similar with Revit. It's an old CAD application for Windows, which means there are a few additional hurdles in exposing a CLI that allows the LLM to drive it. However, once that is done, the loop of "write code, take a screenshot, repeat" works pretty well.
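
    The "write code, take a screenshot, repeat" loop can be sketched generically. This is an illustration with injected, hypothetical callables; it does not use any real Revit or model API, and the stubs below only simulate a model that gets it right on the second try:

```python
from typing import Callable, Optional


def refine_loop(
    write_code: Callable[[Optional[str]], str],  # feedback -> new script
    run_and_capture: Callable[[str], bytes],     # script -> screenshot bytes
    critique: Callable[[bytes], Optional[str]],  # screenshot -> feedback, None if OK
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Repeat write -> screenshot -> critique until accepted or out of budget."""
    feedback: Optional[str] = None
    script = ""
    for round_no in range(1, max_rounds + 1):
        script = write_code(feedback)
        shot = run_and_capture(script)
        feedback = critique(shot)
        if feedback is None:
            return script, round_no
    return script, max_rounds


# Stub demo: the "model" fixes the wall height on the second attempt.
attempts = iter(["wall(height=1)", "wall(height=3)"])
script, rounds = refine_loop(
    write_code=lambda fb: next(attempts),
    run_and_capture=lambda s: s.encode(),
    critique=lambda shot: None if b"height=3" in shot else "wall is too short",
)
print(script, rounds)  # wall(height=3) 2
```

    The `max_rounds` cap matters in practice, since every extra round is another model call plus an image.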
    • I’d be interested if you can describe a bit more of this process, and what kind of modelling in Revit works well with this.
  • My mind-blown moment was when I was doing work like this and Claude started positioning the camera itself to get better looks at areas it wanted to improve.
  • I tried to generate Python scripts for Blender with local models a while back and the results were pretty bad. I assume the frontier models of today would fare much better.
  • gemini on the other hand, isn't half bad.

    all i wanted was some opinions on if my bad idea would work, but it instead wrote me files for making my own sony earphones in 3ish parts.

    and when i sewed it together, it worked!

    that said, it did have full access to a mini CAD app, but i think it wrote all its own calculations inline

    • Gemini's best ability is its 3D spatial reasoning. It's downright terrible at a lot of things (tool calling is an absolute nightmare), but it consistently wins at stuff like 3D modeling, reasoning through 3D problems, and even 2D layout and animation tasks like the infamous pelican riding a bicycle benchmark.
  • Honestly, understanding and applying 3D transformations should be a new LLM benchmark. Three.js, OpenSCAD, even Nano Banana prompts. The moment you add that extra dimension any semblance of ‘intelligence’ goes right out the window. Every model out there seems to spin itself in circles trying to logic through it with no success.
    • As soon as LLMs are doing serious junior level 3D modeling and mechanical CAD design, that's going to lead to some wild iteration loops with rapid prototyping. Very exciting.
  • I thought about something similar with Claude. I would like it to operate as an assistant in something like the Unity engine.
  • For people wondering before they click, this is about 3D CAD / 3D printing, not 3D animation.
  • Got some excellent results vibe coding three.js games with Claude. Maybe for printable things it is harder, as precision is important.
    • Try asking it for some curves that go in this or that direction. CW, CCW, curves to the left, to the right, etc.

      Produces a bunch of crap and it’s really hard to convince it that something needs to be changed. “Yep, I looked at the screenshot, it curves, so it’s okay.”
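
      One way to take the argument out of screenshot territory is a text check on the sampled curve itself. A minimal sketch, assuming 2D points in a y-up coordinate system (`turn_direction` is a hypothetical helper, not part of any CAD library):

```python
def turn_direction(points):
    """Classify a sampled 2D curve as 'CCW', 'CW', or 'straight'.

    Sums the cross products of consecutive segments. Assumes y points up;
    in a y-down (screen) coordinate system the two labels swap.
    """
    total = 0.0
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        total += (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
    if abs(total) < 1e-9:
        return "straight"
    return "CCW" if total > 0 else "CW"


print(turn_direction([(0, 0), (1, 0), (2, 1), (2, 2)]))    # CCW (bends left)
print(turn_direction([(0, 0), (1, 0), (2, -1), (2, -2)]))  # CW (bends right)
```

      Feeding the model a result like "CW, expected CCW" is harder for it to wave away than "it curves, so it's okay".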

  • Claude is terrible. I've been using Codex for a few months and decided to give Opus a try and see how it is.

    After asking it to review a single file in a simple platformer game, it goes:

    > Coyote jump fires in the wrong direction (falling UP with inverted gravity)

        var fallVelocity: float = body.velocity.y * body.up_direction.y
    
    I'm like ok, suggest a fix

    > I owe you a correction: after re-analyzing the math more carefully, lines 217–223 are actually correct — my original review point was wrong. Let me walk through why.

    Oh boy. It's had several other gaffes like this, and the UI/UX is still crap (fonts don't get applied, it doesn't catch up with the updated working state after editing files etc.) Codex helped me save time but Claude is just wasting my time. Can I get a refund?
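
    For what it's worth, the sign of that one expression can be checked by hand. This is a Python mirror of the quoted line with made-up speeds; the rest of the file isn't shown, so it says nothing about the surrounding logic. It assumes `up_direction` is a unit vector along the vertical axis and uses Godot's 2D convention where +y points down:

```python
def fall_velocity(velocity_y: float, up_direction_y: float) -> float:
    """Python mirror of: body.velocity.y * body.up_direction.y"""
    return velocity_y * up_direction_y


# Normal gravity in 2D: up is (0, -1), and falling means moving toward +y.
normal_falling = fall_velocity(velocity_y=300.0, up_direction_y=-1.0)
# Inverted gravity: up is (0, 1), and "falling" means moving toward -y.
inverted_falling = fall_velocity(velocity_y=-300.0, up_direction_y=1.0)

print(normal_falling, inverted_falling)  # -300.0 -300.0
```

    The product is the velocity component along the up axis, so it comes out negative while falling under either gravity direction, which is consistent with the retraction.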

    • Do you find Gemini or ChatGPT or Grok to be better?

      I am currently using Claude as I find it to be better than the others at the free tier.

      • After around 8 months of using ChatGPT/Codex, Claude, Grok & Gemini, I've found ChatGPT and Grok to be the best overall.

        Claude often gets APIs and logic wrong and tries way too hard to "impress": when asked to generate Godot proof-of-concept scenes (or other stuff) it adds a lot of extra textures, colors, UI etc., whereas Codex is more exact and does literally what's asked, no more, no less. Claude tends to flop on general non-coding questions too.

        Gemini is the worst; it often refuses to search or interop with Google's own services like Flights etc. When I asked it to do a reverse image search, it told me to go to TinEye.

        ChatGPT/Sora seems to be better at image generation than Gemini/Banana too.

        Sometimes I ask them a quirky question or a random phrase and only Grok and ChatGPT seem to know what's up. Gemini would if it used Google's own search but nope.
