• Not sure if “code has always been expensive” is the right framing.

    Typing out a few hundred lines of code was never the real bottleneck. What was expensive was everything around it: making it correct, making it maintainable (often underestimated), coordinating across teams and supporting it long term.

    You can also overshoot: Testing every possible path, validating across every platform, or routing every change through layers of organizational approval can multiply costs quickly. At some point, process (not code) becomes the dominant expense.

    What LLMs clearly reduce is the short-term cost of producing working code. That part is dramatically cheaper.

    The long-term effect is less clear. If we generate more code, faster, does that reduce cost or just increase the surface area we need to maintain, test, secure, and reason about later?

    Historically, most of software’s cost has lived in maintenance and coordination, not in keystrokes. It will take real longitudinal data to see whether LLMs meaningfully change that, or just shift where the cost shows up.

    • "What was expensive was everything around it" - when I say that code has always been expensive that's part of what I'm factoring in.

      But even typing those first few hundred lines used to have a much more significant cost attached.

      I just pasted 256 lines of JavaScript into the 2000s-era SLOCount tool (classic Perl, I have a WebAssembly hosted version here https://tools.simonwillison.net/sloccount) and it gave me a 2000s-era cost estimate of $6,461.

      I wouldn't take that number with anything less than a giant fist of salt, but there you have it.
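
      If I remember SLOCCount's documented defaults correctly, that dollar figure comes from the basic COCOMO "organic" model (effort = 2.4 × KLOC^1.05 person-months) multiplied by a default annual salary of $56,286 and a 2.4 overhead factor; a quick sketch of the arithmetic:

```python
# Rough reconstruction of SLOCCount's cost estimate for 256 lines of code,
# assuming its documented defaults (basic COCOMO, $56,286 salary, 2.4 overhead).
kloc = 256 / 1000                    # thousands of source lines
effort_pm = 2.4 * kloc ** 1.05       # estimated effort in person-months
salary = 56286                       # SLOCCount's default annual salary (USD)
overhead = 2.4                       # SLOCCount's default overhead multiplier
cost = effort_pm * (salary / 12) * overhead
print(f"${cost:,.0f}")               # ≈ $6,461
```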

      • > when I say that code has always been expensive that's part of what I'm factoring in.

        Fair, but when an LLM writes code in response to a prompt I really don't get the sense that it's doing as much of that "everything around" part as you might expect.

        • Yeah, it absolutely isn't, but the time it's saving you means you can spend more effort on all of that stuff.
      • People became millionaires writing HTML 30 years ago. There have been shifts like this in the past.
        • People became billionaires from domain names in the dot com era.
          • Haha! Do you know details? What domains?
            • Mark Cuban sold broadcast.com to Yahoo for $5.7bn

              It was a bit more than a domain name - they had 330 employees and $13.5 million in revenue for a quarter - but that acquisition was definitely peak dot-com boom.

              • Thanks!

                I would love another bubble. I feel like tech has been in a corner for going on ten years now (the covid spike was so brief). It's so concentrated in AI that it sucks up everything.

    • > The long-term effect is less clear. If we generate more code, faster, does that reduce cost or just increase the surface area we need to maintain, test, secure, and reason about later?

      My take is that the focus is mostly oriented towards code, but in my experience everything around code got cheaper too. In my particular case, I do coding, I do DevOps, I do second level support, I do data analysis. Every single task I have to do is now seriously augmented by AI.

      In my last performance review, my manager was actually surprised when I told him that I am now more a manager of my own work than actually doing the work.

      This also means my productivity is now probably around 2.5x what it was a couple of years ago.

      • > In my last performance review, my manager was actually surprised when I told him that I am now more a manager of my own work than actually doing the work.

        I think this is very telling. Unless you have a good manager who is paying attention, a lot of them are clueless: they just see the hype of 10x-ing your developers and don't care about the nuance of (as they say) all the surrounding bits of writing code. And unfortunately, they just repeat this to the people above them, who also read the hype and just see the $$ of reducing headcount. (Sorry, venting a little.)

        • He definitely was paying attention.

          He had to pause for a second there, arrested by the realization, and it was one of the reasons I got an "Exceeds expectations" in one of my KRAs.

          • It is interesting though that he evidently didn't notice this 2.5X productivity increase until you pointed it out to him.
      • This has been my experience, too. In dealing with hardware, I'm particularly pleased with how vision models are shaping up; they're able to identify what I've photographed, put it in a simple text list, and link me to appropriate datasheets. Yesterday, one even figured out how I wanted to reverse engineer a remote display board for a just-released inverter and correctly identified which pin of which unfamiliar Chinese chip was spitting out the serial data I was interested in; all I actually asked for was chip IDs with a quick vague note on what I was doing. It doesn't help me solder faster, but it gets me to soldering faster.

        A bit OT, but I would love to see some different methods of calculating economic productivity. After looking into how BLS calculates software productivity, I quit giving weight to the number altogether, and it left me feeling a bit blue; they apply a deflator in part by considering the value of features (which they claim to be able to estimate by comparing feature sets and prices in a select basket of items of a category, applying coefficients based on differences); it'll likely never actually capture what's going on in AI unless Adobe decides to add a hundred new buttons "because it's so quick and easy to do." Their methodology requires ignoring FOSS (except for certain corporate own-account cases), too; if everyone switched from Microsoft 365 to LibreOffice, US productivity as measured by BLS would crash.

        BLS lays methodology out in a FAQ page on "Hedonic Quality Adjustment"[1], which covers hardware instead of software, but software becomes more reliant on these "what does the consumer pay" guesses at value (what is the value of S-Video input on your TV? significantly more than supporting picture-in-picture, at least in 2020).

        [1] https://www.bls.gov/cpi/quality-adjustment/questions-and-ans...

    • Honestly, if I could just get what the end user “really wants” (they often don’t even know), that would save a huge percentage of the overall cost, and that’s not code, that’s human nature.
      • That is also getting cheaper: you can now quickly present them with a few working prototypes so they can quickly make up their mind about what suits them best.

        Another problem is that most users want different things, that's why you get these big bloated software suites. With LLMs it now also becomes more achievable to build custom software per user.

        • I find it is more of a rubber-meets-the-road thing, where they have to use it to really understand.
    • Code was expensive, is expensive, and will be expensive. The real cost is hidden; it takes a mature eye to see a codebase that works and is not a dumpster fire.

      Correctness (doing what it's supposed to, nothing else), maintainability (accommodating unknown future changes), cost (deployment, refactoring, integrations), and performance (making the right tradeoffs) are not obvious. They don't come naturally until you burn your fingers and learn to differentiate a good end result from a horrible one.

    • I don’t understand why you would agree with the entire article yet frame it as disagreement.
  • > Code has always been expensive. Producing a few hundred lines of clean, tested code takes most software developers a full day or more. Many of our engineering habits, at both the macro and micro level, are built around this core constraint.

    > ...

    > Writing good code remains significantly more expensive

    I think this is a bad argument. Code was expensive because you were trying to write the expensive good code in the first place.

    When you drop your standards, then writing generated code is quick, easy and cheap. Unless you're willing to change your standard, getting it back to "good code" is still an equivalent effort.

    There are alternative ways to define the argument for agentic coding, this is just a really really bad argument to kick it off.

    • In my experience, it’s even more effort to get good code with an agent. When writing by hand, I fully understand the rationale for each line I write; with AI, I have to assess every clause and think about why it’s there. Even when code reviewing juniors, there’s a level of trust that they had a reason for including each line (assuming for a moment that they’re not using AI too); that’s not at all my experience with Codex.

      Last month I did the majority of my work through an agent, and while I did review its work, I’m now finding edge cases and bugs of the kind that I’d never have expected a human to introduce. Obviously it’s on me to better review its output, but the perceived gains of just throwing a quick bug ticket at the ai quickly disappear when you want to have a scalable project.

      • There is demand for non-scalable, not-committed-to-be-maintained code where smaller issues can be tolerated. This demand is currently underserved, as coding is somewhat expensive and focused on critical functions.
        • What are some examples of when buggy code can be tolerated?
          • You are setting up to say "I wouldn't tolerate that" for any example given, but if you look at the market and what makes people actually leave, instead of what makes people complain, then basically anything that isn't life-and-death, safety critical, big-money-losing, or data corrupting is tolerable. There's plenty of complaints about Microsoft, Apple, Gmail, Android, and all kinds of 3rd party niche business systems.

            All the decades people tolerated blue-screens on Windows. All the software which regularly segfaulted years ago. The permeation of "have you tried turning it off and on again" into everyday life. The "ship sooner, patch later" culture. The refusal to use garbage collected or memory managed languages or formal verification over C/C++/etc because some bugs are more tolerable than the cost/effort/performance costs to change. Display and formatting bugs, e.g. glitches in video games. When error conditions aren't handled - code that crashes if you enter blank parameters. Bugs in utility code that doesn't run often like the installer.

            One piece of software I installed yesterday told me to disable some Windows services before the install, then the installer tried to start the services at the end of the install and couldn't, so it failed and exited without finishing installing everything. This reminded me that I knew about that, because that buggy behaviour has been there for years and I've tripped over it before; at least two major versions.

            Another one I regularly update tells me to close its running processes before proceeding with the install, but after it's got to that state, it won't let me proceed, and it has no way to refresh or rescan to detect that the running process has finished. That's been there for years and several major versions as well.

            One more famous example is """I'm not a real programmer. I throw together things until it works then I move on. The real programmers will say "Yeah it works but you’re leaking memory everywhere. Perhaps we should fix that." I’ll just restart Apache every 10 requests.""" - Rasmus Lerdorf, creator of PHP. I have a feeling something similar was admitted about 37signals and Basecamp (it was common to restart Ruby on Rails processes frequently), but I can't find a source to back that up.

          • From recent personal examples

            We have somewhat complicated OpenSearch reindexing logic, and we had an issue where reindexing happened more often than it should. I vibecoded a dashboard visualizing in a graph exactly which index gets reindexed when and into what. The code works, a little rough around the edges, but it serves the purpose and saved me a ton of time.

            Another example: in an internal project we made a recent change where we need to send specific headers depending on the environment. Mostly GET endpoints, where my workflow is checking the API through the browser. The list of headers is long but predetermined. I vibecoded an extension that lets you pick the header and allows me to keep my regular workflow, rather than Postman or cURL or whatever. A slightly buggy UI, but good enough; the whole team uses it.

            I'm not a frontend developer, and either of these would have taken me a lot of time to do by hand.

          • If the code is being used by a small group of people who are willing to figure out and share workarounds for those bugs - internal staff, for example.
              • Aren’t you also paying internal staff for their time? Wasting their time is wasting your money.
              • I've been in these situations before. If there's a known bug in an internal tool that would take the development team a day to investigate and fix - aka $10,000s - it's often smarter to send around an email saying "don't click the Froople button more than once, and if you do tell Benjamin and he'll fix it in the database for you".

                Of course LLMs change that equation now because the fix might take a few minutes instead.

                • > If there's a known bug in an internal tool that would take the development team a day to investigate and fix - aka $10,000s - it's often smarter to send around an email saying "don't click the Froople button more than once, and if you do tell Benjamin and he'll fix it in the database for you".

                  How much will Benjamin's time responding to those calls cost in the long run?

                  • Hopefully none, because your staff will read the email and not click the button more than once.

                    Or one of them will do it, Benjamin will glare at them and they'll learn not to do it again and warn their coworkers about it.

                    Or... Benjamin will spend a ton of time on this and use that to successfully argue for the bug to get fixed.

                    (Or your organization is dysfunctional and ends up wasting a ton of money on time that could have been saved if the development team had fixed the bug.)

                • > development team a day to investigate and fix - aka $10,000s

                  What about the non-fictional 99.999999999% of the world that doesn't make $1000/hour?

                  • Large companies are often very bad at organizing work, to the tune of increasing the cost of everything by a large multiple over what you'd think it should be. Most of that cost wouldn't be productive developer time.
                  • It costs them single digit thousands instead.
              • The alternative is the staff having no software at all to help with their task, which wastes even more of their time.
          • Points at the public sector
      • You need to have the AI write an increasingly detailed design and plan about what to code, assess the plan and revise it incrementally, then have it write code as planned and assess the code. You're essentially guiding the "Thinking" the AI would have to perform anyway. Yes, it takes more time and effort (though you could stop at a high-level plan and still do better than not planning at all), but it's way better than one-shotted vibe code.
        • The problem is those plans become huge. Now I have to review a huge plan and the comparatively short code change.
          • It shouldn't be any longer than the actual code, just have it write "easy pseudocode" and it's still something that you can audit and have it translate into actual coding.
        • TDD is a great way to start the plan, stubbing things it needs to achieve with E2E tests being the most important. You still need to read through them so it won't cheat, but the codebase will be much better off with them than without them.
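          As a sketch of what that red/green starting point can look like (`slugify` and these tests are invented for illustration, pytest-style, not from the thread):

```python
import re

# Red: write the behaviour-level tests first, before any implementation
# exists, and watch them fail.
def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_strips_punctuation():
    assert slugify("Cheap code, fast!") == "cheap-code-fast"

# Green: the agent then implements until the tests pass; reading the tests
# yourself is how you check it hasn't cheated.
def slugify(title: str) -> str:
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```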
        • This works, but it still lacks most of the context around previous tasks, and it isn’t trivial to get it to take that into account.
      • > Even when code reviewing juniors, there’s a level of trust that they had a reason for including each line (assuming they’re not using ai too for a moment)

        Even my seniors are just copy pasting out whatever Claude says. People are naturally lazy, even if they know what they’re doing they don’t want to expend the effort.

      • I hear you, but it seems quicker to predict whether the agent's solution is correct/sound before running it than to compose and "start" coding yourself. Understanding something that's already there seems like less effort. But I guess it highly depends on what you are doing and its level of complexity and how much you're offloading your authority and judgment.
        • I find it amazing that skills are essentially excellent tools for humans to understand too.
          • I really wish they were called lessons instead of skills. It makes way more sense and prevents the overloading of the term "skill".
    • I was careful to say "Good code still has a cost" and "delivering good code remains significantly more expensive than [free]" rather than the more aesthetically pleasing "Good code is expensive."

      I chose these words because I don't think good code is nearly as expensive with coding agents as it was without them.

      You still have to actively work to get good code, but it takes so much less time when you have a coding agent that can do the fine-grained edits on your behalf.

      I firmly believe that agentic engineering should produce better code. If you are moving faster but getting worse results it's worth stopping and examining if there are processes you could fix.

      • Totally agreed. I’ve been reverse engineering Altium’s file format to enable agents to vibe-engineer electronics and though I’m on my third from scratch rewrite in as many weeks, each iteration improves significantly in quality as the previous version helps me to explore the problem space and instruct the agent on how to do red/green development [1]. Each iteration is tens of thousands of lines of code which would have been impossible to write so fast before so it’s been quite a change in perspective, treating so much code as throw away experimentation.

        I’m using a combination of 100s of megabytes of Ghidra decompiled delphi DLLs and millions of lines of decompiled C# code to do this reverse engineering. I can’t imagine even trying such a large project for LLMs so while a good implementation is still taking a lot of time, it’s definitely a lot cheaper than before.

        [1] I saw your red/green TDD article/book chapter and I don’t think you go far enough. Since we have agents, you can generalize red/green development to a lot of things that would be impractical to implement in tests. For example I have agents analyze binary diffs of the file format to figure out where my implementation is incorrect without being bogged down by irrelevant details like the order or encoding of parameters. This guides the agent loop instead of tests.

      • > I was careful to say "Good code still has a cost" and "delivering good code remains significantly more expensive than [free]" rather than the more aesthetically pleasing "Good code is expensive.

        Which is nuance that will get overlooked or waved away by upper management who see the cost of hiring developers, know that developers "write code", and can compare the developer salary with a Claude/Codex/whatever subscription. If the correction comes, it will be late and at the expense of rank and file, as usual. (And don't be naive: if an LLM subscription can let you employ fewer developers, that subscription plus offshore developers will enable even more cost saving. The name of the game is cost saving, and has been for a long time.)

        • Value of Claude subscription: $0

          Value of developer + Claude subscription: N * value of developer without Claude subscription where N is still the subject of intense debate.

        • Sure, but clueless leadership is not a new thing. While big companies with structural moats can shamble along with a surprising amount of dysfunction (which is why they tolerate so many muppets in middle management), even they rely on some baseline of system integrity that will erode pretty quickly if they let go of the people who know how things work.

          Don’t get me wrong, I think SWE headcounts will reduce over time, but the mechanism will be teams that know how to leverage AI effectively will dominate ones who don’t. This takes more market cycles though, and it’s even hard to nail down the specifics of these skills with the speed agentic coding tools are currently evolving. My advice is make yourself part of the second group, and worry less about bad management decisions that are inevitable.

      • > I was careful to say "Good code still has a cost" ...

        Misleading headline, with the qualifier buried six paragraphs deep. You have a wide enough readership (and well deserved too). Clickbait tactics feel a little out of place on your blog.

        • This is the chapter title for a sort-of book I'm working on, and it's the central philosophy I'm building the book around.

          I'm not going to change a good chapter title (and I do think it's a good chapter title) just because people on Hacker News won't read a few paragraphs of content.

          A dishonest title would be "Code is cheap now" or "Programming is cheap now". I picked "Writing code is cheap now" to capture that specifically the bit where you type code into a computer is the thing that's cheap.

          • Fairly esoteric and self-serving definition of "writing code" if it represents just the typing part. I wouldn't call it a dishonest title, but perhaps not a fully honest one either.
      • > I chose this words because I don't think good code is nearly as expensive with coding agents as it was without them.

        Still navigating this territory, but I think a lot of people are getting caught up on the idea that producing code is simply a matter of typing it at the keyboard.

        One of the benefits of something like Claude Code isn't just the code it produces, but the ability to quickly try out ideas, get some feedback, AND THEN write the good code.

        > than the more aesthetically pleasing

        Agreed. What even is "good" code? So much of the bad code I write isn't bad because it's ugly; it's bad because it misses the mark, because I made too many assumptions and didn't take the time to actually learn the domain. If I can eke out even a few more hours a week to build worthwhile solutions because I was able to focus a bit more, that's a win to me. My users in particular have a really difficult time imagining features without actually seeing them; they have a hard time articulating what's wrong or right without something tangible in front of them. It would be hard to argue that having the ability to quickly prototype and demo features to people is a bad thing.

        • This was the single worthwhile point behind “agile” development: getting new code in front of users as quickly as possible to know whether or not you’re building the right thing.

          With agile that meant delivering something to evaluate every two weeks instead of 6 months or a year. Now with AIs maybe it should be a new version every day? Are current processes outside of writing the code capable of supporting that cadence? Do users even want to try new versions that often?

    • I really like the idea of Ousterhout's tactical vs strategic programming. Where we either create a new feature as fast as possible vs focusing on architecture and keeping complexity in check.

      I truly believe that LLMs are replacing tactical programming: focusing on implementing features as fast as possible without much regard for the overall complexity of the system.

      It's more important than ever to focus on keeping complexity low at a system level.

    • Code is cheaper. Simple code is cheap. More complex code may not be cheaper.

      The reason you pay attention to details is because complexity compounds and the cheapest cleanup is when you write something, not when it breaks.

      This last part is still not fully fleshed out.

      For now. Is there any reason to not expect things to improve further?

      Regardless, a lot of code is cheap now and building products is fun regardless, but I doubt this will translate into more than very short-term benefits. When you lower the bar you get 10x more stuff, 10x more noise, etc. You lower it more you get 100x and so on.

    • Yeah, I have to agree. Worth noting that deterministic code-generation systems have existed for quite some time. It's just that, once you saw something like that being useful for your system, it was usually a cue to critique the existing system design.

      Boilerplate is boilerplate, whether filling it in is purely mechanical or benefits from an LLM's fuzzy logic.

    • Computer programming is cheap. Software engineering is expensive.
    • The problem is knowing what to write. I spent a whole day yesterday understanding what was wrong with a PDF file we were failing to process. Once I understood the cause, it took one line of code to fix it and another dozen lines for a unit test. An LLM helped me write the code to explore several possible candidate problems, but it did not find the problem by itself.

      So code is both cheaper (the LLM wrote the exploration code much faster than I could have typed it) and still expensive (the one line that we deployed to production today).

    • Code has a generation cost and a maintenance cost.

      If you just look at generation then sure it's super cheap now.

      If you look at maintenance, it's still expensive.

      You can of course use AI to maintain code, but the more of it there is, the more unwieldy it gets to maintain, even with the best models and harnesses.

      • Once writing code is cheap you don't maintain code. You regenerate it from scratch.

        What you maintain is the specification harness, and change that to change the code.

        We have to start thinking at a higher level, and see code generation in the same way we currently see compilation.

        • Tokens aren’t free.

          Far more expensive than compilation, and non-deterministic, so you're not sure you'll get the same software if you give the AI the same spec.

        • I'm not sold on that idea yet.

          I don't just have LLMs spit out code. I have them spit out code and then I try that code out myself - sometimes via reviewing it and automated tests, sometimes just by using it and confirming it does the right thing.

          That upgrades the code to a status of generated and verified. That's a lot more valuable than code that's just generated but hasn't been verified.

          If I throw it all away every time I want to make a change I'm also discarding that valuable verification work. I'd rather keep code that I know works!

        • Unless the specification is also free of bugs and side effects, there is no guarantee that a rewrite would have fewer bugs.

          Plenty of rewrites out there prove that point.

      • I 'love' that folks are seemingly inching towards more acceptance of crappy LLM code. Because it costs marginally less to push to production if it just passes some smoke tests? Have we not learned anything about technical debt and how hard it bites back? It's not even a seniority question, just a sane, rational approach to our craft, unless one wants to jump companies every few months like a toxic useless apple (not sure who would hire such a person, but the world is big and managers are often clueless).

        There are of course various use cases, and for a few this is an acceptable tradeoff, but most software isn't written once and never touched (significantly) again; quite the contrary.

        • > Have we not learned anything about technical debt and how it bites back hard?

          I think LLMs are changing the nature of technical debt in weird ways, with trends that are hard to predict.

          I've found LLMs surprisingly useful in 'research mode', taking an old and badly-documented codebase and answering questions like "where does this variable come from, and what are its ultimate consumers?" Its answers won't be as natural as a true expert's, but its answers are nonetheless useful. Poor documentation is a classic example of technical debt, and LLMs make it easier to manage.

          They're also useful at making quick-and-dirty code more robust. I'm as guilty as anyone else of writing personal-use bash scripts that make all kinds of unjustified assumptions and accrete features haphazardly, but even in "chat mode" LLMs are capable of reasonable rewrites for these small problems.

          More systematically, we also see now-routine examples of LLMs being useful at code de-obfuscation and even decompilation. These forward processes maximize technical debt compared to the original systems, yet LLMs can still extract meaning.

          Of course, we're not now immune to technical debt. Vibe coding will have its own hard-to-manage technical debt, but I'm not quite sure we have the contours well defined yet. Anecdotally, LLMs seem to have their biggest problem in the design space, missing the forest of architecture for the trees of implementation, such that they don't make the conceptual cuts between units in the best place. I would not be so confident as to call this problem inherent or structural rather than transitory.

          • None of what you describe is free.

            After the LLM helps untangle the mess, if you leave the mess in place, you will have to ask the LLM untangle it for you every time you need to make a change.

            Better to work with the LLM to untangle the technical debt then and there and commit the changes, so neither you nor the LLM have to work so hard in the future.

            I’ve even seen anecdotal evidence that code that’s easier for humans to work with is easier for LLMs to work with as well.

          • > taking an old and badly-documented codebase and answering questions like "where does this variable come from, and what are its ultimate consumers?"

            Why do you even need an LLM for this? Code is formal notation; it’s not magic. Unless the code is obfuscated, even bad code is pretty clear about what it’s doing and how various symbols are created and used. What is not clear is the “why”, and the answer is often a business or technical decision.

            • > Why do you even need an LLM for this?

              Once you get above a few hundred thousand lines of legacy undocumented code having a good LLM to help dig through it is really useful.

        • The inching-towards-acceptance of crappy processes is quite influencer-driven as well, with said influencers if not directly incentivised by LLM providers, then at least indirectly incentivised by the popularity of outrageous exhortations.

          There's definitely a chunk of the developer population that's not going to trade the high-craft aspects of the process for output-goes-brrr. A Faustian bargain if ever I saw one. If some are satisfied by what comes down to vibe-coding and vibe-testing, I guess we wish them well from afar.

        • I wouldn't say acceptance of crappy code. I think the issue is the acceptance of LLM plans with just a glance, and the acceptance of code without any code review by the author at all, because if the author spent any more time on it, it wouldn't be worth it anymore.
        • People aren't interested in long-term thinking when companies are doing layoffs for bullshit reasons and making vague threats about how most of us will have to go find a new career which causes a heck of a lot of stress and financial costs. That isn't being petty, it's having self-respect. They get the quality when the companies treat the craftspeople with respect.
    • I think the cost and work remain the same. What has changed is efficiency. Previously people had to manually program byte after byte. Then came C and streamlined it, allowing faster development.

      With python I can write a simple debugging UI server with a few lines.

      There are frameworks that allow me to complete certain tasks in hours.

      You do not need to program everything from scratch.

      The more existing code there is, the faster everything gets, since the job is mostly done.

      We are accelerating, but we still work 9 to 5 jobs.

      • > I think the cost and work remain the same. What has changed is efficiency. Previously people had to manually program byte after byte. Then came C and streamlined it, allowing faster development.

        I think you got your history wrong. People didn’t program bit by bit. They programmed on paper (flowcharts, pseudo-code, diagrams, …), then encoded that afterwards. There were a lot of programming languages before C, like Lisp and APL (which are high-level, btw). Why would they waste precious computer time when they could plan out procedures on a notepad or a whiteboard?

      • C, Python, and frameworks don't generate all-new code for every task: you're taking advantage of stuff that's thoroughly tested. That simple debugging UI server is probably using some well-tested libraries, which you can reasonably trust to be bug-free (and which can be updated later to fix any bugs, without breaking your code that relies on them). With AI-generated code, this isn't the case.
        • Depends on what you mean by AI-generated code. Do you mean vibe-coded? If yes, then I agree, but there are also other scenarios for AI-generated code.

          I regularly use AI in my hobby projects: it gives me feedback, proposes other libraries to use, or other solutions. It generates some classes for which I write tests. I also need to understand the code it generates; if I don't, I don't use it. It does speed up my process of creating code.

          If other people are also accelerated by, let's say, 30%, then everything is sped up and cheaper. I think many people use AI like that. It is just a tool, like a hammer, with which you can harm yourself if you do not know how to use it.

    • Spaghetti code was always a thing though
      • Yeah, people going on and on about peerless human-generated code reminds me of that scene in I, Robot when the cop asks Sonny “can a robot create a work of art?” and the robot replies “can you?”. In 30 years in this industry I’ve yet to see what I would describe as perfect and maintainable code. It’s an aspirational dream, not SOP.
    • > When you drop your standards, then writing generated code is quick, easy and cheap.

      Not as cheap as generating code of equivalent quality with an LLM.

    • The market incentives for "good code" have definitely never been worse, but I wouldn't be so sure the cost of migrating decent pieces of generated code to good code is worse than writing good code from whole cloth.
      • I find that implementing a sound solution from scratch is generally lower effort than taking something that already exists and making it sound.

        The former: 1) understand the problem, 2) solve the problem.

        The latter: 1) understand the problem, 2) solve the problem, 3) understand how somebody or something else understood & solved the problem, 4) diff those two, 5) plan a transition from that solution to this solution, 6) implement that transition (ideally without unplanned downtime and/or catastrophic loss of data).

        This is also why I’m not a fan of code reviews. Code review is basically steps 1–4 from the second approach, plus having to verbally explain the diff, every time.

        • > This is also why I’m not a fan of code reviews.

          That's specious reasoning. Code reviews are a safeguard against cowboy coding, and a tool to enforce shared code ownership. You might believe you know better than most of your team members, but odds are a fresh pair of eyes can easily catch issues you snuck into your code that you couldn't catch yourself due to things like PR tunnel vision.

          And if your PR is sound, you certainly don't have a problem explaining what you did and why you did it.

          • Code reviews have their place. I just personally don’t like being the reviewer, because it’s more effort on your part than just writing the damn thing from scratch while someone else gets the credit for the result[0]. Of course, having multiple pairs of eyes on the code and multiple people who understand it is crucial.

            [0] Reviews are OK if I enjoy working with the person whose work I’m reviewing and I feel like I’m helping them grow.

  • > Code has always been expensive. Producing a few hundred lines of clean, tested code takes most software developers a full day or more. Many of our engineering habits, at both the macro and micro level, are built around this core constraint.

    Wasn't writing code always cheap? I see this more as a strawman argument. What is clean code? Tested code? Should each execution path of a function be tested with each possible input?

    I think writing tests is important, but you can overdo it. Testing code on every possible platform of course takes much time and money.

    Another cost factor for code is organizational overhead, if adding a new feature needs to go through each layer of the organization signing off before a user can actually see it. That is of course more costly than the alternative of just pushing to production with all its faults.

    There is a big difference between short-term costs and long-term ones. I think LLMs reduce the short-term cost immensely but may increase the long-term costs. It will take some real longitudinal studies to show the impact.

    • I think code was always expensive. If it seemed cheap, the cost was hidden somewhere else.

      When I started coding professionally, I joined a team of only interns in a startup, hacking together a SaaS platform that had relative financial success. While we were very cheap, being paid below minimum wage, we had outages, data corruption, db wipes, server terminations, unresolved conflicts making their way to production and killing features, tons of tech debt and even more makeshift code we weren't aware of...

      So yeah, while writing code was cheap, the result had a latent cost that would only show itself on occasion.

      So code was always expensive, the challenge was to be aware of how expensive sooner rather than later.

      The thing with coding agents is that it now seems you can eat your cake and have it too. We are all still adapting, but results indicate that, given the right prompts and processes for harnessing LLMs, quality code can be had on the cheap.

      • > The thing with coding agents is that it now seems you can eat your cake and have it too. We are all still adapting, but results indicate that, given the right prompts and processes for harnessing LLMs, quality code can be had on the cheap.

        It's cheaper but not cheap

        If you're building a variation of a CRUD web app, or aggregating data from some data source(s) into a chart or table, you're right. It's like magic. I never thought this type of work was particularly hard or expensive though.

        I'm using frontier models and I've found if you're working on something that hasn't been done by 100,000 developers before you and published to stackoverflow and/or open source, the LLM becomes a helpful tool but requires a ton of guidance. Even the tests LLMs will write seem biased to pass rather than stress its code and find bugs.

        • > I never thought this type of work was particularly hard or expensive though.

          Maybe not intrinsically hard, but hard because it's so boring you can't concentrate.

          > the LLM becomes a helpful tool but requires a ton of guidance. Even the tests LLMs will write seem biased to pass rather than stress its code and find bugs.

          ISTR some have had success by taking responsibility for the tests and only having the LLM work on the main code. But since I only seem to recall it, that was probably a while ago, so who knows if it's still valid.

        • > It's cheaper but not cheap

          It's quite cheap if you consider developer time. But it's only as cheap as you can effectively drive the model, otherwise you are just wasting tokens on garbage code.

          > LLM becomes a helpful tool but requires a ton of guidance

          I think this is always going to be the case. You are driving the agent like you drive a bike, it'll get you there but you need to be mindful of the clueless kid crossing your path.

          For some projects I had good results just letting the agent loose. For others I'd have to make the tasks more specific and granular before offloading to the LLM. I see nothing wrong with it.

      • So code was apparently cheap, but in fact it was expensive because it was low quality.

        Now with LLMs, code is cheap and it also has quality, therefore "quality code can be had in the cheap".

        Do you really believe this is the case? Why don't companies fire all their developers if they can have an algorithm that can output cheap and quality code?

        • Because cheap and quality code is only part of the story. The code needs to solve the right problem, and that is a domain where only a human can operate, at least for now. Back when I was inexperienced I couldn't write good code, but I could sit with the company's CTO while he explained the domain, the challenges and the goal of the project. I could talk with domain experts and understand what the common solutions to the problems were. These are things that would require an LLM to have untold amounts of context, or a specialized model that understands the domain.

          But the thing is, there are many unknowns. We humans are very capable of adapting as we go. LLMs have a fixed dataset they were trained on, and prompt engineering can only get you so far.

          I think anyone asking this with the intention of actually replacing humans with LLMs doesn't really understand either humans or LLMs. They are just talking money.

        • We didn’t fire all our developers when we invented compilers either, and for much the same reason we didn’t stop hiring laborers when we first built ships and established overseas trade routes: business will always expand to meet its reach.

          Many enterprises are currently exploring whether they can invite developers to leverage AI tools, like they leveraged the compiler, to be more productive. To operate on a higher plane of agency, collaborating on what we should be building and not just on technical execution. Those actively hostile to, or just checked out from, the idea of relearning skills are being laid off. (Some unprofitable business sections are being swept up opportunistically too.) The idea that all developers would be fired if AI tools can write good code doesn’t match the lessons of history.

          • > Many enterprises are currently exploring to see if they can invite developers to leverage AI tools—like they leveraged the compiler—to be more productive. To operate on a higher plane of agency, collaborating on what we should be building and not just technical execution.

            The thing is, developers have been hired to automate processes, and as for any professional doing a good job, that means the output should perform reliably. But now they are forcing us to use a tool that everyone knows is not reliable, while the onus is still on us to maintain the same reliability. So do you see why we are not thrilled?

            It’s like providing a faulty piano (that shuffles the notes when a key is pressed) and expecting a good rendition of the Moonlight Sonata.

            Or a crane that will stall and drop its load randomly. It would have been sent to the scrapyard on the first day.

            • > "Or a crane that will stall and drop its load randomly. It would have been sent to the scrapyard on the first day."

              The only reason you have the concept that engines can "stall" is because people have bought engines that can stall by the hundreds of millions, instead of the earliest people refusing to buy them at all and all waiting for the perfect engine.

              Container ships can sink with all the containers lost at sea. Still used.

              Steam train engines could explode, derailing the train and killing some passengers and employees. Still used.

              Buildings can collapse. Still used.

              Pneumatic tyres can burst. Still used.

              Here[1] is Tom Scott using a recreation walking crane from the 13th century, a technology going back to Roman times, which has no evidence that it ever had brakes on it historically. Look at that and tell me you think the rope never snapped, the wood never broke, the walker never tripped and the thing never unreeled the load back to the ground with the walker severely injured, because if it went wrong builders would refuse to use it? No chance.

              Nothing functions like you're claiming; that's where we get the saying "don't let perfect be the enemy of good enough", as soon as stuff is better than not having it, people want to make use of it.

              [1] https://www.youtube.com/watch?v=pk9v3m7Slv8

              • You forgot to address the random aspect of the failure cases.

                The real world is chaotic; technology was always first about controlling, then improving said control. A lot of the risks in the situations you described have been brought down so far that the savings (time, money, …) are magnitudes more than the cost of the failures.

                I’m not asking for perfection, but for something good enough that we can demonstrate the savings outweigh the costs. So far there’s no such demonstration. In fact, we are increasing the costs. And fast.

        • > Why don't companies fire all their developers if they can have an algorithm that can output cheap and quality code?

          Because it takes an experienced developer to get the machine to output cheap and quality code well enough to be useful.

          That developer is just a whole lot more valuable now, because they can do more work at a higher quality.

        • This is what I really wonder: what is even the cost of code? Or what is real code quality?

          I know that things like “clean code” exist, but I always felt that actual code quality only shows when you try adding to or changing existing code, not by looking at it.

          And the ability to judge code quality on a system scale is something I don’t think LLMs can do. But they may support developers in their judgment.

        • I don't know if you've heard, but there have been a large number of layoffs in the tech sector recently. Whether they're actually related to AI as executives claim, and not section 174 of the US IRS tax code in the BBB, is known only to them, but if your argument hinges on people having not been fired when there have been layoffs, you may need a different one.
          • I think a major contributor to the layoffs is companies hiring too many people around covid[1]. I can't find good stats for the years 2019-2026 besides looking at now and the past directly. There is some data from the Ukrainian job site Djinni[1][2] and for US IT job postings[3].

            I don't think AI is the reason for the layoffs. It's just easier to say "because of AI we are firing" than "because we overhired and it's actually our fault".

            [1] https://djinni.substack.com/p/2021-in-review [2] https://blog.djinni.co/post/q1-analytics-en [3] https://fred.stlouisfed.org/series/IHLIDXUSTPSOFTDEVE

          • As you said, it's impossible to determine how many of the current layoffs are caused by AI; they probably also have a lot to do with the broader economic downturn. But you're still missing the point: if companies truly have a black box that can produce cheap, high-quality code, as the GP put it, why don't they just fire 95% of their developers and keep only a small core of AI orchestrators?
    • > Wasn't writing code always cheap?

      No.

      If you’re a startup that wanted to ship your product, you were stuck hiring developers and waiting 6 months. I’m old enough to remember the “6-8 weeks” it was supposed to take to build the first version of Stack Overflow (hint: it was more like several months).

      Creating a v1 of a product / feature is now cheap. When you’re a mature product and need to make complicated / iterative decisions - that’s not cheap and requires expertise to make good decisions.

      • Startups, famous for writing clean, quality code.
        • More like developers - infamous for not writing the shit code needed to test PMF :)
        • Not sure what relevancy that has to what you're responding to.
      • I made a SO clone in 1 week at a previous job (ok not 100% feature complete obviously but good enough to start using). This is by myself with no AI of course (10 years ago).

        With Claude Code today I could do it maybe 2x faster? Maybe not even that much, though; a lot of time is spent on things other than purely mechanically typing out characters, so it’s not necessarily a huge savings.

      • > I’m old enough to remember th “6-8 weeks” it was supposed to take to build the first version of Stackoverflow (hint it was more like several months).

        And they had to do this without help from Stack Overflow! :-)

  • There's a lot of misconception about the intrinsic economic value of 'writing code' in these conversations.

    In software, all the economic value is in the information encoded in the code. The instructions on precisely what to do to deliver said value. Typically, painstakingly discovered over months or years of iteration. Which is exactly why people pay for it when you've done it well, because they cannot and will not rediscover all that for themselves.

    Writing code, per se, is ultimately nothing more than mapping that information. How well that's done is a separate question from whether the information is good in the first place, but the information being good is always the dominant and deciding factor in whether the software has value.

    So obviously there is a lot of value in the mapping - that is writing the code - being done well and, all else being equal, faster. But putting that cart before the horse and saying that speeding this up (to the extent this is even true - a very deep and separate question) has some driving impact on the economics of software I think is really not the right way to look at it.

    You don't get better information by being able to map the information more quickly. The quality of the information is entirely independent of the mapping, and if the information is the thing with the economic value, you see that the mapping being faster does not really change the equation much.

    A clarifying example from a parallel universe might be the kind of amusing take about consultancy that's been seen a lot - that because generative AI can produce things like slides, consultancies will be disrupted. This is an amusingly naive take precisely because it's so clear that the slides in and of themselves have no value separate from the thing clients are actually paying for: the thinking behind the content in the slides. Having the ability to produce slides faster gets you nothing without the thinking. So it is in software too.

    • > You don't get better information by being able to map the information more quickly.

      This ties quite neatly into the concept of "cognitive debt" that's doing the rounds at the moment: https://simonwillison.net/tags/cognitive-debt/

      Cognitive debt accumulates when you move so fast on the implementation that your mental model of what the software does and how it works falls behind.

    • I very much appreciate this take. I will say though that I’ve had experiences myself where using coding agents led me to what I’d consider (in your terminology) a better mapping between information and code. Not because the agent was able to do things better than myself, but because, as my project grew and I got wiser about how best to map the information, it was incredibly fast for me to change the code in the right direction and do refactorings that I otherwise might not have gotten around to.
    • Nicely said. I've been thinking a lot about how the bottleneck, in the limit, is getting your intent into the machine. Over time, as AI improves, it'll get better and better at extracting your intent just from situational context, but this only helps if you're willing to abdicate more and more judgement to the machine.

      Eventually you may get to the point where the machine has all the context about the scenario and all the knowledge about how you think, and so will always perfectly be aligned with your intent, but when that day comes the thing will have far surpassed your decision-making capability and you won't be in the loop anymore anyway.

  • Code generation is cheap in the same way talk is cheap.

    Every human can string words together, but there's a world of difference between words that raise $100M and words that get you slapped in the face.

    The raw material was always cheap. The skill is turning it into something useful. Agentic engineering is just the latest version of that. The new skill is mastering the craft of directing cheap inputs toward valuable outcomes.

    • > The new skill is mastering the craft of directing cheap inputs toward valuable outcomes.

      Strongly agree with this. It took me a while to realize that "agentic engineering" wasn't about writing software; it was about being able to very quickly iterate on bespoke tools for solving a very specific problem you have.

      However, as soon as you start unblocking yourself from the real problem you want to solve, the agentic engineering part is no longer interesting. It's great to be solving a problem and then realize you could improve it very quickly with a quick request to an agent, but you should largely be focused on solving the problem.

      Yet I see so many people talking about running multiple agents and just building something without much effort spent using that thing, as though the agentic code itself is where the value lies. I suspect this is a hangover from decades where software was valuable (we still have plenty of highly valued, unprofitable software companies as a testament to this).

      I'm reminded a bit of Alan Watts' famous quote in regards to psychedelics:

      > If you get the message, hang up the phone.

      If you're really leveraging AI to do something unique and potentially quite disruptive, very quickly the "AI" part should become fairly uninteresting and not the focus of your attention.

      • That's a great insight about iterating on bespoke tools. I have seen the most speed-up when diving into new tools, or making new tools, as AI can make the initial jump quite painless and I can get straight to the problem solving. But I get barely any speedup using it on legacy projects in tools I know well. Often enough it slows me down, so the net benefit is nil or worse.

        Another commenter said it makes the easy part easy and the hard part harder, which resonates with me at the moment.

        I am pretty excited by being able to jump deep into real problems without code being the biggest bottleneck. I love coding but I love solving problems more, and coding for fun is very different to coding for outcomes.

        • That's my observation / fear as well. It makes delivering something that sort of works easy. It makes doing that well more difficult by obscuring the problem domain from the humans and expanding the standard library of tools into patterns of using said standard library. Hope they're correct for your use case.

          There's also the question of the true cost of all the hardware, electricity, and potential output that's being tossed onto the pyres. We aren't getting the real Cortana from the books / games; we're getting GIR trained on the corpus of fallible human code, prompted by fallible humans.

      • It's funny that so many people are using AI and it still hasn't really shown up in productivity numbers or product quality yet. I'm going to be really confused if this is still the case at the end of the year. A whole year of access to these latest agentic models has to produce visible economic changes, or something is wrong.
        • >funny that so many people are using AI and still hasn't really shown up in productivity numbers or product quality yet.

          That's because the threat is now not other businesses, but your own users who decide to vibe-code their own "Claw" product instead of using your company's vibeslop, so there are no buyers for your single-week product. All these new harness developers are engaging in resume-driven development to save their own asses. The only ones that are not naked when the tide recedes are the ones that are able to jump to the next layer of abstraction on the infinite staircase, until the next tide comes five seconds later.

        • I used to think this was a sign that AI code isn't really useful, but I've changed my tune (also I believe these numbers have changed in the last few months).

          As an example: one of my most promising projects I was discussing with a friend, and we realized together we could potentially use these tools to build a two-person agency with no need to hire anyone ever. If this were to work, it could theoretically make nice revenue and shouldn't show up in any metric anywhere.

          Additionally I've heard of countless teams cancelling their contracts with outsourced engineers because cheap but bad coders in India are worse than an LLM and still cost more. I'm not sure if there's a number around this activity, but again, these types of changes don't show up in the usual places.

          My current belief is not that AI will replace traditional software engineering, but that it will replace a good chunk of the entire model of software.

          • >One of my most promising projects I was discussing with a friend and we realized together we could potentially use these tools to build a two person agency with no need to hire anyone ever...My current belief is not that AI will replace traditional software engineering it will replace a good chunk of the entire model of software

            You're not following your last line to its logical conclusion regarding your own prospects: no one is going to buy the vibeslop your two person agency is selling because they'd rather create and maintain their own vibeslop instead of dealing with yours.

            If you follow some of your thoughts to their logical conclusion you'll realize the parent is right: there will be limited productivity that ends up fueling the economy when nobody is buying each other's vibeslop.

            • We're not selling vibe slop; the "vibe slop" tools which work for one person enable automation of the tasks behind the services we sell. Whether or not we use AI behind the scenes is entirely irrelevant to the service we're providing, other than that it allows our margins to be higher and our speed of implementation to be faster.

              I absolutely agree that it's not logical to think "oh we'll sell our AI stuff", that's the old model (which is just a variation on SaaS). I suspect a lot of HNers can't imagine a "product" that isn't code, but that's not at all what I'm describing.

              The products that most people on HN have traditionally built are used by other companies to make money by allowing those processes to be scaled. AI, in many new cases, eliminates the need for a 'software' middle man. The case I'm describing is "I know how to make money doing X if only I could scale it up without hiring people" and my offering is "I can scale it up without hiring people".

              This is increasingly where I think the future of work is headed, and it's more than fine if you aren't convinced.

              • > it allows our margins to be higher and our speed of implementation to be faster

                Faster than what? You will be faster than your previous self, just like all of your competitors. Where’s the net gain here? Even if you somehow managed to capture more value for yourself, you’ve stopped providing value to 5-10x that many employees who are no longer employed.

                When costs approach zero on a large scale, margins do not increase. Low costs = you’re not paying anyone = your competitors aren’t paying anyone = your customers no longer have money = your revenue follows your costs straight to zero.

                Companies that provide physical services can’t scale without hiring. A one-man “crew” isn’t putting a roof on a data center.

                I want to be wrong. Tell me why you think any of this is wrong.

          • > If this were to work, could theoretically make nice revenue and it shouldn't show up in any metric anywhere.

            Except production GDP, the standard measure of economic activity.

            • Correct me if I’m wrong, but if two people create a SaaS that can replace a 50-person SaaS, compete on price, and the competitor is forced out of the market, wouldn’t this show up as a reduction in GDP? Efficiency (GDP/time_worked) should be up though, and AFAIK it isn’t.
              • 2 people are now producing what took 50 people previously.

                What are the 48 other people doing now? Presumably some other economic activity.

              • Yes, when prices of goods and services go down so will GDP. I've not seen evidence of the prices of SaaS going down in the past few years.
          • >One of my most promising projects I was discussing with a friend and we realized together we could potentially use these tools to build a two person agency with no need to hire anyone ever. If this were to work, could theoretically make nice revenue and it shouldn't show up in any metric anywhere.

            potentially...if this were to work...theoretically

            shouldn't show up? I would worry that something with so many variables wouldn't show up.

        • My intuition from talking to people across different parts of the industry is that adoption at bigger companies is really limited or slow, or totally banned. Additionally, some developers are not seeing it help their specific roles all that much anyway. This is hard to square with the success other people are having, but software is a super broad discipline, which I think explains a lot of the mixed success stories.

          It seems to depend a lot on the industry and niche you're in. Working at an agency, I get experience across many different projects and industries, and sometimes you are just at the edge of the AI's training data and it can get very unhelpful. Given that many if not most companies are working on proprietary code for domain-specific problems, that isn't all that surprising either.

        • This is actually an old syndrome with technology: it takes a long time for the effect to be reliably measured. Famously, it took many years for the internet itself to show up in significant productivity gains ("if the internet is actually useful, why don't the numbers show that?" was a common comment in the 1990s and 2000s). So it seems to me we're just seeing the usual dynamic here. Productivity in trillion-dollar economies does not turn on a dime.
          • >Famously, it took many years for the internet itself to show up in significant productivity gains

            Yeah, but the actual productivity gains that the internet and software tools introduced have had diminishing returns after a while.

            Like, are people more productive today using Outlook and Slack than they were 20 years ago using IBM Lotus Notes and IBM Sametime? I'm not. Are people more productive with the Excel of today than with Excel 2003/2007? I'm not. Are Windows 11 and macOS Tahoe making people more productive than Windows 7 and Snow Leopard? Not me. Are the IDEs of today offering so much more of a productivity boost than what Visual Studio, CodeWarrior and Borland Delphi did back in the day? I don't think so.

            To me it seems that at least on the productivity side, we've mostly been reinventing the wheel "but in Rust/Electron" for the last 15 or so years, and the biggest productivity gains came IMHO from increased compute power due to semiconductor advancement, so that the same tasks finished faster today than 20 years ago, but not that the SW or the internet got so much more capable since then.

            • I think the biggest productivity improvements in software development over the last ~20 years came from open source (npm install X / pip install Y saves so much time over constantly reinventing wheels) and automated tests.
        • I think if you're doing front-end development AI is good. If you are reading a db and sending a json to said webpage AI is decent, if you are doing literally anything else AI is next to useless.

          At least, in my own experience.

        • I wouldn't say it hasn't shown up. The number of ShowHNs per weekend has definitely gone up, and while that isn't rigorous scientific proof, I'd consider it a leading-edge indicator of something. Unfortunately, we as an industry have yet to agree on anything approaching a scientific measure of productivity, other than to collectively agree that Lines of Code is a terrible one. Thus even if someone was able to quantify that, say, they're having days where they generate 5,000 LoC when previously they were getting O(500) LoC, that's not something we could agree upon as improved productivity.

          So then the question is, is there anything other than feels to say productivity has or has not gone up? What would we accept as actual evidence one way or another? Commits-per-day is similarly not a good measure. Jira tickets and t-shirt sizes? We don't have a good measure, so while ShowHNs per weekend is equally dumb, it's also equally good in the bag of lies, damn lies, and statistics.

          • There was a post a few days ago about how the quality of ShowHN had gone down, with people asking how they could block this category of submissions - so I wouldn't be too quick to equate an increase in ShowHNs with anything positive.
            • Whether or not it is positive is a matter of opinion, but it's undeniable that the ShowHNs exist.
    • Or another way of looking at it: just because digging a ditch became cheap and fast with the backhoe doesn't mean you can just dig a bunch of ditches and become rich.
      • Yeah but there were a lot less ditch diggers in the world after the invention of the backhoe
        • If true, only because people knew where to dig and did it with purpose.
        • > Yeah but there were a lot less ditch diggers in the world after the invention of the backhoe

          As a specialization? Sure. But the ditch diggers moved since to machine operators, handymen and the like.

          In the past there were sysadmins. Do we have less software engineers since sysadmins ceased to be a thing?

          • > As a specialization? Sure. But the ditch diggers moved since to machine operators, handymen and the like.

            All of them? What if they liked digging ditches?

            > In the past there were sysadmins. Do we have less software engineers since sysadmins ceased to be a thing?

            Software Engineers were never sysadmins in the past, you’re thinking DevOps maybe?

            • The software engineers who like "digging ditches" are going to have a bad time in the new agentic engineering world, unfortunately.

              Here "digging ditches" corresponds to somebody else figuring out the detailed requirements and specification and handing it to the engineer to transcribe into code.

              That's what the coding agents replace. Thankfully for most engineers I've worked with that's only a small part of their overall jobs, albeit one of the most time consuming.

    • Indeed: The act of actually typing the code into an editor was never the hard or valuable part of software engineering. The value comes from being able to design applications that work well, with reasonable performance and security properties.
      • It wasn't the hard or valuable part of software engineering, but it was a very time-consuming part. That's what's interesting about this new era - the time-consuming-but-easy bit has suddenly stopped being time-consuming.
        • Agreed, often see cope from managers along the lines of “writing the code was never the bottleneck”. Well, it sure felt like it.
          • For most people who can type with more than 2 fingers, thinking what to type is slower than typing it.
      • Then why did most software fail to do that even before the advent of LLMs?
        • Because designing systems that work well is difficult. It takes years of experience to develop the muscle memory behind quality systems architecture. Writing the code is an implementation detail (albeit a large one).
        • Are we sure it's not failing anymore after the advent of LLMs?
        • Because coding bootcamps and CS programs were churning out squillions of people who could type the code but had poor design and analytical skills, because there was a time where being able to implement Dijkstra on a whiteboard would get you 400k at a FAANG.
          • And you think these people will now produce better results with the assistance of an LLM that was trained on their work?
            • No, that's the opposite of what I think.

              Bootcamp grads are basically obsolete now. The real skill has always been the ability to make good design decisions and that's still the case in the LLM era.

              • > The real skill has always been the ability to make good design decisions and that's still the case in the LLM era.

                For now maybe yes but the goal is totally removing the human from the decision loop regarding technical stuff.

              • > Bootcamp grads are basically obsolete now.

                I beg to differ. I know for a fact that some companies started hiring people with LLM experience, whose only expertise is spending all the Copilot enterprise account tokens in their first week at the job and proceeding to whine that the lack of tokens was stifling their creativity.

                Say what you may about boot camps, but at least the people getting hired could do things and understand what they are doing.

    • I think we’re falling into a trap of overestimating the value of incrementally directing it. The output is all coming from the same brain, so what stops someone from just getting lucky with a prompt and a generation that one-shots the whole thing you spent time breaking down and thinking about? The code quality will be the same, and unless you’re directing it to the point where you may as well be coding the old way, the decision-making is the same too.
    • fmbb
      Raising $100M doesn’t even mean you have a good idea or an idea people like or an idea you can even make money on.
  • arkh
    Every modern (and not so modern) software development method hinges on one thing: requirements are not known, and even if known they'll change over time. From this you get the goal of "good" code, which is "easy to change" code.

    Do current LLM based agents generate code which is easy to change? My gut feeling is a no at the moment. Until they do I'd argue code generated from agents is only good for prototypes. Once you can ask your agent to change a feature and be 100% sure they won't break other features then you don't care about what the code looks like.

    • All the hype is on how fast it is to produce code. But the actual bottleneck has always been the cost of specifying intent clearly enough that the result is changeable, testable, and correct AND that you build something that brings value.
    • > Once you can ask your agent to change a feature and be 100% sure they won't break other features then you don't care about what the code looks like.

      That bar is unreasonably high.

      Right now, if I ask a senior engineer to change a feature in a mature codebase, I only have perhaps 70% certainty they won't break other features. Tests help, but only so far.

      • This bar only seems high because the bar in most companies is already unreasonably low. We had decades of research into functional programming, formal methods and specification languages. However, code monkey culture was cheaper and much more readily available. Enterprise software development has always been a race to the bottom, and the excitement for "vibe coding" is just the latest manifestation of its careless, thoughtless approach to programming.
      • But if push comes to shove, any other engineer can come in and debug your senior engineer's code. That's why we insist on people creating easy-to-change code.

        With auto generated code which almost no one will check or debug by hand, you want at least compiler level exactitude. Then changing "the code" is as easy as asking your code generator for new things. If people have to debug its output, then it does not help in making maintainable software unless it also generates "good" code.

      • There are limits to how badly such a senior can screw up, or more likely forget some corner-case situation. And he/she is on top of their own code and the whole codebase, getting better each time, changing only what's needed, reverting unnecessary changes, seeing the bigger picture. That's (also) seniority.

        An LLM brings an illusion of that: a statistical model that may or may not hit what you need. Repeat the question twice and the senior will be better at the task the second time. The LLM will simply produce different output, maybe.

        Do you feel like you have full control over what's happening here? Business has an absolutely insatiable lust for control, and IT systems are an area of each business that the C-suite always feels they have the least control of.

        Reproducibility and general trust are not marginal but the core of good deliveries. Just read this thread - LLMs have 0 of that.

    • I am constantly getting LLMs to change features and fix bugs. The key is to micromanage the LLM and its context, and read the changes. It's slower than vibe coding but faster than coding by hand, and it results in working, maintainable software.
      • A study last year concluded that while AI coding feels faster it actually isn't. At least in mid 2025.

        https://news.ycombinator.com/item?id=44522772

        • The comments explain the nuance there pretty well:

          > This study had 16 participants, with a mix of previous exposure to AI tools - 56% of them had never used Cursor before, and the study was mainly about Cursor.

          > My intuition here is that this study mainly demonstrated that the learning curve on AI-assisted development is high enough that asking developers to bake it into their existing workflows reduces their performance while they climb that learning curve.

          Giving people a tool they have no experience with and expecting them to be productive feels... odd?

        • That's a good point. I myself am the easiest person to fool.

          I knocked together a quick analysis of my commit graphs going back several years, if you're interested: https://mccormick.cx/gh/

          My average leading up to 2023 was around 2k commits per year. 2023 I started using ChatGPT and I hit my highest commits so far that year at 2,600. 2024 I moved to a different country, which broke my productivity. I started using aider at the end of 2024 and in 2025 I again hit my highest commits ever at 2,900. This year is looking pretty solid.

          From this it looks to me like I'm at least 1.4x more productive than before.

          As a freelancer I have to track issues closed and hours pretty closely so I can give estimates and updates to clients. My baseline was always "two issues closed per working day". These are issues I create myself (full stack, self-managed freelancer) so the average granularity has stayed roughly constant.

          This morning I closed 8 issues on a client project. I estimate I am averaging around 4 issues per working day these days. I know this because I have to actually close the issues each day. So on that metric my productivity has roughly doubled.

          I believe those studies for sure. I think there is nuance to using these tools well, and I think a lot of people are going backwards and introducing more bugs than progress through vibe coding. I do not think I have gone backwards, and the metrics I have available seem to agree with that assessment.
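
          For anyone who wants to run the same kind of per-year commit tally on their own history, a minimal sketch (the dates here are made up; in practice the input would come from something like `git log --pretty=%ad --date=short`):

```python
from collections import Counter

# Hypothetical sample of ISO-formatted commit dates, standing in for the
# output of: git log --pretty=%ad --date=short
commit_dates = [
    "2022-03-01", "2022-07-15",
    "2023-01-09", "2023-02-02", "2023-11-30",
]

def commits_per_year(dates):
    """Tally commits by the year prefix of each ISO date string."""
    return Counter(d[:4] for d in dates)

print(commits_per_year(commit_dates))  # Counter({'2023': 3, '2022': 2})
```

          Commit counts are a rough proxy at best, as the comment itself notes, but this makes the year-over-year comparison reproducible.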

        • 6 months ago in AI development is too old to be relevant.
    • > Do current LLM based agents generate code which is easy to change?

      They do. I am no longer writing code, everything I commit is 100% generated using an agent.

      And it produces code depending on the code already in my code-base and based on my instructions, which tell it about clean code and good practices.

      If you don't get maintainable code from an LLM it's for this reason: Garbage in, garbage out.

      • Doesn’t this presuppose that you already know how to produce good code? How will anyone in the future do this when they haven’t actually programmed?
    • This is the brake on “AI will replace all developers”.

      Coding is a correctness-discovery process. For a real product, you need to build it to know the right thing to build. As the product matures, those constraints increase in granularity down to tighter bits of code (security, performance, etc.)

      You can have AI write 100% of the code but more mature products might be caring about more and more specific low level requirements.

      The times you can let an agent swarm just go are cases very well specified by years of work (like the Anthropic C compiler).

    • I'd add in "code is easier to write than it is to read" - hence abstraction layers designed to present us with higher level code, hiding the complex implementations.

      But LLMs are both really good at writing code _and_ reading code. However, they're not great at knowing when to stop - either finishing early and leaving stuff broken, over-engineering and adding in stuff that's not needed or deciding it's too hard and just removing stuff it deems unimportant.

      I've found a TDD approach (with not just unit tests but high-level end-to-end behaviour-driven tests) works really well with them. I give them a high-level feature specification (remember Gherkin specifications?) and tell it to make that pass (with unit tests for any intermediate code it writes), make sure it hasn't broken anything (by running the other high-level tests) then, finally, refactor. I've also just started telling it to generate screenshots for each step in the feature, so I can quickly evaluate the UI flow (inspired by Simon Willison's Rodney tool).

      Now I don't actually need to care if the code is easy to read or easy to change - because the LLM handles the details. I just need to make sure that when it says "I have implemented Feature X" that the steps it has written for that feature actually do what is expected and the UI fits the user's needs.
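
      A minimal sketch of that red-first loop, in plain Python (the `slugify` feature and its test are invented for illustration, not taken from the commenter's projects):

```python
# Red/green sketch: the behaviour test is written first and fails ("red"),
# then the implementation is written - or generated - to make it pass.
# `slugify` is a hypothetical feature used purely for illustration.

def slugify(title):
    """Turn a page title into a URL slug."""
    return "-".join(title.lower().split())

def test_slugify_behaviour():
    # These assertions existed before slugify did (the "red" step).
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Spaces   everywhere ") == "spaces-everywhere"

test_slugify_behaviour()
print("behaviour tests green")
```

      The same shape scales up to the Gherkin-style end-to-end specifications described above: the agent's job is to turn red into green without turning any other green test red.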

    • > Do current LLM based agents generate code which is easy to change?

      Yes, if that's your goal and you take steps to achieve that goal while working with agents.

      That means figuring out how to prompt them, providing them good examples (they'll work better in a codebase which is already designed to afford future changes since they imitate existing patterns) and keeping an eye on what they're doing so you can tell them "rewrite that like X" when they produce something bad.

      > Once you can ask your agent to change a feature and be 100% sure they won't break other features

      That's why I tell them to use red/green TDD: https://simonwillison.net/guides/agentic-engineering-pattern...

    • We won't be able to be sure of 100% with LLMs, but maybe proper engineering around evals gets us to an acceptable level of quality based on the blast radius/safety profile.

      I'd also argue that we should be pushing towards tracer bullets as a development concept, and less so prototypes, which are nice but meant to be thrown away; people might not actually throw them away.

      The clean room auto porting, after a messy exploratory prototyping session would be a nice pattern, nonetheless.

    • No. They’re great at slopping out a demo, but god help you if you want minor changes to it. They completely fall apart.
    • Irrelevant because you are not going to make new changes by hand. You will use AI for that.
  • Each line of code is a liability.

    I think it’s funny that we’re all measuring lines of code now and smiling.

    It was/is expensive because engineers are trying to manage the liability exposure of their employers.

    Agents give us a fire hose of tech debt that anyone can point at production.

    I don’t think the tool itself is bad. But I do think people need to reconsider claims like this and be more careful about building systems where an unaccountable program can rewrite half your code base poorly and push it to production without any guard rails.

    • This is the under-discussed part. We spent decades building authorization layers around code deployment -- review gates, CI checks, staging, rollback. That infrastructure exists because code changes are high-consequence operations that compound.

      "Code is cheap" means generation is cheap. But the authorization to ship it remains expensive for very good reason. Removing the generation bottleneck without replacing the judgment layer is a control gap, not a productivity gain.

      The parallel is giving someone a master key because they're fast at opening doors. Speed was never the constraint that justified the lock.

    • Writing code is expensive. Maintaining code is expensive. Whatever way, it ain't cheap.
  • I'm going to shill my own writing here [1] but I think it addresses this post in a different way. Because we can now write code so much faster and quicker, everything downstream from that is just not ready for it. Right now we might have to slow down, but medium and long term we need to figure out how to build systems in a way that it can keep up with this increased influx of code.

    > The challenge is to develop new personal and organizational habits that respond to the affordances and opportunities of agentic engineering.

    I don't think it's the habits that need to change, it's everything. From how accountability works, to how code needs to be structured, to how languages should work. If we want to keep shipping at this speed, no stone can be left unturned.

    [1]: https://lucumr.pocoo.org/2026/2/13/the-final-bottleneck/

    • fmbb
      I don’t think we can expect all workers at all companies to just adopt a new way of working. That’s not how competition works.

      If agentic AI is a good idea and if it increases productivity, we should expect to see some startup blowing everyone out of the water. I think we should be seeing it now if it makes you, say, ten times more productive. A lot of startups have had a year of agentic AI now to help them beat their competitors.

      • ej88
        We're already seeing eye-watering, blistering growth from the new hot applied AI startups and labs

        Imo the wave of top down 'AI mandates' from incumbent companies is a direct result of the competitive pressure, although it probably won't work as well as the execs think it will

        that being said, even Dario claims a 5-20% speedup from coding agents; 10x productivity only exists in microcosm prototypes, or if someone was so unskilled that one-shotting a localhost web app is a 10x for them

        • "eye-watering, blistering growth from the new hot applied AI startups and labs"

          Could you give us a few examples?

          • claude code 1B+ arr

            ant 10xing ARR, oai

            harvey legora sierra decagon 11labs glean(ish) base10(infra) modal(infra) gamma mercor(ish) parloa cognition

            regulated industries giving these companies 7/8-fig contracts less than 2 years from incorporation

          • Claude Cowork was apparently built in less than two weeks using Claude Code, and appears to be getting significant usage already.
            • Only a personal anecdote, but the humans I know that have used it are all aware of how buggy it is. It feels like it was made in 2 weeks.

              Which gets back to the outsourcing argument: it’s always been cheap to make buggy code. If we were able to solve this, outsourcing would have been ubiquitous. Maybe LLMs change the calculus here too?

            • That's certainly a good example of a tool developed quickly thanks to AI assistance.

              But coding assistance tools must themselves be evaluated by what they produce. We won't see significant economic growth through using AI tools to build other AI tools recursively unless there are companies using these tools to make enough money to justify the whole stack.

              I believe there are teams out there producing software that people are willing to pay for faster than they did before. But if we were on the verge of rapid economic growth, I would expect HN commenters to be able to rattle these off by the dozen.

        • AI has been a lifesaver for my low performing coworkers. They’re still heavily reliant on reviews, but their output is up. One of the lowest output guys I ever worked with is a massive LinkedIn LLM promoter.

          Not sure how long it’ll last though. With the time I spend on reviews I could have done it myself, so if they don’t start learning…

          • > With the time I spend on reviews I could have done it myself, so if they don’t start learning…

            Then? Your job is still to review their code. If they are your coworker, you can not fire them.

            • Then just start rubber-stamping their code. Say you "vibe" read it.
      • OpenClaw went from first commit in late November to Super Bowl commercial (it's meant to be the tech behind that AI.com vaporware thing) in February.

        (Whether you think OpenClaw is good software is kind of beside the point.)

        • OpenClaw is not going to be a thing in 6 months. The core idea might exist but that codebase is built on a house of cards and is being replicated in 10% of the code.

          I don’t think anyone is arguing against code agents being good at prototypes, which is a great feat, but most SWE work is built on maintaining code over time.

        • fmbb
          It’s very much not beside the point. Productivity is measured in how much value you get out from the hours your workers put in.
          • But that only gets you to a philosophical argument about what "value" is. Many would argue that being able to get your thing into a Super Bowl commercial is extremely valuable. I definitely have never built anything that did.

            It's very much imperfect, but the only consistently agreed upon and useful definition of "value" we have in the West is monetary value, and in that sense, we have at least a few major examples of AI generating value rapidly.

            • OK but that also means VR was a success, and web 3, and NFTs.
              • Well, yes, these were definitely a success for some. And I personally still believe that VR will be a success in the longer-term.

                In any case, I agree with the grandparent post about the distinction between being successful and good.

    • Nice to see you here (Just reached out on bluesky over sandboxing - gandolin). I follow your work and agree and am hoping that you and others who have well earned audiences based on awesome open source work, can help with the advocacy on mental shifts, not just for developers but also non Devs that become builders.

      I'm very focused on their minimalistic building experience as a way to make me and other traditional developers not the bottleneck, empowering them end to end.

      I think AI evals [1] are a big part of that route and hope that different disciplines can finally have probable product design stories [2] instead of there being big gaps of understanding between them.

      [1] https://alexhans.github.io/posts/series/evals/measure-first-...

      [2] https://ai-evals.io

    • One of the most interesting aspects is when LLMs are cheap and small enough that apps can ship with a built-in one that can adjust code for each user based on input/usage patterns.
      • The clear intent is to stop allowing regular people to be able to compute...anything. Instead, you'll be given a screen that only connects to $LLM_SERVER and the only interface will be voice/text in which you ask it to do things. It then does those things non-deterministically, and slower than they would be done right now. But at least you won't have control over how it works!
        • Whether or not the intent is as nefarious as you suggest, that type of UI is going to be a boon for a lot of people. Most people on the planet are incredibly computer illiterate.
      • If this could ever happen, there will be no point in GUI apps anymore, your AI assistant or what have you will just interact with everything on your behalf and/or present you with some kind of master interface for everything.

        I don't see a bunch of small agents in the future, instead just one per device or user. Maybe there will be a fleeting moment for GUI/local apps to tie into some local, OS LLM library (or some kind of WebLLM spec) to leverage this local agent in your app.

        • >If this could ever happen, there will be no point in GUI apps anymore, your AI assistant or what have you will just interact with everything on your behalf and/or present you with some kind of master interface for everything.

          sort of how the hammer is the most useful tool ever and all we have to do is make everything that needs doing look like a nail.

        • Agents will still have to communicate with each other; the communication protocols, and how data is stored, presented and queried, will be important for us to decide.

          Will we stop using web browsers as we understand them today in the next few decades in favor of only interacting with agents? Maybe.

        • a new kind of operating system that, instead of having all those annoying apps, will just be an agent that does whatever stuff you need|want|it can - why not? But still, we're gonna need to be able to maintain many contexts, access remote servers and the local file system, and sort and present data, be it local or remote. One screen, one button, a philosopher's stone made of silicon
      • I've heard this referenced multiple times and I have yet to hear the value be clearly articulated. Are you saying that every user would eventually be using a different app? Wouldn't it eventually get to the point that negates the need for the app developer anyways since you would eventually be unable to offer any kind of support, or are we just talking design changing while the actual functionality stays the same? How would something like this actually behave in reality?
        • I don't know!

          These are valid points, taken to the extreme we will have apps that cannot be supported.

          In short term, we already have SQL/reports being automated. Lovable etc is experimenting with generating user interfaces from prompts, soon we will have complete working apps from a prompt. Why not have one core that you can expand via a prompt?

          I am currently studying and depending heavily on Anki, and it's been amazing to use Claude Code to add new functionality on the fly. It's a holy mess of inconsistent/broken UX, but it so clearly gives me value over the core version. Sometimes it breaks, but CC can usually fix it within a prompt or two.

        • > I've heard this referenced multiple times and I have yet to hear the value be clearly articulated.

          Me too, and I see this as _incredibly_ wasteful.

      • LISP returns!
    • >but medium and long term we need to figure out how to build systems in a way that it can keep up with this increased influx of code.

      Why? Why do we need to "write code so much faster and quicker" to the point we saturate systems downstream? I understand that we can, but just because we can doesn't mean we should.

      • > to the point we saturate systems downstream

        But that's the point of TFA, no? Now that writing code is no longer the bottleneck, the upstream and downstream processes have become the new bottlenecks, and we need to figure out how to widen them.

        As I see it, the end goal for all of this is generating software at the speed of thought, or at least at the speed of speech. I want the digital butler to whom I could just say - "I'm not happy with the way things happened today, please change it so that from here on, it'll be like X" - and it'll just respond with "As you wish", and I'll have confidence that it knows me well enough and is capable enough to have actually implemented the best possible interpretation of what I asked for, and that the few miscommunications that do occur would be easy to fix.

        We're obviously not close to that yet, but why shouldn't we build towards it?

        • > Now that writing code is no longer the bottleneck

          I think it’s contestable that writing the code was ever the main bottleneck.

          > As I see it, the end goal for all of this is generating software at the speed of thought, or at least at the speed of speech.

          The question is what distinguishes that from having AGI, and if the answer is “nothing”, then that will change the whole game entirely again.

          • Oh, absolutely, my vision depends on AGI (and maybe even ASI), and I definitely agree that it'll be a whole new ball game.
      • If we want to continue to ship at that speed we will have to. I’m not sure if we should, but seemingly we are. And it causes a lot of problems right now downstream.
        • We were already rushing and churning products and code of inferior quality before AI (let's e.g. consider the sorry state of macOS and Windows in the past decade).

          Using AI to ship more and more code faster, instead of to make code more mature, will make this worse.

          • I want to use AI to ship more and more code faster and better. If AI means our product quality goes down we should figure out better ways to use it.
            • I'm betting on it meaning the product quality going down - and technical debt increasing, which will be dealt with via more AI in a downward spiral. Meanwhile, college CS majors won't ever bother learning the basics (as AI will handle their coursework, and even their hobby work). Then future AI will train on previous AI output, with the degradation that brings...
            • Shouldn't you want to ship less code that does more? Since when was LoC the relevant benchmark for engineering?
              • Less code isn't as important as it used to be, because the cost of maintaining (simple) code has gone down as well.

                With coding agent projects I find that investing in DRY doesn't really help very much. Needing to apply the same fix in two places is a waste of time as a human. An agent will spot both places with grep and update them almost as fast as if there was just one.

                It's another case where my existing programming instincts appear to not hold as well as I would expect them to.
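
                A toy illustration of the "grep finds both places" point (the file names and the hard-coded rate are invented for this sketch):

```python
import re

# Sketch: the same hard-coded VAT rate is duplicated across two modules.
# An agent (or a human with grep) can locate every occurrence and patch
# each one, which is why duplication costs less to maintain than it used to.
sources = {
    "orders.py":  "total = price * qty * 1.20  # VAT",
    "invoice.py": "amount = price * qty * 1.20  # VAT",
    "readme.md":  "Prices include VAT.",
}

def files_with_pattern(files, pattern=r"\* 1\.20"):
    """Return the sorted names of files containing the duplicated pattern."""
    return sorted(name for name, text in files.items()
                  if re.search(pattern, text))

print(files_with_pattern(sources))  # ['invoice.py', 'orders.py']
```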

                • When you talk about maintaining code, do you mean having the LLM do it and you maintain a write-only codebase? Because if you're reading the code yourself and you have a bloated tangled codebase it would make things much harder right?

                  Is the goal basically a codebase where your interactions are mediated through an LLM?

                  • The goal is "good code" based on my list of criteria, which includes both "simple and minimal" and "the design affords future changes".

                    A bloated, tangled codebase isn't good code, because it's harder to understand and make changes to than the equivalent non-bloated codebase.

                    But... bloat does look a little bit different when you no longer need to optimize code for saving humans typing time.

                    Much of the confusing code I've encountered during my career has been confusing because it had too many layers of indirection, which happened because someone was applying DRY too aggressively because they didn't want to duplicate even the smallest pieces of logic in more than one place.

                    Good coding agents will only DRY like that if you tell them to.

    • The focus is on downstream, but is upstream ready for this speed up?

      The linked blog post draws comparisons to the industrial revolution; however, in the industrial revolution the speed-up caused innovation upstream, not downstream.

      The first innovation was mechanical weaving. The bottleneck was then yarn. This was automated so the bottleneck became cotton production, which was then mechanised.

      So perhaps the real bottleneck of being able to write code faster is upstream.

      Can requirements for what to build keep pace with our ability to deliver it?

    • Totally agree - that's what I was trying to get at with "organizational habits". The way we plan, organize and deliver software projects is going to radically change.

      I'm not ready to write about how radically though because I don't know myself!

    • I was having this conversation at work, where if the promise of AI coding becomes true and we see it in delivery speed, we would need to significantly increase the throughput of all other aspects of the business.
    • > If we want to keep shipping at this speed

      Do we? Spewing features like explosive diarrhea is not something I want.

    • The linked article is worth reading alongside this one.

      The thing I'd add from running agents in actual production (not demos, but workflows executing unattended for weeks): the hard part isn't code volume or token cost. It's state continuity.

      Agents hallucinate their own history. Past ~50-60 turns in a long-running loop, even with large context windows, they start underweighting earlier information and re-solving already-solved problems. File-based memory with explicit retrieval ends up being more reliable than in-context stuffing - less elegant but more predictable across longer runs.
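      A minimal sketch of that file-based memory pattern (the file name, JSONL format, and function names here are my own illustrative inventions, not any particular framework's API):

```python
import json
from pathlib import Path

MEMORY_FILE = Path("agent_memory.jsonl")
MEMORY_FILE.unlink(missing_ok=True)  # fresh file for this demo

def remember(topic: str, fact: str) -> None:
    """Persist a fact durably instead of relying on the context window."""
    with MEMORY_FILE.open("a") as f:
        f.write(json.dumps({"topic": topic, "fact": fact}) + "\n")

def recall(topic: str) -> list[str]:
    """Explicitly retrieve only the facts relevant to the current step."""
    if not MEMORY_FILE.exists():
        return []
    facts = []
    for line in MEMORY_FILE.read_text().splitlines():
        record = json.loads(line)
        if record["topic"] == topic:
            facts.append(record["fact"])
    return facts

remember("api", "rate limit is 100 req/min")
remember("api", "auth token rotates daily")
remember("deploy", "staging runs on port 8081")

# Only the relevant facts are fed back into the prompt, however long the run gets.
assert recall("api") == ["rate limit is 100 req/min", "auth token rotates daily"]
```

      Because retrieval is explicit and scoped, the agent's effective memory cost stays flat no matter how many turns have elapsed, which is exactly the predictability that in-context stuffing loses past a few dozen turns.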

      Second hard part: failure isolation. If an agent workflow errors at step 7 of 12, you want to resume from step 6, not restart from zero. Most frameworks treat this as an afterthought. Checkpoint-and-resume with idempotent steps is dramatically more operationally stable.
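      A rough sketch of that checkpoint-and-resume shape (the step names and the flaky step are hypothetical; a real harness would also persist step outputs, not just an index):

```python
import json
from pathlib import Path

CHECKPOINT = Path("workflow_state.json")
CHECKPOINT.unlink(missing_ok=True)  # fresh run for this demo

def load_checkpoint() -> int:
    """Index of the last successfully completed step, or -1 on a fresh run."""
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text())["last_done"]
    return -1

def run_workflow(steps) -> list[str]:
    """Run steps in order, resuming after the last checkpoint.

    Steps must be idempotent: re-running one after a crash must be safe."""
    results = []
    for i in range(load_checkpoint() + 1, len(steps)):
        results.append(steps[i]())  # may raise; checkpoint advances only on success
        CHECKPOINT.write_text(json.dumps({"last_done": i}))
    return results

attempts = {"b": 0}

def step_a():
    return "a done"

def step_b():  # fails on the first attempt, like a flaky API call
    attempts["b"] += 1
    if attempts["b"] == 1:
        raise RuntimeError("transient failure")
    return "b done"

def step_c():
    return "c done"

steps = [step_a, step_b, step_c]
try:
    run_workflow(steps)  # crashes at step_b; step_a's checkpoint survives
except RuntimeError:
    pass

# The retry resumes at step_b; step_a is never re-executed.
assert run_workflow(steps) == ["b done", "c done"]
```

      The point isn't the dozen lines, it's the discipline they force: every step has to be safe to re-run, which is what makes unattended multi-week loops operationally boring.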

      Agree it's not just habits - the infrastructure mental model has to change too. You're not writing programs so much as engineering reliability scaffolding around code that gets regenerated anyway.

  • I basically fully agree with this. I am not sure how to handle the ramifications of this in my day to day work yet. But at least one habit I have been forming: sometimes I find that even though the cost of writing code is immensely cheap, reviewing and validating that it works in certain code bases (like the millions of line mono repo I work in at my job) is extremely high. I try to think through, and improve, our testability such that a few-hundred-line code change that modifies the DB really can be a couple of hours of work.

    Also, I do want to note that these little "Here is how I see the world of SWE given current model capabilities and tooling" posts are MUCH appreciated, given how much you follow the landscape. When a major hype wave is happening and I feel like I am getting drowned on twitter, I tend to wonder "What would Simon say about this?"

    • > I find that even though the cost of writing code is immensely cheap, reviewing and validating that it works in certain code bases (like the millions of line mono repo I work in at my job) is extremely high.

      That is my observation as well. Churning code is easy, but making sure the code is not total crap is a completely new challenge and concern.

      It's not like code reviews didn't require work prior to LLMs. Far from it. It's just that the code is now generated in a completely different way, and in some cases with barely any oversight from vibecoders who are trying to punch way above their weight. So they generate these massive volumes of changes that fail in obvious and subtle ways, and the flow is relentless.

      • What tremendously helps is asking the LLM to add lots of explanation, by adding comments to each and every line or function.

        You can remove those comments afterwards if you feel they are too much, but it helps the reviewing a lot.

        More a trick than a silver bullet but it's nice.

        • > What tremendously helps is asking the LLM to add lots of explanation, by adding comments to each and every line or function.

          No, it doesn't. It's completely useless and unhelpful. These machine-generated comments are only realizations of the context that already outputted crap. Dumping volumes of this output adds more work for reviewers, who have to parse through it to figure out the mess presented by vibecoders who didn't even bother to check what they are generating.

          • You don't get me. I don't say that those are good comments, I even say that you should probably delete them afterwards.

            But as you say, they are a realization of the LLM's context. Their role is not to help you understand what the code is doing, but how the LLM understood the problem and how it tried to solve it. Now you can compare its understanding with yours.

            Now I need to add context myself: I'm not talking about vibe coding entire apps; there, adding verbosity wouldn't help a lot. My main usage of LLMs is at $JOB, where I need to execute "short" tasks in codebases I barely know most of the time; that's where I use this trick. It also has the side benefit of helping me understand the codebase better.

  • I’d like to add an obvious point that is often overlooked.

    LLMs take on a huge portion of the work related to handling context, navigating documentation, and structuring thoughts. Today, it’s incredibly easy to start and develop almost any project. In the past, it was just as easy to get overwhelmed by the idea of needing a two-year course in Python (or any other field) and end up doing nothing.

    In that sense, LLMs help people overcome the initial barrier, a strong emotional hurdle, and make it much easier to engage in the process from the very beginning.

  • The cost of code never lived in the typing — it lived in the intent, the constraints, and the reasoning that shaped it. LLMs make the typing cheap, but they don’t make the reasoning cheap. So the economics shift, but the bottleneck doesn’t disappear.
    • For most non-hobby projects, the cost of code was in breaking a working system (whether by a bona fide bug, or a change in some unspecified implicit assumption). That made changes to code incredibly expensive - often much more than the original implementation.

      It sounds harsh, but over the lifetime of a project, 10-lines/person/day is often a high estimate of the number of lines produced. It’s not because humans type so slowly - it is because after a while, it’s all about changing previously written lines in ways that don’t break things.

      LLMs are much better at that than humans, if the constraints and tests are reasonably well specified.

      • > if the constraints and tests are reasonably well specified.

        if they are, then why would a human be so slow? You're not comparing the same situation.

        • The human in that case is not "so slow", but in the current state of things they are slower than an LLM, as simple as that.

          The difference comes in confidence that the solution works and can be maintained in the future, but in terms of purely making the decisions and applying the changes, an LLM is faster when it has all the required info available.

        • Because humans need to type with a keyboard, then click around with a mouse.

          In that time the LLM has made a change, ran tests, committed, pushed, checked that the CI build failed, looked at the CI logs, fixed the issue and the PR is now passing.

          • In what world is an LLM generating text faster than you can type?

            They are certainly slower than me at text generation.

            • Wow you must be fast at typing!

              Can you beat this one? https://chatjimmy.ai/

              (It is to be fair running a pretty trash model, a quantized Llama 3.1 8B from ~18 months ago)

    • Agree, and I'd add: the feedback loop between decision and consequence got dramatically shorter. You can test an architectural hypothesis in hours instead of weeks. That part is genuinely powerful.

      But faster feedback also means bad decisions propagate faster. The skill isn't generating three implementations in parallel -- it's knowing which one won't page you at 3am. That judgment comes from having been paged at 3am enough times to recognize the patterns.

      • > The skill isn't generating three implementations in parallel -- it's knowing which one won't page you at 3am.

        100% this.

    • It doesn't disappear, but it does ease some instances of fighting with configuration, documentation, or syntax, or even comparing three approaches that are similar but whose full effects you don't know.

      I think it's a very fun space, finally being able to empower many people who in the past would've been bottlenecked unless they were using very simple tools for their domain and had upskilled enough. Those 2 things will still be true, but the speed at which some things can happen at the exploration and other layers has increased significantly.

      Other problems like entropy/slop, security, system testing, lack of automation fundamentals arise but it's a good problem to start tackling.

      I'm very focused on evals [1] because it's what allows me not to be the bottleneck with the economists I want to empower to code end to end, and I'd like that mental shift to happen for anyone becoming a builder, so that non-traditional developers and developers by trade have a common language for product building [2]. That part of speaking to different audiences, and combating hype that promises to do everything for you vs the intent that's actually needed, is hard, but trying gets you a long way.

      [1] https://alexhans.github.io/posts/series/evals/measure-first-...

      [2] https://ai-evals.io

    • > LLMs make the typing cheap, but they don’t make the reasoning cheap.

      LLMs lower the cost of copy/pasting code around, or troubleshooting issues using standard error messages.

      Instead of going through Stack Overflow to find how to use a framework to do some specific thing, you prompt a model. You don't even need to know a thing about the language you are using to leverage a feedback loop.

      LLMs lower the cost of a multitude of drudge work in developing software, such as having to read the docs to learn how a framework should be used to achieve a goal. You still need to know what you are doing, but you don't need to reinvent the wheel.

  • > any time our instinct says "don't build that, it's not worth the time" fire off a prompt anyway, in an asynchronous agent session where the worst that can happen is you check ten minutes later and find that it wasn't worth the tokens.

    They're right that new habits are needed. And this is where everyone should start. Sometimes a quick prompt has killed 5 hours of meetings discussing whether it was worth it.

  • I don't agree that the code is cheap. It doesn't require a pipeline of people to be trained and that is huge, but it's not cheap.

    Tokens are expensive. We don't know what the actual cost is yet. We have startups, who aren't turning a profit, buying up all the capacity of the supply chain. There are so many impacts here that we don't have the data on.

    • Writing code is cheaper than ever. Maintaining it is exactly the same as ever and it scales with the LOC.

      Code is still liability but it's undeniable that going from thought to running code is very cheap today.

      • You completely ignored the post you're replying to.

        To recap, the author disagrees that writing code is cheap, because we've collectively invested trillions of dollars and redirected entire supply chains into automating code generation. The externalities will be paid for generations to come by all of humanity; it's just not reflected in your Claude subscription.

        • GP is not totally ignoring the post he replied to: we have models that are basically 6 months behind the closed SOTA models, that we can run in the cloud, and we know exactly how much these cost to run.

          The cat is out of the bag: compute will keep getting cheaper, as it has for 60 years or so.

          It's always been maintenance that's been the killer and GP is totally right about that.

          And if we look at a company like Cloudflare, which basically didn't have any serious outage for five years, then had five serious outages in the six months since they drank the AI Kool-Aid, we kinda have a first data point on how amazing AI is from a maintenance point of view.

          We all know we're generating more lines of underperforming, insecure, probably buggy code than ever before.

          We're in for a wild ride.

          • > compute shall keep getting cheaper as it's always been since 60 years or something

            past success is not a strong indicator for future success.

      • Maintaining it is becoming more costly. The increasing burden of review on FOSS maintainers is one example. AWS going down because an agent decided to re-write a piece of critical infrastructure is another. We are rapidly creating new kinds of liability.
        • This burden of review will go down as FOSS maintainers involve AI more.
          • Unlikely. FOSS is mostly driven by zero-cost maintenance, but AI tools need money to burn. So only a few FOSS projects will receive sponsored tooling, and some will definitely refuse to use it for ideological reasons (for example, it could be considered a poison pill from a copyright perspective).
    • > We don't know what the actual cost is yet.

      We kind of do? Local models (though not state of the art) set a floor on this.

      Even if prices are subsidized now (they are) that doesn't mean they will be more expensive later. e.g. if there's some bubble deflation then hardware, electricity, and talent could all get cheaper.

  • Here's an easy to understand example. I've been playing EvE Online and it has an API with which you can query the game to find information on its items and market (as well as several other unrelated things).

    It seems like a prime example for which to use AI to quickly generate the code. You create the base project and give it the data structures and calls, and it quickly spits out a solution. Everything is great so far.

    Then you want to implement some market trading, so you need to calculate opportunities from the market orders vs their buy/sell prices vs unit price vs orders per day etc. You add that to the AI spec and it easily creates a working solution for you. Unfortunately, once you run it, it takes about 24 hours to update, making it near worthless.

    The code it created was very cheap, but also extremely problematic. It made no consideration for future usage, so everything from the data layer to the frontend has issues that you're going to be fighting against. Sure, you can refine the prompts to tell it to start modifying code, but soon you're going to be sitting with more dead code than actual useful lines, and it will trip up along the way with so many more issues that you will have to fix.

    In the end it turns out that that code wasn't cheap at all and you needed to spend just as much time as you would have with "expensive code". Even worse, the end product is still nearly as terrible as the starting product, so none of that investment gave any appreciable results.

    • been there. for these kinds of projects i never start with code, i start with specs. i sometimes spend days just working on the specs. once the specs are clear i start coding, which is completely different monster. basically not that different from a common workflow (spec > ticket > grooming > coding; more or less), but everything with AI.
      • Yeah, I make sure to spend my time on the hard problems and where you need to design for the future. I use AI up to around method level after that, have it do the drudge work of typing up tedious text, or to complete required boilerplate etc.
      • I do the same, plus add tests from early on. New features then naturally are accompanied by more tests.
    • Did you tell it to consider future usage? Have you tried using it to find and remove dead code? In my experience you can get very good code if you just do a few passes of AI adversarial reviews and revisions.
    • Sounds like every other software project to me …
  • code has never been expensive.

    ridiculous asks are expensive. Not understanding the limitations of computer systems is expensive.

    The main problem is, and always will be, communication. Engineers in general are quick to say "that won't work as you described" because they can see the steps that it takes to get there. Sales guys (CEOs) live in a completely different world, and they "hear" "I won't do that" from technical types. It's the ultimate impedance mismatch and the subject of countless seminars.

    AI writing code at least reduces the cost of the inevitable failures, but doesn't solve the root problem.

    Successful businesses will continue to be those whose CTO/CEO relationship is a true partnership.

  • Writing code has been cheap for a while now.

    Writing good software is still expensive.

    It's going to take everybody a while to figure that out (just like with outsourcing)

    • Yeah, it’s odd watching the outsourcing debate play out again. The results are gonna be the same.

      Which is a shame, cause I think LLMs have a lot more use for software dev than writing code. And that’s really what’s going to shift the industry - not just the part willing to cut on quality.

    • Dollars to donuts that at some point someone is going to discover that senior engineers spend just as much time reviewing, fixing, and dealing with blowups caused by shitty AI-generated code produced by more junior coders as they did providing various forms of mentoring to said junior coders, except those junior coders became better developers in the latter case, whereas the AI generates the same shitty results, or even worse, inconsistent-quality code.
  • It's like the allegory of the retired consultant's $5000 invoice (hitting the thing with a hammer: $5, knowing where to hit it: $4995).

    Yeah, coding is cheaper now, but knowing what to code has always been the more expensive piece. I think AI will be able to help there eventually, but it's not as far along on that vector yet.

    • Possibly even more important than knowing where to hit it (what to code), is knowing where not to hit it (what not to code). Hitting the thing in the wrong place can lead to catastrophe. Making a code change you don't need can blow up production or paint your architecture into a corner.

      AIs so far seem to prefer improvement by addition, not by subtraction or by asking "are you sure?".

      This doesn't mean that "code is cheap" is bad. Rather, it means that soon our primary role will be to guide AIs to produce a high proportion of "code that was cheap", while being able to quickly distinguish, prevent, and reject "cheap code".

  • Partially why I’m surprised there isn’t more focus on coding harnesses that lean towards strong typing / testing / quasi formal verification type paradigms

    If you could funnel it through something like that then the ability to generate vast amounts of code is a lot more commercially useful

    • 100%

      This is what explains the difference in using apps like Claude Code versus almost any other harness/wrapper.

      And the model can be the same, but if the harness sucks then the usefulness of the harness+model tanks.

      It's like harness * model = usefulness.

      • Can’t say I’ve noticed much of a difference switching from CC to opencode?
  • Code is cheap is the same as saying "Buying on credit is easy". Code is a liability, not an asset.
    • Code you can’t just throw away is a liability because you have to keep supporting it / servicing it. Claude Code and friends also change that part of the cost equation:

      You might not get gcc/llvm level optimization from a newly built compiler - but if you had a home-built one, which took a $15,000/month engineer to support (for years!), you can now get a new one for $20,000 every 3 months, for a 50% cost saving, each time changing your requirements (which you couldn’t do before).

      Code used to be a liability, like a car or an apartment for the average person. Now it’s a liability, like a car or apartment for Bill Gates.

    • I would normally agree, but I think the "code is a liability" quote assumes that humans are reading and modifying the code. If AI tools are also reading and modifying their own code, is that still true?
      • You have to be able to express the change you want in natural language. This is not always possible due to ambiguity.

        Next to that, eventually you run into the same issue that we humans run into: no more context windows.

        But we as software engineers have learned to abstract away components, to reduce the cognitive load when writing code. E.g., when you write a file you don't deal with syscalls anymore.

        This is different with AI. It doesn't abstract things away, which means a change you request might make the AI apply a LOT of changes to the same pattern, and this can cause behavior to change in ways you haven't anticipated, haven't tested, or haven't seen yet.

        And because it's so much code to review, it doesn't get the same scrutiny.

      • What happens when there’s a service outage and you cannot debug code without an agent?
        • Switch to a rival service that doesn't have an outage. There are at least half a dozen competent hosted LLM vendors for coding now (Anthropic, OpenAI, Gemini, Mistral, Kimi, MiniMax, Qwen, ...)
        • Like any service outage out of their control, people will find other things to do until it’s over.
          • Afternoon latte and useless meetings won’t do themselves!
    • >Code is a liability, not an asset

      Then "AI" code is even more of a liability.

    • I think you mean to say, "code you don't understand is a liability, not an asset"

      But please correct me if I'm wrong.

      • No I said what I meant. Code is a liability, though to your point, code you don't understand is an even bigger liability.

        Even if I understand all my code, when I go to make changes, if it's 100k lines of code vs 2k lines of code, it's going to take more time and be more error prone.

        Even if I understand all my code, the intern I hired last week won't and I'll have to teach it to them.

        Even if I understand all my code, I don't remember everything all the time, and I can forget about an edge case handled somewhere in thousands of lines of code.

        Even if I understand all my code, I don't understand my co-workers code, and they don't understand mine.

        Even if I understand all my code, I might not want to work for this company the rest of my life.

        • I've worked at so many places in my career that "not understanding code" is not an excuse. It is a skill to be able to read and follow code and get up to speed quickly, even on shit codebases. But "AI" generated code makes that so much more difficult, and the "AI" isn't going to walk you through it, and neither will your new coworkers. We aren't in a race to the bottom with "AI", we're in a speedrun to the bottom, and I don't think it's going to end up going too well for whatever developers are left in a few years.
  • I'm very curious to see how this will affect the job market. All the recent CS grads, all the coding bootcamp graduates - where will they end up? And then there are the medium/senior engineers who would have to switch how they work to oversee the hordes of AI agents that all the hype evangelists are pushing on the industry.

    Not an employee market, that's for sure.

    • >> oversee the hordes of AI agents

      This is the thing I don't really get. I enjoy tinkering with AI and seeing what it comes up with to solve problems. But when I need to write working code that does anything beyond simple CRUD, it's faster for me to write the code than it is to (1) describe the problem in English with sufficient detail and working theory, then (2) check the AI's work, understand what it's written, de-duplicate and dry it out.

      I guess if I skipped step 2, it might save time, but it would be completely irresponsible to put it into production, so that's not an option in any world where I maintain code quality and the trust of my clients.

      Plus, having AI code mixed into my projects also leaves me with an uneasy sense of being less able to diagnose future bugs. Yes, I still know where everything is, but I don't know it as well as if I'd written it myself. So I find myself going back and re-reviewing AI-written code, re-familiarizing myself with it, in order to be sure I still have a full handle on everything.

      To the extent that it may save me time as an engineer, I don't mind using it. But the degree to which the evangelists can peddle it to the management of a company as a replacement for human coders seems highly correlated with whether that company's management understood the value of safe code in the first place. If they didn't, then their infrastructure may have already been garbage, but it will now become increasingly unusable garbage. At some point, I think there will be a backlash when the results in reality can no longer be denied, and engineers who can come in and clean up the mess will be in high demand. But maybe that's just wishful thinking.

      • I'm in the same boat. Too often it feels easier to write the code I want to see myself than to open some AI tool where I would have to describe what I need in plain English. After which I'd still have to review the code to make sure it does do what was requested.

        Perhaps you have to be a certain type of person, or work in a peculiar company where the second step (review) can be ignored as long as the AI says it's fine. Hardcore YOLO life.

    • the top % of talent is still extremely hard to get, perhaps moreso

      saw an article recently where every sector is seeing a reduction in IT/devs except for tech and ai companies

      if your company is in a sector where eng is a cost-center and the product is not directly tied to your engineers / your company is pushing for efficiency it's an employer's market

      • I don't think it makes sense to consider top % of the talent - relative to the total amount of engineers I imagine it would be minuscule.

        It's the rest who will have to deal with shrinking number of positions, higher competition and possibly decrease in compensation due to said competition.

        • Unfortunately agreed, i see swe (already) going the way of finance

          most jobs are okay comp 9-6s with a small group of elite talent making extremely outsized comp

          more emphasis on pedigree, exclusive hiring pipelines from top schools

  • Sponsored by: Teleport — Secure, Govern, and Operate AI at Engineering Scale

    Gee what a surprise.

  • This is such a surface level take, it's embarrassing. Unless he's talking about the impression in society, but he is not.

    Writing code is a social act: even though few actually read the code, they experience the result of that code.

    Maybe, just maybe Simon means "code is disposable now", because some shortcut taker can spin up an apparent duplicate by coaxing and pleading with AI.

    That is not a future worth participating in; that's death by intellectual begging, because it will create an environment of worthless nonsense.

    • Did you read the whole thing or just the headline?

      What did you think of my list of characteristics of "good code"?

      • I re-read to make sure I was not reacting off the cuff. Your list of good code practices is obviously what we want, but that is not going to happen with the way you describe working with AIs.

        You don't seem to be aware of cognitive load; when you say:

        "any time our instinct says "don't build that, it's not worth the time" fire off a prompt anyway, in an asynchronous agent session where the worst that can happen is you check ten minutes later and find that it wasn't worth the tokens."

        That is going to fracture a person's understanding and mentally fatigue them. People, and I see this coming from you too Simon, seem to be treating software development like it's a sprint, when it is absolutely a marathon. One that requires consideration before acting, and with LLMs the common act of figuring it out along the way is now dangerous, because they will happily guide one to create a CMS when all they want is a one-off function.

        • I've written quite a bit about cognitive debt and load recently https://simonwillison.net/tags/cognitive-debt/. It's a very real issue - working with these tools is exhausting if you don't figure out how to pace it, and I've not figured out how to pace it yet.

          I'm currently expecting this will be a temporary thing, brought on by the significant new abilities of the November model releases.

      • What are you thinking about for the new best practices for software engineering?
        • That's the question I'm hoping to answer as I write the rest of this guide, which I expect to take several months: https://simonwillison.net/2026/Feb/23/agentic-engineering-pa...

          I'm not sure anyone has a confident answer to that yet though - I certainly don't.

          • Simon, I'd be interested in a voice conversation with you on this topic. I've developed my own chain-of-thought system that is very different from what others are doing. Mine is a Socratic Agent that leads the developer; it does not write their code, but guides them to become a better developer. It's the complete opposite of the direction the larger industry seems to be going: augmented synergy between the AI and the developer, not the AI coding and the human overseeing.
  • The key is what we consider good code. Simon’s list is excellent, but I’d push back on this point:

    > it does only what’s needed, in a way that both humans and machines can understand now and maintain in the future

    We need to start thinking about what good code is for agents, not just for humans.

    For a lot of the code I’m writing I’m not even “vibe coding” anymore. I’m having an agent vibe code for me, managing a bunch of other agents that do the actual coding. I don’t really want to look at the code, just as I wouldn’t want to look at the output of a C compiler the way my dad did in the late ’80s.

    Over the last few decades we’ve evolved a taste for what good code looks like. I don’t think that taste is fully transferable to the machines that are going to take over the actual writing and maintaining of the code. We probably want to optimize for them.

  • I think there's a good parallel with AI images - generating pictures has gotten ridiculously easy and simple, yet producing art that is meaningful or wanted by anyone has gotten only mildly easier.

    Despite the explosion of AI art, the amount of meaningful art in the world has increased only by a tiny amount.

    • But the amount of pleasing, useful art has gone up 1000x. If I had a blog, I would now have access to art that would be a perfect fit for my words, whereas 5 years ago I would have had to make do with my own (talent-lacking) doodles.

      Would some people prefer no art/illustration to AI generated art? Sure. But even more would prefer no art to my doodles.

  • Software is rarely an end unto itself.

    Thus, "code" is a liability; producing excess liabilities 'cheaply' is still a loss.

    You only ever want to have just enough code to accomplish the task at hand.

    LLMs may help you get to just enough faster, but you'll only know that you are there after doing the second 90%.

  • This fact is opening the floodgates to low-end products, which are somehow better than nothing but embarrassing to use.
    • True, however as these products have been designed and coded by LLMs from the ground up in 2025+, they are generally using modern (typed even) languages, the latest version of third party libraries, usually have documentation of sorts... sometimes they even have test suites.

      As such, they can often be improved as easily as one can prompt, which is much faster and easier than before. Notably in the FOSS world, where one used to have to ask the maintainer, get ghosted for a year, and then have them come back with a "close: wontfix (too tedious)".

      • I've tried very earnestly to use Opus 4.5 to get rid of some backlog tasks that were too tedious to do manually. It turns out they're still extremely tedious, because I have to make every single non-trivial decision for the model, unless I don't care one iota about the long-term sustainability of the code base. And by long term, I mean more than a week. They're good for saving keystrokes or doing fuzzy searches for me. "Design"? No, that is an anthropomorphism.
      • Better languages do not necessarily mean better architectural decisions, or even better performance, unless the humans pressure for that and burn tokens on that. With no engineer in the room, more technical issues will be left unnoticed and unaddressed.

        Compare it to visual arts. With guidance from an artist, AI tools can help create wonderful pictures. Without such guidance, or at least expert prompting, a typical one-shot image from Gemini is... well, at best recognizable as such.

  • As someone who uses AI coding agents daily, the thing that still surprises me is how much faster I can iterate on the "problem space" vs the "solution space." I can try 5 different approaches to a problem in the time it used to take to implement one. The code itself is cheap now, but the expensive part—figuring out what to build—hasn't changed. If anything, being able to quickly test ideas makes it more important to be clear on what you're actually trying to solve.
  • Writing code is cheap.

    Owning code is getting more and more expensive.

    SWEs sacrificed their jobs so that SREs could have unlimited job security.

  • > Code has always been expensive. Producing a few hundred lines of clean, tested code takes most software developers a full day or more. Many of our engineering habits, at both the macro and micro level, are built around this core constraint.

    > At the macro level we spend a great deal of time designing, estimating and planning out projects, to ensure that our expensive coding time is spent as efficiently as possible. Product feature ideas are evaluated in terms of how much value they can provide in exchange for that time - a feature needs to earn its development costs many times over to be worthwhile!

    Maybe I am spending my life working at the wrong corporations (not FAANG/direct tech related), but that doesn't match my experience at all. The `design` phase was reduced to something more akin to a sketch in order to iterate on products faster. Obviously now, as you create and debate over more iterations, the time spent writing code increases (as you build more stuff that is discarded). What is that discarded time used for? Well, it's the way new people learn the system/business domain. It's how we build the knowledge to support the product in production. It's how the business learns what the limits/features are, why they are there, what they can offer, what they must ask the regulators, etc.

    Realistically, if you only count the time required to develop the feature as described, it's basically nothing. Most of the time is spent on edge cases that are not written down anywhere. You start coding something and 15 minutes in you discover 5-10 cases not handled in any way. You ask business people, they ask other people. You start checking regulation docs/examples, etc. Maybe there are no docs available, so you just push a version and test whether your assumptions are correct (most likely not... so go again and again). At the end of this process everyone gains a better understanding of how the business works, why, and what you can further improve.

    Can AI speedrun this? Sure, but then how will all the people around gain the knowledge required to advance things? We learn through trial and error. Previously this was a shared experience for everyone in the business, now it becomes more and more a solitary experience of just speaking with AI.

  • > We need to build new habits

    In my case, testing and documentation becomes even more important.

    I’m currently rewriting a server backend that was originally written as a “general-purpose” server, and is very complex. It works extremely well, is robust and secure, but way overkill for my current application.

    I'm using an LLM to write a lot of the code. The LLM-written code is quite verbose, and I'm having to learn to just accept that, as it also works well. For a while, I would go in and rewrite the code, but I'm learning to stop doing that. If there's a problem, even an obvious one, I'm learning to ask the LLM to fix it instead of going in and doing it myself.

    Right now, I am writing a headerdoc for the server that is going to be hundreds of lines long. It is a detailed, pedantic description of the API and internal structure of the application.

    Its primary audience is LLMs. I need to make sure that future analysis understands exactly why the server does what it does, as well as what it does. The current server is a first step in a (probably years-long) process of migration away from the original server design.

    It does seem to be coming along well.

  • > Here's what I mean by "good code": [...]

    What a fantastic list. I'll be saving it to show the junior developers.

    My only nitpick is that "reliability" should have been a point by itself. All the other "ilities" can be appropriately sacrificed in some context, but I've never seen unreliable software being praised for its code quality.

    Which is part of why LLMs are so frustrating. They're extremely useful and extremely unreliable.

  • ● When everyone has access to the same models, and those models can produce working software quickly and cheaply, the code itself stops being a moat. A reasonably skilled person can now build what used to take a team of engineers months to ship.

    ● A competitor can reach feature parity in days. The thing that used to protect a business, the sheer effort required to build it, is mostly gone.

    ● Think of it like oil. If every property owner had a well that was cheap to operate, the price of oil would fall toward the price of water. The resource is abundant, the extraction is easy, and the margin disappears. Software features are heading in the same direction.

    https://designexplained.substack.com/p/the-moat-has-moved

  • > It’s simple and minimal

    This. All the LLM code I've seen so far has had lots of abstraction, to the point that it's hard to maintain.

    It is testable, for sure, but the complexity cost is so high.

    Something else not addressed in the article is working within enterprise environments, where new technologies are adopted at a much slower pace compared to startups. LLMs come up with strange and complicated patterns to solve these problems, which is understandable, as I would imagine all the training and tuning followed structured frameworks.

    • Because it has been trained on Java class spaghetti.

      When it’s trained on enough APL/K code, you’ll get minimal abstraction.

  • When did we ever measure the value of code by quantity, not quality? The author is misguided.
    • Did you read my list of characteristics of "good code" further down the article?

      That's all about quality, not quantity.

  • > "Code has always been expensive. Producing a few hundred lines of clean, tested code takes most software developers a full day or more. Many of our engineering habits, at both the macro and micro level, are built around this core constraint."

    Well, yes and no. While producing two screens' worth of high quality code, a.k.a. software engineering, was always expensive, "coding" as such, as in producing the Nth React SPA or merely typing out code that you engineered in your head, was never that expensive - most of the work is applying existing patterns. But either way, as you wrote the code yourself, you mostly had a consistent mental model of how your code should work, and the key contribution was evolving this model, first in your head, then in typing out the code. Now here comes the real problem for the LLMs: I think most of us would be fine if the LLMs could actually just type out the code for us after we engineered it in our heads and explained it to the LLM in the English language. Alas, they do produce some sort of code, but not always, or often enough not in the way we described it. So unfortunately "AI" boosters like Simon are reverting back to the argument of fast code generation and an appeal to us as unwilling adopters to "change our ways", which shows they have no real advantage of the LLMs to put forward - it's only ever the "coding" speed and an appeal to us as professionals to "adapt", i.e. serve as cleaners of LLM sh*t. Where is the superintelligence we were promised, the single-person billion-dollar unicorns, the unique use cases, etc.? Are you telling us again these are just advanced text generators, Simon?

    • > I think most of us would be fine if the LLMs could actually just type out the code for us after we engineered it in our heads and explained it to the LLM in the English language. Alas, they do produce some sort of code, but not always, or often enough not in the way we described it.

      That's exactly what they do for me - especially since the November model releases (GPT-5.1, Opus 4.5).

      > Where is the superintelligence we were promised and single-person-billion dollar unicorns, unique use cases etc? Are you telling us again these are just advanced text generators, Simon?

      I never promised anyone a superintelligence, or single-person-billion dollar unicorns.

      I do think these things are just advanced text generators, albeit the word "just" is doing a whole lot of work in that sentence.

      • > That's exactly what they do for me - especially since the November model releases (GPT-5.1, Opus 4.5).

        I mean, it's inherently impossible, given the statistical nature of LLMs, so I am not sure whether you are claiming this out of ignorance or other interests, but again, what you claim is impossible due to the very nature of LLMs.

        • It's impossible for human developers too. Natural language descriptions of a program are either much more painful and time-consuming to write than the code they describe, or contain some degree of ambiguity which the thing translating the description into code has to resolve (and the probability that the resolution the entity writing the description would have chosen and the one the entity translating it into code chose match perfectly approaches zero).

          It can make sense to trade off some amount of control for productivity, but the tradeoff is inherent as soon as a project moves beyond a single developer hand writing all of the code.

          • I agree - the whole BS of "Hottest new programming language is English" is complete nonsense. There is something about writing the code directly from your mind that skips over the "language circuits" and makes it much more precise. Perhaps as humans with education we obtain an ability to think in programming language itself I suppose? It's probably similar to what happens in the mind of a composer or painter. This is why the natural language will never be the interface the big "AI" companies are making it to be.
        • It's impossible, and yet I experience it on a daily basis.

          YMMV. I've had a lot of practice at prompting.

          • What you've experienced is different from what was originally mentioned, though. Even with the best human developers, you can't provide a normal natural language prompt and get back the exact code you would have written, because natural language has ambiguities, and the probability that the other person (or LLM) will resolve all of them exactly as you would approaches zero.

            Collaborating with someone/something else via natural language in a programming project inherently trades control for productivity (or the promise of it). That tradeoff can be worth it depending on how much productivity you gain and how competent the collaborator is, but it can't be avoided.

          • > YMMV. I've had a lot of practice at prompting.

            Ah, the old "you suck at prompting" angle again, isn't it? If you're going to shill this hard, at least come up with something new and original, this is sounding more than desperate.

            • Most people suck at playing the piano. Most people suck at prompting coding agents. If you practice either of those things you'll get better at them.

              I really don't understand the "stop telling me I'm holding it wrong" argument. You probably are holding it wrong!

              Is this born out of some weird belief that "AI" is meant to be science fiction technology that you don't ever need to learn how to use?

              That would help explain why conversations like this are full of people who claim to get great results and other people who say every time they've tried it the results have been terrible.

              • > I really don't understand the "stop telling me I'm holding it wrong" argument. You probably are holding it wrong!

                I can't speak for others, but from my end it really seems like there's no actual way to detect whether someone is holding it right or wrong until after the implications for LLMs are known. If someone is enthusiastic about LLMs, we don't see claims that they're holding it wrong. It's only if an LLM project fails, or someone tries them and concludes they don't work as well as proponents say, that the accusations come out, even if the person in question had been using these tools for a long time and had previously been a supporter. This makes it seem like "holding it wrong" is a post hoc justification for ignoring evidence that would tend to contradict the pro-LLM narrative, not a measurable fact about someone's LLM usage.

              • > Most people suck at playing the piano. Most people suck at prompting coding agents. If you practice either of those things you'll get better at them.

                It would be funny if by now I weren't convinced you are pushing these false analogies on purpose. The key difference between a piano and LLMs is that the piano will produce the same sounds in response to the same sequence of keys. Every single time. A piano is deterministic. The LLMs are not, and you know it, which makes your constant comparison of deterministic with non-deterministic tools sound a bit dishonest. So please stop using these very weak analogies.

                > I really don't understand the "stop telling me I'm holding it wrong" argument. You probably are holding it wrong!

                Right, another weak argument. Writing English language paragraphs is not the science you seem to imply it is. You're not the only person using LLMs intensively over the last few years, and it's not like there's some huge secret to using them - after all, they use natural language as their primary interface. But that's besides the point. We're not discussing whether they are hard or easy to use or whatever. We are discussing whether I should replace the magnificent supercomputer already placed in my head by mother nature or God or Aliens or whatever you believe in, for a very shitty, downgraded version 0.0.1 of it sitting in someone's datacenter, all for the sake of sometimes cutting some corners by getting that quick awk/sed oneliner or some boilerplate code? I don't think that's a worthy tradeoff, especially when the relevant reports indicate an objective slowdown, which probably also explains the so-called LLM-fatigue.

                > Is this born out of some weird belief that "AI" is meant to be science fiction technology that you don't ever need to learn how to use?

                No, actually it is born out of the weird belief which your sponsors have been either explicitly or implicitly promoting, now for the 4th year, in various intensities and frequencies, that the LLM technology will be equal to a "country of PhDs in a datacenter". All of this based on the super weird transhumanist ideology a lot of the people directly or indirectly sponsoring your writing actively believe in. And whether you like it or not, even if you have never implied the same, you have been a useful helper by providing a more "rational" sounding voice, commenting on the supposed incremental improvements and progress and what not.

                • Fine, if you don't like the piano analogy:

                  Most people suck at falconry. If you practice at falconry you'll get better at it.

                  Falcons certainly aren't deterministic.

                  > it's not like there this huge secret to using them - after all they use natural language as their primary interface

                  That's what makes them hard to use! A programming language has like ~30 keywords and does what you tell it to do. An LLM accepts input in 100+ human languages and, as you've already pointed out many times, responds in non-deterministic ways. That makes figuring out how to use them effectively really difficult.

                  > We are discussing if I should replace the magnificent supercomputer already placed in my head by mother nature or God or Aliens or whatever you believe in, for a very shitty, downgraded version 0.0.1 of it sitting in someone's datacenter

                  We really aren't. I consistently argue for LLMs as tools that augment and amplify human expertise, not as tools that replace it.

                  I never repeat the "country of PhDs" stuff because I think it's over-hyped nonsense. I talk about what LLMs can actually do.

                  • > Falcons certainly aren't deterministic.

                    Well, falcons are not deterministic and are trained to do something in the art of falconry, yes. Still, I fail to see the analogy here, as the falcon gets trained to execute a few specific tasks triggered by specific commands. Much like a dog. The human more or less needs to remember those few commands. We don't teach dogs and falcons to do everything, do we? Although we do teach specific dogs to do specific tasks in various domains. But no one ever claimed Fido was superintelligent and that we needed to figure him out better.

                    > That's what makes them hard to use! A programming language has like ~30 keywords and does what you tell it to do. An LLM accepts input in 100+ human languages and, as you've already pointed out many times, responds in non-deterministic ways. That makes figuring out how to use them effectively really difficult.

                    Well, yes and no. The problem with figuring out how to use them (LLMs) effectively is exactly caused by their inherent unpredictability, which is a feature of their architecture, further exacerbated by whatever datasets they were trained on. And so, since we have no f*ing clue as to what the glorified slot machines might pop out next, and it is not even certain, as recently measured, that they make us more productive, the logical question is: why should we, as you propose in your latest blog, bend our minds to try and "figure them out"? If they are unpredictable, that means effectively that we do not control them, so what good is our effort in "figuring them out"? How can you figure out a slot machine? And why the hell should we use it for anything else other than a shittier replacement for pre-2019 Google? In this state they are neither augmentation nor amplification. They are a drag on productivity and it shows, hint - AWS December outage. How is that amplifying anything other than toil and work for the humans?

                    • I've found that using LLMs has had a very material effect on my productivity as a software developer. I write about them to help other people understand how I'm getting such great results and that this is a learnable skill that they can pick up.

                      I know about the METR paper that says people over-estimate the productivity gains. Taking that into account, I am still 100% certain that the productivity gains I'm seeing are real.

                      The other day I knocked out a custom macOS app for presenting web-pages-as-slides in Swift UI in 40 minutes, complete with a Tailscale-backed remote presenter control interface I could run from my phone. I've never touched Swift before. Nobody on earth will convince me that I could have done that without assistance from an LLM.

                      (And I'm sure you could say that's a bad example and a toy, but I've got several hundred more like that, many of which are useful, robust software I run in production.)

  • If coding is so cheap, I hope people start vibing Rust. If the machine can do the work, please have it output in a performant language. I do not need more JS/Python utilities that require embarrassing amounts of RAM.
    • I’ve used Claude to write custom firmware in Rust for an ESP32-driven desktop clock.

      Turned it into a Stripe revenue dashboard and notifier.

      Even bought a couple more, flashed them, and gave to my cofounders, complete with AI written (personally tested, though) setup instructions!

    • It's already happening, particularly with "Ladybird Browser adopts Rust" [0] being at the top of HN today. It's now feasible to quickly iterate on a system's design with a dynamic language like Python, and then, once you're happy with the design, have AI rewrite it into something like Rust or Zig. I can even foresee a future where we intentionally maintain two parallel implementations, with machine-defined translation between them, such that we're able to do massive changes on the higher level implementation in minutes, and then once we finish iterating, have it run overnight to reimplement (or rewrite) it in the performant language. A bit like the difference between an unoptimized debugging version of a project and the highly optimized one, but on steroids.

      [0] https://news.ycombinator.com/item?id=47120899

    • Worth reiterating due to the skyrocketing costs of RAM.
    • The sad reality is it will likely be the older languages (I tend to see Ruby vibed a lot) just because there is so much more to train on.
    • With a bit of AI sprinkled in, Rust code can surely also waste gigabytes of RAM on "Hello World" ;)
  • I like using the analogy of 'living in a small apartment' when building systems with a small team. You need to choose carefully what furniture you can fit into your apartment, and that choice depends a lot on how you live your life. Do you want a large table to host friends, or a comfortable couch to fall asleep on in front of the TV? If you get both, the space will probably be cluttered.

    The same applies to a small software project - you need to choose what features you can fit. And while the cost of building is part of the consideration, I'd say most of it is about the cost of maintaining features, not only in code, but also in product coherence and other incidental 'costs' like documentation and user support.

    Be careful of building too many features and ending up being overwhelmed by the maintenance, or worse, diluting the product's value to the point where you lose users.

    • Are you familiar with the cathedral vs the bazaar?
      • To some extent, although I've never read the actual text. Care to elaborate? I don't want to infer things on your behalf.
        • Curious if your mental model is similar or how you see them being different.
  • > Here's what I mean by "good code":

    > [...]

    > - It’s simple and minimal - it does only what’s needed, in a way that both humans and machines can understand now and maintain in the future.

    But do the humans need to actually understand the code? A "yes" means the bottleneck is understanding (code review, code inspection). A "no" means you can go faster, but at some risk.

    • OpenAI is implying that code may no longer be human readable in some circumstances.

      > The resulting code does not always match human stylistic preferences, and that’s okay. As long as the output is correct, maintainable, and legible *to future agent runs*, it meets the bar.

      https://openai.com/index/harness-engineering/

    • > But do the humans need to actually understand the code? A "yes" means the bottleneck is understanding (code review, code inspection). A "no" means you can go faster, but at some risk.

      I always thought of things like code reviews as semi pseudo-science in most cases. I've sat through meetings where developers obviously understood the code they were reviewing, but didn't understand anything about the system as a whole. If your perfect function pulls in 800 external dependencies that you trust, trust only because it's too much of a hassle to go through them, then I'd argue that in this situation you don't understand your code at all. I don't think it matters, and I certainly don't think I'm better than anyone else in this regard. I only know how things work when it matters.

      If anything, I think AI will increase human understanding without the need to write computer unfriendly code like "Clean Code", "DRY" and so on.

      • Code reviews are pseudo-science now? Computer-unfriendly code? What are you talking about? Do you understand that this babble makes zero sense? Are you one of those product managers who recently learned to vibe-code? If so, make sure your latest Replit project doesn't delete your production database..
      • > If anything, I think AI will increase human understanding

        How?

  • This is the first "chapter" in a not-quite-book I've started working on - I have an introductory post about that here: https://simonwillison.net/2026/Feb/23/agentic-engineering-pa...

    The second chapter is more of a classic pattern, it describes how saying "Use red/green TDD" is a shortcut for kicking the coding agent into test-first development mode which tends to get really good results: https://simonwillison.net/guides/agentic-engineering-pattern...

    • I believe the ChatGPT code has a bug, in that it accepts three spaces or tabs before a code fence, while the Google Markdown spec says up to three spaces, and does not allow a tab there.

      I also see that the tests generated by ChatGPT are far too few for the code features implemented. They cannot be the result of actual red/green TDD, where the test comes before the feature is added.

      For example: 1) the code allows "~~~" but only tests for "```"; 2) there are no tests for when len(fence) < fence_len, nor for when len(fence) > fence_len; and 3) there are no tests for leading spaces.
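      Those gaps are easy to pin down with a handful of asserts. A minimal sketch, written against a toy fence matcher of my own for illustration (not the generated code itself):

```python
import re

# Toy fence matcher: up to three leading spaces (no tabs), then a run of
# at least three backticks or tildes. An illustrative sketch only.
FENCE_RE = re.compile(r"^ {0,3}(`{3,}|~{3,})")

def open_fence(line):
    """Return the fence string if the line opens a code fence, else None."""
    m = FENCE_RE.match(line)
    return m.group(1) if m else None

def closes(fence, line):
    """A closing fence must use the same character and be at least as long."""
    m = FENCE_RE.match(line)
    return bool(m) and m.group(1)[0] == fence[0] and len(m.group(1)) >= len(fence)

# 1) tilde fences, not just backticks
assert open_fence("~~~") == "~~~"
# 2) closing fence shorter / longer than the opening one
assert not closes("````", "```")   # shorter run: does not close
assert closes("```", "`````")      # longer run: does close
assert not closes("```", "~~~")    # different character: does not close
# 3) leading spaces: up to three allowed, four or a tab are not
assert open_fence("   ```") == "```"
assert open_fence("    ```") is None
assert open_fence("\t```") is None
```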

      There's also duplicate code. The function _strip_closing_hashes is used once, in the line:

        text = _strip_closing_hashes(m.group("text")).strip()
      
      The function is:

        def _strip_closing_hashes(s: str) -> str:
            s = s.rstrip()
            # remove trailing " ###" style closers
            s = re.sub(r"[ \t]+#+\s*$", "", s).rstrip()
            return s
      
      The ".rstrip()" is unneeded as the ".strip()" does both lstrip and rstrip.

      I think that rstrip() should be replaced with a strip(), the function renamed to "_get_inline_content", and used as "text = _get_inline_content(m.group("text"))".

      Also, the Google spec also says "A sequence of # characters with anything but spaces following it is not a closing sequence, but counts as part of the contents of the heading:" so is it really correct to use "\s*" in that regex, instead of "[ ]*"? And does it matter, since the input was rstrip'ped already?
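      As far as I can tell it can't matter: after the initial rstrip() the string ends in a non-whitespace character, so the \s* can only ever match the empty string. A quick sanity check (my own sketch, assuming the regex only ever sees rstrip'ped input):

```python
import re

with_ws = re.compile(r"[ \t]+#+\s*$")   # the pattern in _strip_closing_hashes
without = re.compile(r"[ \t]+#+$")      # same pattern with the \s* dropped

# On rstrip'ped input the two substitutions always agree.
for s in ["Heading ###", "Heading###", "Heading # not a closer x", "###"]:
    s = s.rstrip()
    assert with_ws.sub("", s) == without.sub("", s)
```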

      So perhaps:

        def _get_inline_content(s: str) -> str:
            s = s.rstrip(" ") # remove trailing spaces
            s = s.rstrip("#") # removing "#" style closers
            return s.strip() # remove leading and trailing whitespace
      
      would be more correct, readable, and maintainable?
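      One difference worth checking before calling it more correct, though: rstrip("#") also removes hash runs that are not preceded by a space, which the regex (and the spec passage quoted above) treats as heading content. A quick comparison, sketched under that reading of the spec:

```python
import re

def regex_version(s):
    """The original _strip_closing_hashes, plus the caller's .strip()."""
    s = s.rstrip()
    s = re.sub(r"[ \t]+#+\s*$", "", s).rstrip()
    return s.strip()

def rstrip_version(s):
    """The proposed _get_inline_content."""
    s = s.rstrip(" ")  # remove trailing spaces
    s = s.rstrip("#")  # remove "#" style closers
    return s.strip()

# Agreement on the common case: a space-separated closing run.
assert regex_version("Heading ###") == rstrip_version("Heading ###") == "Heading"
# Disagreement when the hashes are glued to the text:
assert regex_version("Heading###") == "Heading###"  # kept: not a closer per spec
assert rstrip_version("Heading###") == "Heading"    # stripped anyway
```

      So the rstrip version is simpler but slightly looser; whether that matters depends on how strictly the spec should be followed.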
      • 100%. That's why if you want good code you need to pay attention to what it's writing and testing and throw feedback like that at it.
        • My points though are

          1) the development isn't actually using red/green TDD, and

          2) the result doesn't show "really good results", including not following a very well-defined specification

          so doesn't work as a concrete example of your description of what the second chapter is supposed to be about.

          Perhaps you could show the process of refining it more, so it actually is spec compliant and tests all the implemented features?

          What's the outcome difference between this approach vs. something which isn't TDD, like test-after with full branch coverage or mutation testing? Those at least are more automatable than manual inspection, so a better fit for agentic coding, yes?

          (Of course regular branch coverage doesn't test all the regexp branches, which makes regexp use tricky to test.)

          • Yeah I'm going to ditch those examples and find better ones. I was hoping to illustrate the idea as simply as possible but they're not up to scratch.
            • I think the problem-to-solve is a good one. The Google Markdown spec is very clear, with plenty of examples, and I think the problem is well-defined.

              I've seen entirely too many examples of how to use TDD which give under-specified toy problems, where the solution is annoyingly incomplete for something more realistic.

              And I've seen TDD projects which didn't follow the spec, but instead implemented the developers' misconceptions about the spec.

              That's exactly what we see here with Markdown, where there's a spec, along with a lot of non-conformant examples in the training set by people who didn't read the spec but instead based it on their experiences in using Markdown.

              The code generated by ChatGPT is almost correct. Seeing the process of how to get from that to a valid and well-tested solution would make for a good demonstration of the full process.

              I'll again add that showing how to integrate something like branch coverage or hypothesis testing for automatic test suite generation would be really useful.

  • The cost has always been the sum of:

    1. The time spent to think and iteratively understand what you want to build
    2. The time spent to spell out how you want to build it

    The cost for #2 is nearly zero now. The cost for #1 too is slashed substantially because instead of thinking in abstract terms or writing tests you can build a version of the thing and then ground your reasoning in that implementation and iterate until you attain the right functionality.

    However, once that thing is complex enough you still need to burn time on identifying the boundaries of the various components and their interplay. There is no gain from building "a browser" and then iterating on the whole thing until it becomes "the browser". You'll be up against combinatorial complexity. You can perhaps deal with that complexity if you have a way to validate every tiny detail, which some are doing very well in porting software for example.

  • I agree, writing code is cheaper than ever ... but "writing the code" isn't the main challenge in SWE since decades.

    Our IT infrastructures & applications run on a stack that has grown for the past 40 years and what we are currently doing with all this LLM & vibecode mania is adding more stuff (at an accelerated rate) on the top of the stack.

    The main challenge today is combining the user requirements with the stack in a way that works, is easy to maintain, and doesn't cost too much.

  • I am pushing for a judicious Eval [1] Driven Development, where tech and non-tech users code with intent: minimally design the known aspects, build simply, and use common standards across the team (which should be in the agent's context) to produce the minimal amount of human-readable, clean code that does the job. The simpler the building blocks are, the easier they are to validate, and the more they rely on proper unit tests, integration tests and data tests, the better the chance that things can be "one-shotted".

    One huge barrier is fighting entropy. You should be wary of prototypes which create false expectations and don't help product evolution whereas tracer bullets [2] might be better if you want to quickly show something and adjust.

    Testing and testability are concepts that aren't intuitive or easy until you develop a feel for them, so we should be preaching feeling that pain, moving slowly and with intent, and working minimally [3] when you actually want to share or maintain your coding artifact. There should be no difference between judicious human and computer code. Don't suddenly start putting "what" instead of "why" in comments, or repeating everything.

    Helping non tech people become builders or sharers is a challenge beyond "vibe coding" and the agent skills [4] space is fascinating for that. Like most things AI (LLM), UX matters more than almost anything else.

    [1] https://ai-evals.io

    [2] concept from the Pragmatic Programmer, https://www.aihero.dev/tracer-bullets

    [3] https://alexhans.github.io/posts/series/evals/measure-first-...

    [4] https://alexhans.github.io/posts/series/evals/building-agent...

  • Writing code has always been cheap. Deciding what the logic should be, and being able to change course was the hard bit.
    • But that's the thing - changing course is suddenly no longer hard. We've already reached a state where I can have AI generate a decent set of tests from an existing codebase (or better yet, I'd already have them ahead of time), and to then do a massive refactoring or even a full rewrite while I get a good night's sleep. There is nothing "has always been" about this.
  • Yes writing code is easier than ever, my problem is that understanding it still costs the same if not more [0]. I get that when people use agents, understanding code is not the concern because it's not exactly catering to people, it's for other agents. But when maintaining applications that have been running for years now, I still believe we need to fully understand code before we commit.

    [0]: https://idiallo.com/blog/writing-code-is-easy-reading-is-har...

  • Agentic coding is bringing new people to coding. But instead of reading some books about coding or looking at its history, they face the same problems as before, have the same struggles, and re-invent the same solutions.

    I am waiting for the vibe coding expert posts that will tell us that lines of code are not a good measure but a liability, and that you should instruct your agent to write less code ...

  • > Good code still has a cost

    > Delivering new code has dropped in price to almost free... but delivering good code remains significantly more expensive than that.

    Writing code was always cheap to start with. Just outsource it to the lowest bidder. Writing good code remains as expensive.

    The same when programmers from different languages are considered. How many Scala/Haskell engineers can I find compared to Java is not the question. It is about how many good engineers you can hire. With Haskell that pool is definitely denser.

  • One of the biggest challenges right now in my opinion is disambiguating what processes _were_ necessary from those that are _still_ necessary and useful in light of exactly this.
    • Precisely, especially because habits have been bound up in the high (and difficult to measure!) cost of code. We got precious about it, really.
    • Not necessary: stand up meetings.
      • If your code output has been devalued then the strategy of lowering your input as a person might not be the best approach.
      • Stand up is where I find out what code I’ll have to fix 6 months from now after my team member finishes ignoring all my advice. Useful to me.
  • Who's going to maintain all the cheap code and review it? Is the "software janitor" job title becoming reality now?
  • AI agents are like outsourcing to a bad offshore team - yeah, they can build, and maybe cheaply, but it requires lots of hand-holding.
    • it's going to replace outsourcing to offshore teams pretty robustly though, I suspect
  • I see a lot of comments downplaying the significance of this, but other than very large and/or mission-critical infrastructure roles, your "taste and experience" is going to become cheap just like code.

    Currently there is this notion that white collar workers and artists still have which is that they bring "taste" too to the experience but eventually AI will come for those as well, may or may not be LLM, and not sure about timelines.

    Even as we speak, when I read through HN comments, I always ask: "Did an AI write this?", or did someone use AI to help write their response? This goes beyond HN - any photo or drawing or music I encounter now prompts the same question - but eventually nobody will care, because we are climbing out of the uncanny valley very quickly.

    • Exactly. I feel like the latest models are basically a couple MCP servers away from just doing the whole thing. You just say, here’s what I want the system to do, and it’ll just do everything. No knowledge required. You need only know how to ask.
  • > any time our instinct says "don't build that, it's not worth the time" fire off a prompt anyway

    I disagree with this sentiment. This approach leads down to huge maintenance burden.

    • In context, this bit is about how to deal with the fact that our intuitions on what's worth it vs not worth it in terms of the time it takes to build are likely out-of-date with these new tools:

      > For now I think the best we can do is to second guess ourselves: any time our instinct says "don't build that, it's not worth the time" fire off a prompt anyway, in an asynchronous agent session where the worst that can happen is you check ten minutes later and find that it wasn't worth the tokens.

      This shouldn't lead to huge maintenance burden because most of the time you'll throw away the result. It's a learning exercise.

  • Putting text into a file is cheaper than before. Everything else remains the same cost in a well designed project, rather than a vibe coded one where you just tell the LLM to "make a todo list app"
  • The rule of good fast cheap still applies the same as always, but business leaders consistently choose to ignore this reality and insist upon fast and cheap without acknowledging that it will come at the cost of good.

    What's worse is that these decisions are usually made on a short-term, quarterly basis. They never consider that slowing down today might save us time and money in the long term. Better code means fewer bugs and faster bug-fixes. LLMs only exacerbate the business leader's worst tendencies.

  • I like the idea that we will always need Pilots.

    We have autopilot, and I'm sure if we tried we could automate take-off and landing of commercial flights.

    But we will keep pilots on planes long after they are needed.

    • The Airbus A320neo can already takeoff, ascend, cruise, descend, and land all by autopilot. It can even download your flight plan from the airline's servers.

      But you still need the pilots because the system can only handle the happy path. As soon as there's any blockade or strong weather change, the autopilot will just turn off. And then you need the pilots.

      I would say software engineering with AI is similar: The AI can handle CRUD just fine. But once things get messy, you need someone who can actually think.

    • To fly a plane with 300+ passengers you still only need 2-3 pilots. That has remained consistent since the invention of autopilot. While we might still need a few human engineer experts, maybe we only need a few for small to medium sized companies? That may not eliminate the career for the top % but it effectively does for the vast majority of engineers.
    • We do automate lots about flying, not just take-off and landing. It's why a 4-engine aircraft in the 1960s required flight crews of 6-8 people just to fly the thing when they can be routinely flown with 2-3 today.
    • Autolands absolutely do exist.
  • If writing code is cheap now why is there so much money involved?
  • All writing is cheap now, but good writing is not cheap.

    Same for images, same for videos, probably it's already the same for movies.

  • Let me get this straight: so basically, with the chatbots, everyone is now running faster than ever. Every business thinks they can get an advantage by increasing their velocity relative to their competition. But the competition is doing the same, so what is the outcome?

    It feels like businesses are just going to speedrun their lifecycles faster than ever and useful idiots are doing the work of 10 people while getting paid for 1. The asset owning class obviously win as they can squeeze more profits from smaller amount of workers on the short term.

    It's like everyone going to a Rammstein concert where the people in the seats start standing in order to see better. This forces others to stand too; the end result is that everyone is standing, everyone is worse off, and nobody sees any better.

  • But writing good code is still not cheap.
    • See my heading half way down the page!
  • Writing code is 20-100 USD per month now
  • it's funny how we're back to measuring lines of code as the sole indicator for cost/quality etc
    • See my list of characteristics of "good code" half way down the article for my thoughts on quality that go way beyond lines of code produced.
  • Simon gets a lot of... negative feedback, but this is a good description of the current state, and I think he does a good job at distilling and expressing where things are currently at. With all the hate and hype for AI everywhere, this is a good thing.
  • "Writing" code is cheap, but this just scratches the surface. It's a completely different paradigm. All forms of digital generation are cheap and on the verge of being fully automated, which comes with self-recursion loops.

    Automated intelligence is now cheap....

  • LLM's have made code cheap in the same way McDonalds has made eating out at a restaurant cheap.
    • I wonder if they will make code fast in the way that McDonalds made food fast. For many business needs, knowing when a project will finish would be equally or even more valuable than knowing that it will contain more code or employ fewer programmers.
  • Time for schools to stop pushing coding skills onto the poor kids.
  • the interesting shift is where the time goes. before: thinking + typing. now: thinking + reviewing. the thinking part didn't get cheaper -- domain knowledge, edge cases, integration constraints -- none of that is free. what changed is you now review AI output instead of type your own, which is genuinely faster but not as different as it sounds. the hard part was always understanding what to build, not the keystrokes.
    • > the interesting shift is where the time goes. before: thinking + typing. now: thinking + reviewing.

      It's widely accepted that you can't learn just by reading, you have to write. So only thinking and reviewing is a great way to lose all the business domain knowledge.

      > the thinking part didn't get cheaper -- domain knowledge, edge cases, integration constraints -- none of that is free. what changed is you now review AI output instead of type your own, which is genuinely faster but not as different as it sounds

      It's very different - you lose business domain knowledge if you're only reading.

  • you'd think this guy would get tired of writing about AI all day
  • (I wrote this mainly from the perspective of what it feels like on the inside to write code as a human. I think https://news.ycombinator.com/item?id=47138965 explains it better in terms of the business aspect.)

    > Code has always been expensive. Producing a few hundred lines of clean, tested code takes most software developers a full day or more. Many of our engineering habits, at both the macro and micro level, are built around this core constraint.

    > At the macro level we spend a great deal of time designing, estimating and planning out projects, to ensure that our expensive coding time is spent as efficiently as possible. Product feature ideas are evaluated in terms of how much value they can provide in exchange for that time - a feature needs to earn its development costs many times over to be worthwhile!

    This doesn't seem quite right.

    "Producing" code involves a lot more than just typing it. It's slow because the dev is planning things out at the same time, at a micro level. Suppose "a few hundred lines" is 10kB of text; actually typing that out at full speed is maybe half an hour of work for a reasonably accomplished typist. But basically nobody even writes blog posts that fast. As soon as you're writing more than a sentence or two and actually care about how you come across (never mind having to satisfy a compiler) you're invoking system 2 thinking. (I didn't even consider tab completion in my napkin math; it wouldn't really matter, because you have to already be thinking to get value out of it!)
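    The napkin math holds up if you spell it out (the 90 WPM rate and the five-characters-per-word convention are assumptions, not measurements):

```python
# Sanity-check the typing-time estimate above: how long would it take
# just to type ~10 kB of code, with zero thinking?
chars = 10_000        # "a few hundred lines" of code, roughly 10 kB
wpm = 90              # assumed rate for a reasonably accomplished typist
chars_per_word = 5    # the conventional definition of a "word" in WPM
minutes = chars / (wpm * chars_per_word)
print(round(minutes))  # ~22 minutes of pure typing, i.e. "maybe half an hour"
```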

    Time spent planning isn't really about saving coding time, because it really can't be. It's not like ultra-cheap code production suddenly lets you just implement all the things and see what happens. Features can be net negative, or even outright harmful to the product. It takes resources to evaluate any unplanned change and then integrate it if accepted. (cf. the recent buzz about FOSS projects getting flooded with AI-generated PRs.) Deciding "this feature will/won't be worth the effort" is a rough guess at best, because the actual code/test loop is where you actually hit the unknown unknowns (presumably LLMs would feel the same way, if they were conscious).

    And perhaps most importantly, simply meeting feature goals isn't enough. This might be slightly less true in a world where your developers (now partly LLM) don't care about the existing coding style or architecture and can readily adapt on the fly. But I'm still convinced that just letting everything accumulate without a clear vision — without oversight and accountability — will always ultimately lead to a tech-debt reckoning.

    Besides which, "a few hundred lines... a full day" is a highly optimistic metric along an axis that has long been understood to make little to no sense. That's a day where things are only added and not removed or changed (to fix them), in a relatively new system (so that interactions with other stuff don't have to be considered). Extrapolating that rate gives you absurd results, e.g. a team of ten developers might re-create the entire Python standard library from scratch in a year.

    Well, that becomes more realistic when you already know exactly everything that you're going to do. But the LLM doesn't know that either. In fact, it doesn't even have the advantage of sharing your team's vision and wanting to see it come to fruition. Ask the LLM: "We're creating a new programming language; what modules should the standard library have?" How helpful is that going to be compared to the (human) new guy?

  • So… XP was the right way all along?
  • Scathophagidae are flies that really like eating shit. We know how to cheaply produce massive amounts of shit.

    But that doesn't mean we solved world hunger. In the same way, AIs churning out millions of lines of code doesn't mean we have solved software engineering.

    Actually, I would argue that high LOCs are a liability, not an asset. We have found a very fast way of turning money into slop, which will then need maintenance and delay every future release. Unless, of course, you have an expert code reviewer who checks the AI output. But in that case, the productivity gains will be max 10%. Because thoroughly reviewing code is almost the same amount of work as writing it.

  • Another person who doesn't seem to understand that the cost of using AI agents is going to skyrocket at some point in the near future.
  • Writing code was always cheap! You could outsource for inconsequential amounts of money and get massive amounts of code in return. Yet, the vast majority of companies do not do so. Because coding is not the hard part of being a software engineer / programmer.

    That's like saying that photography killed painting because it saved you from having to draw things. Drawing is basically free now, I just take the photo. But the number of painters (and by that I mean, artists who paint) is dramatically higher today than in 1800. Artists didn't die because of mechanical reproduction, they flourished, because that wasn't the problem they were solving.

  • Sometimes it feels what we are seeing is Code becoming just like any other "asset" in the globalised economy: cheap - but not quality; just like the priors of clothing (disintegrating after a few washes), consumer electronics (cheap materials), furniture (Instagram-able but utterly impracticable), etc: all made for quick turn-overs to rake in more profit and generate more waste but none made to last long.
  • Understanding computers and programming is not the same as coding.
  • Put another way: "reading code costs the same as it always did" - arguably more, when you consider that the cost of reading goes down as the ability to read goes up. In other words, if you wrote the thing, you can likely read it fast; but reading someone else's stuff is harder.
  • Adding cast iron to an airplane is cheap now.
  • On a higher level, this is wishful thinking. However, for people who think this way, getting code to compile at all used to be very "expensive". AI makes abstract code above machine level much easier to compile.

    My benchmark for code being cheap is when AI is able to write machine level code. At that point, yes, code is cheap. Currently, the Dollar Store version of code is available.

  • First, there's no one thing called "code". There are many different variations. Doing basic UI in React is way different from doing low-level embedded code with locking and mutexes, etc.

    AI is quite good at what I call "code in-painting": you give the outlines, and it fills the boring stuff (writing out UI, writing out the test content, etc)

    It's still VERY bad at maths, for technical reasons (you can't differentiate a SAT solver, so for now LLMs are mostly "hallucinating" plausible-sounding reasoning, not doing any "solid math"). And when you start reasoning about locks, mutexes, etc., what you're really doing is (some primitive form of) maths and logic.

    For now, there's no easy-to-use framework (e.g. a streamlined way to encode the constraints in Lean / Coq etc.) or AI capability that bridges something like this to make it safe for very "math-like" code. And it's easy to shoot yourself in the foot.
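    To make that "primitive form of maths" concrete, here's a minimal sketch (not from any framework mentioned above) of lock reasoning as pure logic: if the "lock A is held while acquiring lock B" graph contains a cycle, two threads can deadlock, and finding that cycle is graph logic rather than pattern matching:

```python
# Lock reasoning as pure logic: build the "lock A is held while
# acquiring lock B" graph; a cycle in it means two threads can deadlock.
def has_cycle(edges: dict[str, set[str]]) -> bool:
    nodes = set(edges) | {w for ws in edges.values() for w in ws}
    color = dict.fromkeys(nodes, 0)  # 0=unvisited, 1=in progress, 2=done

    def visit(v: str) -> bool:
        color[v] = 1
        for w in edges.get(v, ()):
            # A "back edge" to an in-progress node closes a cycle.
            if color[w] == 1 or (color[w] == 0 and visit(w)):
                return True
        color[v] = 2
        return False

    return any(color[v] == 0 and visit(v) for v in nodes)

# Thread 1 takes db then cache; thread 2 takes cache then db: deadlock risk.
print(has_cycle({"db": {"cache"}, "cache": {"db"}}))  # True
print(has_cycle({"db": {"cache"}}))                   # False
```

    Real tools (lock-order checkers, model checkers, Lean/Coq encodings) are far more involved; the point is only that the check is a theorem about a graph, not a plausible-sounding guess.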

  • Posting on Hacker News has also become cheap, which is why there are Nx more Show HN posts but engagement on each has also dropped to 1/N
  • I think the framing is still too code-centric.

    The real bottleneck isn’t writing (or even reviewing) code anymore. It’s:

    1. extracting knowledge from domain experts

    2. building a coherent mental model of the domain

    3. making product decisions under ambiguity / tradeoffs

    4. turning that into clear, testable requirements and steering the loop as reality pushes back

    The workflow is shifting to:

    Understand domain => Draft PRD/spec (LLM helps) => Prompt agent to implement => Evaluate against intent + constraints => Refine (requirements + tests + code) => Repeat

    The “typing” part used to dominate the cost structure, so we optimized around it (architecture upfront, DRY everywhere, extreme caution). Now the expensive part is clarity of intent and orchestrating the iteration: deciding what to build next, what to cut, what to validate, what to trust, and where to add guardrails (tests, invariants, observability).

    If your requirements are fuzzy, the agent will happily generate 5k lines of very confident nonsense. If your domain model + constraints are crisp, results can be shockingly good.

    So the scarce skill isn’t “can you write good code?” It’s “can you interrogate reality well enough to produce a precise model—and then continuously steer the agent against that model?”
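    The loop above can be sketched in a few lines. Everything here is illustrative: `generate_candidate` stands in for an agent call, and the requirements are toy executable checks standing in for a real eval suite:

```python
from typing import Callable

# Sketch of the "evaluate against intent + constraints => refine" loop.
def eval_loop(requirements: list[Callable[[str], bool]],
              generate_candidate: Callable[[str], str],
              max_rounds: int = 3) -> tuple[str, bool]:
    feedback = "initial attempt"
    candidate = ""
    for _ in range(max_rounds):
        candidate = generate_candidate(feedback)
        failed = [r.__name__ for r in requirements if not r(candidate)]
        if not failed:
            return candidate, True
        feedback = f"failed checks: {failed}"  # steer the next round
    return candidate, False

# Toy domain: "produce a greeting that names the user and stays short".
def names_user(s): return "Ada" in s
def is_short(s): return len(s) <= 40

# A stub "agent" that ignores feedback and improves on the second try;
# a real agent would condition its next attempt on the failed checks.
attempts = iter(["hello", "hello Ada, welcome aboard"])
result, ok = eval_loop([names_user, is_short], lambda feedback: next(attempts))
print(ok, result)
```

    If the requirements are fuzzy, this loop converges on confident nonsense; if they're crisp and executable, it converges on intent - which is the whole argument above.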

  • I've yet to see an agent that can take a Figma design and produce a high-fidelity UI
    • Have you tried Gemini 3.1 Pro for that yet? They put a ton of work into improving its frontend abilities.
  • Maybe writing code is cheap. But writing unit tests that actually test stuff that matters is suddenly so much more important and expensive.
    • Yeah it's more important, but I think that has become a whole lot cheaper too.
  • I'm consistently shocked at how much cope about AI is still in Hacker News threads. Clinging to the idea that AI is still no match for human taste and skill in writing code.

    Yes we still need humans in the code loop, but that window is tiny (relative to the lines of code written) and getting smaller quickly.

    A new feature with 1000 lines of code can now be written in 2 minutes, and often work just as desired. Yes, almost always something can be refactored, etc, but it works, and it's good.

    Denying that code is now MUCH cheaper than it was pre-AI is... head in the sand stuff, I think?

  • I feel like this article is completely backwards in its premise. It is confusing labor intensity with cost. Software used to be an incredibly labor intensive industry with extremely low automation costs. The cost of running a computer and its associated development environment is so low that it is a rounding error. While you might pay a monthly subscription for an IDE, even that cost barely even registers and it only assists the developer's productivity by a double digit percentage at most.

    Here is an illustrative example: Copy paste and traditional code generation features in IDEs (automatically generating getters, setters, hashCode, equals implementations and so on) only reduced the typing of boiler plate code with low cognitive load. This type of code was never very labor intensive to begin with, because it can be reduced down to the act of typing. These tools have made writing code cheap decades ago and you could have written a similar blog post about these tools, because the premise fundamentally misses the actual point.

    The cost of writing code has never been an issue in this industry. Software developers don't spend their entire day writing code the same way car mechanics don't spend their day screwing bolts. If you send a car to a mechanic, the mechanic must first diagnose the issue. In some cases the preparatory work is all of the work.

    • > The cost of writing code has never been an issue in this industry.

      What do you think of this part of Paul Ford's recent NYT essay?

      > I was the chief executive of a software services firm, which made me a professional software cost estimator. When I rebooted my messy personal website a few weeks ago, I realized: I would have paid $25,000 for someone else to do this. When a friend asked me to convert a large, thorny data set, I downloaded it, cleaned it up and made it pretty and easy to explore. In the past I would have charged $350,000.

      > That last price is full 2021 retail — it implies a product manager, a designer, two engineers (one senior) and four to six months of design, coding and testing. Plus maintenance. Bespoke software is joltingly expensive. Today, though, when the stars align and my prompts work out, I can do hundreds of thousands of dollars worth of work for fun (fun for me) over weekends and evenings, for the price of the Claude $200-a-month plan.

      From https://www.nytimes.com/2026/02/18/opinion/ai-software.html?...
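      For what it's worth, the quoted figure is roughly what the stated team implies. The monthly fully-loaded rates below are my assumptions; only the team composition and the "four to six months" come from the essay:

```python
# Rough reconstruction of the quoted ~$350k estimate. Rates are assumed
# fully-loaded monthly costs; team and duration are from the quote.
rates = {
    "product manager": 16_000,
    "designer":        13_000,
    "senior engineer": 22_000,
    "engineer":        15_000,
}
months = 5  # midpoint of "four to six months"
total = months * sum(rates.values())
print(total)  # 330000, in the ballpark of the quoted $350k
```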

  • Well another consequence of that is that you’re going to have a lot more tools to maintain.

    And LLMs aren't half as good at maintaining code as they are at generating it in the first place. At least not yet.

  • The interesting thing nobody's talking about here is that cheap code generation actually makes throwaway prototypes viable. Before, you'd agonize over architecture because rewriting was expensive. Now you can build three different approaches in a day and pick the one that works.

    The real cost was never the code itself. It was the decision-making around what to build. That hasn't gotten cheaper at all.

    • I think the prototype thing is absolutely true, but it breaks down like all prototypes at the level of collaborating, sharing and evolving while handling entropy through simplicity, UNLESS you know what you're doing or the agent steers you with very opinionated tooling customized to your context. I'm thinking about empowering people to be builders, and less so software developers who can make the right tradeoffs.

      Empowering people to work Tracer bullet style after they've selected their prototype of choice and thrown it away might be a powerful pattern that actually gets us into a nice collaborative space.

    • This feels to me like peak sfba mentality on par with "move fast and break things". Outside of trying to create a unicorn, is this really how people create things?

      It seems to me that in order to obtain the ability to build things that other people like, you need to go through the process of creating things they won't. Like a painter needs to paint a bunch of crappy paintings to learn how to create a good painting. If you have the LLM create these throwaway prototypes, how will you even know when you come across a good idea, and how will you be able to build it?

      • > It seems to me that in order to obtain the ability to build things that other people like, you need to go through the process of creating things they won't.

        Okay, granted. What does that have to do with how the code is written? Do people generally care if a web app is running from nicely formatted JS or minified JS? Is a product manager not getting better at building things people like because they're not iterating on the code themselves?

        Without agreeing or disagreeing with the premise, I think a relevant metaphor* here is that the painter can practice and iterate and go from creating crappy paintings to creating good paintings, without needing to make their own paint and canvas and brushes. If they're particular, they can have their assistant go to the supply shop and get just the right things they want, with increasing specificity as needed, but they don't need to manufacture them by hand.

        * Like most metaphors, it's not perfect; please try to understand the intent.

        • I agree mostly with your metaphor; I think perhaps I disagree slightly on how it's applied. You don't need to create your own tools to create art, but I don't necessarily map the "tools" to code. The act of programming is mapping information to hardware; the value is in the information, and using LLMs to bypass the phase where you obtain, synthesise, and extend that information is the part where you lose the benefits of iteration. If you're just using the LLM as a mechanical tool to output code, it's mostly no different from, say, using speech-to-text to output code. When you start hearing things like "I don't care about the quality of the code, just its outputs", that starts sounding like someone who isn't iterating on the information, which is the crucial bit.
      • This is how successful things are created. By iterating on less successful things until they become successful.

        The cost of iterating (with software) dropped by a few orders of magnitude in the last few months.

        • But you need to actually be the one doing the iterating, you can't outsource it. The entire point to doing the iteration is the process, not the artefacts.
          • I'm finding I can iterate significantly faster if the coding agent is doing the typing for me, and learn at a faster rate as a result.
            • Hmm interesting, I didn't realise people were using it as a typing replacement instead of having it work agentically. Does that mean when you want to change a line of code somewhere, you just prompt the LLM to replace line 334 with your changes etc? So do you not use the LLM autonomously at all then? Sounds like it since you're still doing the iteration yourself.
              • I do both. A lot of changes are "autonomous" like "add a new Django model to record a change every time the title or body is edited in the admin", but I also do more fine grained edits like "have the import script truncate to 400 chars" (instead of 250.)

                Sometimes I'll make edits like 400 to 250 by hand, but if I'm prompting on my phone it's faster to have the model do it as navigating code in an editor and changing it at the exact right point is fiddly on a mobile keyboard - models can spot and account for typos, direct code editing can't.

    • I shit you not, this is an AI generated comment. All recent comments from this profile are AI generated. This is so ironic
  • [dead]
  • [dead]
  • [flagged]
  • [flagged]
    • Out of curiosity, are you the same person who's constantly creating brand new accounts to have a go at me or are there more than one of you?
      • Given the relatively large number of new accounts I've been seeing recently, on all threads not just in response to you, I'm torn between "Hacker News become normal-internet-famous" and "dead internet reached us".

        I scored 0* on a HN-Turing-Test game: https://news.ycombinator.com/item?id=47070537

        * or less, given everything I identified was a false positive

  • For everyone who is responding to the "Writing code is cheap now" heading without reading the article, I'd encourage you to scroll down to the "Good code still has a cost" section.