- We are slowly inching closer to the point where AI and AI products will be billed for what they cost. We are currently living in a heavily discounted world where everything is subsidized to the point where a lot of it is free. It seems like they can't or won't keep that up anymore. My prediction is that whenever one of the big companies raises its prices or moves features to higher tiers, others will follow soon. They all feel the pressure and none of them wants to give away more money than they need to.
I wonder if managers will be as excited about AI when the prices go up.
- > We are slowly inching closer to the point where AI and AI products will be billed for what they cost.
I suspect the API prices are already served with profitable unit economics. The SOTA API prices are much higher than the costs for other providers to run very large open weight models.
The monthly subscription plans were being offered at a discount to generate interest in these models.
We're not entering a period of billing AI at cost. We're entering a period of exploring how high the prices can go before losing too many customers.
Products and services aren't sold at cost. They're sold at the price the market will bear. It takes some experimentation to find that equilibrium point where you make more profit per customer but don't lose too many customers.
- > I suspect the API prices are already served at prices with profitable unit economics.
There is absolutely no evidence to support this.
- Some basic math supports it. A GB300 NVL72 is about $6.5 million. Let's say that you need $6 million worth of cooling and another $6 million worth of electricity. At current rates, that's 720 billion tokens worth of Claude Opus 4.7. At 100,000 tokens per second, it pays for itself in about 3 months.
Obviously this is an extremely rough calculation. I can even be off by a factor of 10 and it's still a pretty good return.
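A rough sketch of that arithmetic in Python, using only the figures quoted above. The per-token price is not stated in the comment; it is backed out here from the $18.5 million total and the 720 billion token figure, so treat it as an implied assumption. The 30-day month and the `error_factor` parameter are also assumptions added for illustration.

```python
# Back-of-the-envelope payback estimate using the figures from the comment.
# Every input is the commenter's rough guess, not a measured value.

HARDWARE_USD = 6_500_000      # GB300 NVL72 rack, as quoted
COOLING_USD = 6_000_000       # guessed in the comment
ELECTRICITY_USD = 6_000_000   # guessed in the comment
BREAK_EVEN_TOKENS = 720e9     # tokens the comment says the total cost buys
TOKENS_PER_SECOND = 100_000   # assumed sustained throughput

SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: 30-day month


def total_cost_usd() -> float:
    """Sum of the three cost items named in the comment."""
    return HARDWARE_USD + COOLING_USD + ELECTRICITY_USD


def implied_price_per_million_tokens() -> float:
    """Price per million tokens implied by cost / break-even tokens."""
    return total_cost_usd() / BREAK_EVEN_TOKENS * 1e6


def payback_months(error_factor: float = 1.0) -> float:
    """Months needed to sell BREAK_EVEN_TOKENS at the assumed throughput.

    error_factor scales the estimate: 10.0 means the calculation
    was too optimistic by a factor of 10.
    """
    seconds = BREAK_EVEN_TOKENS / TOKENS_PER_SECOND * error_factor
    return seconds / SECONDS_PER_MONTH


print(f"total cost: ${total_cost_usd():,.0f}")
print(f"implied price: ${implied_price_per_million_tokens():.2f} per million tokens")
print(f"payback: {payback_months():.1f} months")
print(f"payback if off by 10x: {payback_months(10):.1f} months")
```

Note that this only models hardware, cooling, and power against token revenue; it says nothing about utilization, staffing, or training costs.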
- Unless you're serving Chinese open-weight models, you have to consider training costs. If you're off by 10x, then the amortization period is 30 months, far longer than the useful lifetimes of SoTA models. Frontier model development is a Red Queen's race: you have to run as fast as you can, just to maintain your position.
- The discussion was if Anthropic makes money on inference. They do. They lose billions on training.
- No, because Anthropic can't serve their models unless they train them.
Training is akin to the cost of building the software/product. Inference is selling the product.
- > There is absolutely no evidence to support this.
Analysts like SemiAnalysis have done a lot of modeling and estimates on the topic.
But two can play this game: There is absolutely no evidence to support that API prices do not have profitable unit economics.
- I'm not familiar with that analysis, its accuracy, or its evidence. I would be surprised by this given it seems like providers are still in the growth phase.
Typically the burden of proof is on the one making the claim.
- https://semianalysis.com/
They have some of the best publicly available analysis on these topics. The full details and numbers are hidden behind the institutional accounts which are priced for investors (not something you sign up for personally) but they're generous with what they send out in their newsletter.
If you're not familiar with resources like this I could understand how you'd assume that the providers are hemorrhaging money on inference costs, because that is the story that gets parroted around spaces like Hacker News.
You could ignore all of that, though, and go check OpenRouter to see how much providers are charging for high parameter count models. They're not entirely at the level of the SOTA models, but the biggest open weight models are not that far behind in complexity either. They're being sold an order of magnitude cheaper than what you pay for the APIs from the major players. We don't know exactly how big the major models are, but from the leaks we do have, it's unlikely that they're more than 10X more compute intensive.
- Yeah, so in other words, you don't have anything to back up your claim.
- We don’t know the model sizes, requirements, and optimisations, but we could take a guess using the infrastructure costs of the largest open weight alternatives that perform slightly worse.
In my opinion, it’s a profitable kind of service. They probably don’t pay the public prices for the cloud GPUs though.
- In my opinion it seems like a very unprofitable service propped up by investor money trying to capture market share.
Or, as I would say if I were Bugs Bunny, “Duck Season”
- Rabbit season!
- Just looking at infra cost is not enough. If the token price doesn't contain all the costs they are losing money and they eventually have to raise prices more.
- Mr. Truth-teller Amodei confirmed that APIs are profitable at Anthropic.
- I don’t think any of the AI model providers have produced any evidence to back their claims of profitability.
I want to see their S-1s, then we can fight.
- He didn't, he talked very carefully in hypotheticals.
- I called this last year: https://digitalseams.com/blog/the-ai-lifestyle-subsidy-is-go... .
I see it as no different from the previous generation of consumer startups burning money - as Derek Thompson wrote,
> ...if you woke up on a Casper mattress, worked out with a Peloton, Ubered to a WeWork, ordered on DoorDash for lunch, took a Lyft home, and ordered dinner through Postmates only to realize your partner had already started on a Blue Apron meal, your household had, in one day, interacted with eight unprofitable companies that collectively lost about $15 billion in one year.
- Everyone called it last year and the year before.
The conversation around AI being cheap now started when ChatGPT launched in late 2022.
- This is already happening. For new Anthropic enterprise accounts you are billed at api token prices (maybe with a small volume discount). Anthropic makes a profit on those tokens. (Sure, that profit does not cover the model training costs, but that’s a separate issue.) It’s the subscriptions for individuals (e.g. Claude Max) that are still subsidized below cost.
> I wonder if managers will be as excited about AI when the prices go up.
Companies are willing to pay the api pricing. Engineering time is very expensive and AI coding agents actually work now since December and are actually showing measurable productivity gains, finally. It’s a good deal to make (obviously, with caveats: you need to make sure your tokens are going on productive tasks that will actually grow revenue) and anyone who penny-pinches is making a strategic mistake.
- "Engineering time is very expensive"
I always wondered about this statement, like we are generally salaried and there are so many variables that affect how I spend my "time". None of us are machines that can do X work per day and our managers get to slice it as they see fit. Pull a dev off a project they love and throw them onto something they hate and suddenly X is diminished greatly.
I would almost predict that reshaping our workflow to be: "prompt, wait, approve changes." results in losses because it is such a mentally tiring workflow and drills into our brains the desire for the LLM to "just fix it". It is the next level of just moving tickets to completed all day.
- > Sure, that profit does not cover the model training costs, but that’s a separate issue.
I don't think it is. At some point they have to make money and they can't do that if the token cost doesn't include ALL the costs. Someone has to pay for that at some point. And someone has to pay for the subsidized subscribers. So no. API token prices don't reflect the real price. They are still subsidized. Just in a different way.
- > Sure, that profit does not cover the model training costs, but that’s a separate issue
It is? If another company comes out with a better model tomorrow and offers it at the same price Anthropic charges for Opus, they’re going to lose customers fast. They have to keep training to keep selling inference.
Most businesses factor in the cost of making their product into the product’s P&L.
- also, like super mario kart, SOTA models from the rear will be continually released because they're sunk costs and open weights will advertise for themselves. Also, it's clear FOMO is a DDoS attack on any perceived leader because there's no way they don't oversell.
Lastly, they'll realize, like every good capitalist, that there's more profit in exclusiveness and cutting out customers.
- > Anthropic makes a profit on those tokens
Citation needed. Anthropic does not have public books
- Their CEO is on record as saying this. You may think he's lying, but that's just your opinion; given the pricing and how it stacks relative to the pricing of inference providers of comparable open source models (who are certainly charging above cost!), I am inclined to believe Anthropic on this.
- Why would you believe a tech CEO who has a vested interest in the untruth but can skirt fiduciary duties by speaking cleverly?
- Maybe because Anthropic are trying to get to an IPO and everything is securities fraud?
If their CEO was just flapping his mouth without any other comparable baseline, it'd probably be different. But as the GP points out, open-weight model providers are charging comparable rates and very likely have positive profit margins. That would imply that with API pricing tokens are sold at above cost.
That cost may well be "inference only", so excludes everything apart from hardware and power. Whether that's enough to cover the enormous training costs and other overheads is a different question.
- He just told you. Because overwhelming public evidence supports the claim. Especially the pricing of open weight model inference. Why do you allow a prejudice to overshadow evidence?
- They may be for now. Problem is that when foundation model pricing goes up, you're paying not just the increase in tokens you consume directly, but also for all tokens you're consuming via vendors as well.
If your company has Figma, Github, and Cursor and they're using the same models you are, your monthly costs with them increase as well. You're exposed N times to the foundation model price increases, where N is the number of times software you directly or indirectly use talks to a frontier model.
- It is already bafflingly expensive. I interviewed at a place recently where they said the average dev was hitting $2k/mo in Claude code costs.
That is no longer a helpful tool... it costs like ~15% of an actual dev.
Even if it is helping, is it actually... making things better or building anything truly important? The issue seems way too nuanced to spend $2k/mo. Not to mention the entire tech industry floats on hype and imaginary goal posts so now what? Devs can hurtle towards those faster and more mindlessly?
- > That is no longer a helpful tool... it costs like ~15% of an actual dev.
The full cost of each employee is more than their salary. The common estimate is 1.4X their salary due to all of the employer-paid taxes, benefits, and other things.
So even $2K/month of token costs would only be around 10% of the cost of a mid-range developer.
It doesn't have to increase productivity much to justify the cost.
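A small Python sketch of how the two percentages relate. No salary is stated in the thread, so the salary here is backed out from the parent's "~15%" claim; that figure, like the 1.4X multiplier, is an estimate rather than data.

```python
# Relating the "~15% of a dev" claim to the 1.4X loaded-cost estimate.
# The implied salary is derived from the thread's numbers, not a real data point.

MONTHLY_TOKEN_COST_USD = 2_000   # figure quoted in the thread
SALARY_SHARE_CLAIMED = 0.15      # "~15% of an actual dev" from the parent comment
LOADED_COST_MULTIPLIER = 1.4     # common estimate cited above


def implied_annual_salary() -> float:
    """Salary at which the token bill equals the claimed share of salary."""
    return MONTHLY_TOKEN_COST_USD * 12 / SALARY_SHARE_CLAIMED


def share_of_loaded_cost(annual_salary: float) -> float:
    """Token bill as a fraction of salary plus employer-paid overhead."""
    loaded_cost = annual_salary * LOADED_COST_MULTIPLIER
    return MONTHLY_TOKEN_COST_USD * 12 / loaded_cost


salary = implied_annual_salary()
print(f"implied salary: ${salary:,.0f} per year")
print(f"share of loaded cost: {share_of_loaded_cost(salary):.1%}")
```

Under these assumptions the share of loaded cost is simply the claimed salary share divided by the multiplier, so the result moves with whatever multiplier you believe.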
- Don't forget the organisational overhead. You'll need managers, and communication overhead between developers grows superlinearly (see Brooks's law).
- The arithmetic is a little different in every country because of local rates of pay and taxation but it's worth remembering that in most of the world except for the richer parts of the US developers do not get paid what those US developers have been making in recent years. There are a few exceptions but the norm is several times less even in major economies in Europe or Asia.
Another challenge for US tech companies is that - if you'll forgive the bluntness - their "brand" is now toxic in most of the world. Almost everyone is trying to distance themselves from US tech as fast as they can. Governments and big businesses are starting to invest seriously in alternative solutions and local resources. It will happen over time but I don't see much the US tech companies or the US government can do to stop the train now the wheels are turning.
So there's a serious risk for US tech companies now of a double whammy where their already relatively high R&D costs increase even further and yet they're also facing much stronger competition in international markets or maybe even excluded from some of those markets entirely.
If we also reach the seemingly inevitable point that "capable enough" LLMs can run locally - or at least as a private resource provided internally by large organisations - there is very little moat left to protect not only US Big Tech whose stocks have been heavily driven by expected returns from AI but the whole US tech industry that is banking on productivity gains from that AI tech. Then they also won't be able to capture most of the entire global supply of components like GPUs/RAM/SSDs because it won't be cost effective any more - and that is one of the few practical moats they have built (however accidentally) that would be a significant barrier to direct competitors setting up shop in places like Europe and Asia.
It's going to be interesting to see how US tech companies respond to these effects over the next 5-10 years. The giants are all aboard the AI train and can't back down now so there will probably be some casualties there if - as again seems inevitable - the bubble bursts at some point. But then there's a very long tail of still very successful US tech companies that might be paying US salaries and using AI-based tools but aren't themselves focussed on developing or providing those AI-based tools and they're the ones who are going to need to find new ways to compete effectively within that kind of time frame.
- The 1.4x multiple doesn’t work when you get to engineering salaries.
- The real thing is that it is on tap. If I have to engage an Indian outsourcer to hand off easy stuff to, it’s going to be much more painful.
- Some people are claiming to use a billion tokens per day. According to Claude's API pricing page that'll cost somewhere between $3k and $10k per day. Leaving aside whether we trust Claude's API pricing model will remain constant into the future, it's abundantly clear that developer is not generating tens of millions of dollars per year in value.
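For scale, here is a minimal Python sketch that annualizes the quoted range. The per-million-token prices are backed out from the $3k to $10k per day figures rather than taken from any pricing page, and the 365 days of usage per year is an assumption.

```python
# Annualizing the "billion tokens per day" claim from the comment above.
# Prices are implied by the quoted daily cost range, not looked up.

TOKENS_PER_DAY = 1_000_000_000
DAILY_COST_RANGE_USD = (3_000, 10_000)  # range quoted in the comment
DAYS_PER_YEAR = 365                     # assumption: usage every day


def implied_price_per_million(daily_cost_usd: float) -> float:
    """Blended price per million tokens implied by a daily bill."""
    return daily_cost_usd / (TOKENS_PER_DAY / 1_000_000)


def annual_cost(daily_cost_usd: float) -> float:
    """Yearly bill if the daily bill is sustained all year."""
    return daily_cost_usd * DAYS_PER_YEAR


for daily in DAILY_COST_RANGE_USD:
    print(
        f"${daily:,}/day -> ${implied_price_per_million(daily):.2f} per million tokens, "
        f"${annual_cost(daily):,.0f} per year"
    )
```

At those rates the yearly bill for one such developer comes out to roughly $1.1 million to $3.65 million.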
- Slowly inching? GitHub Copilot announced 600%+ price increases for many workflows, with others being potentially 100x more expensive due to the change from request to token based billing.
- With how opaque everything in the AI world is, I have no idea how far away we are from the "real" prices. Might be a gallop or maybe it's much worse than we think and we are still just inching. We'll see. Exciting times. If you like to see the world burn.
- That's what the cloud AI industry is banking on. Pushing hard to get AI into workflows at a critical position, then raising the cost to turn the screws, hoping that companies would rather pay than pivot again
- It's humorous to me that I can do the work of an AI with nothing but a coffee and an occasional sandwich and yet they talk about AI as if it's some sort of magic hack to productivity.
What they don't like is paying money for the work, that's all that matters to them.
- You work for sandwiches and coffee, and not for a decent salary?
- I get paid a decent salary for my expertise and this wild thing called remembering what happened yesterday. My point is my compute is powered by bananas and coffee, not gigawatts of energy and all of the RAM.
- From the POV of your employer your compute is powered by your decent salary, not bananas and coffee. If that goes away you’d be a fool to keep pointing your brain at your employer’s problems.
Thus, your compute is significantly more expensive than AI. Thankfully your taste is also part of your package deal, and is where you deliver the real value over an LLM.
- run local models
- You need massively expensive hardware to run them, and they aren't as good. It's pretty clear the base price of AI tools is way higher than we are being charged right now.
- I wouldn't call my $2k Strix Halo computer "massively expensive", and it runs e.g. Qwen 3.6 27b brilliantly, with tons of memory to spare and is a full x86 powerhouse pulling 120w at absolute max.
IMO the programming world is far too myopic about / insistent on using laptops, especially macbooks. Just because a crappy deal exists doesn't mean everyone is forced to take it. Local AI is a high performance computing problem and laptops are fundamentally a crappy form factor for it; buy an efficient desktop computer and be surprised at what's possible even with today's crazy prices.
- But with new hardware coming out, and maybe models being smart enough to help with optimizing them and reducing inference costs even more, I think we should still expect the costs to go down.
- This is funny, but Copilot is still an interesting case-study and (probably) failed predictor of where we are headed.
We all know, and have known for a long time, that the AI labs selling dollars for a nickel are going to pull that rug, and up that price, at some point.
Copilot, though, has been consistently the weakest mainstream AI coding offering. Inferior to Cursor or Windsurf at editor completions, inferior to Codex, Claude, OpenCode, blah blah blah, at agentic coding and also the old-school chat-style...
And now, it's no longer cheap AND now sucks even more than it has all along — the new $39/month plan is not only worse than all its competitors, but worse than its own $10 plan was a month ago — by a lot.
The thing is, you can't jack the price up unless you're good enough — at least on some axis, to some customer segment — to jack the price. And when you're not good enough, and you have vastly superior competitors who are not doing that yet... you're just forfeiting the game.
Which, I agree, Copilot should do (it's the Windows Phone of AI coding assistants, after all), but it still seems weird to me to just commit humiliating suicide rather than trying to make some deal with one of those superior competitors.
Instead of just jumping into a dumpster and lighting yourself on fire.
- I suspect Microsoft will renege if enough people cancel.
Even before yesterday, I assumed they made money via the gym model. I'd have months where I'm too busy to use my Copilot subscription in any meaningful way.
Canceling and restarting is too much of a hassle.
But with the pricing update I'd probably use up the $10 plan within 3 days.
I don't know if anything else is integrated so well into GitHub though. I might keep the $10 plan just for the occasional GitHub AI PR.
- If their pricing turns out to be what they claim, and copilot cli has accurate token counts, they had the best deal around.
Just today, when I wasn't being especially chatty with GHCP, I used about 12 requests to get a few thousand line changes in 3 projects I'm juggling. The last project repo of copilot I closed, in 3 hours burned 38M input tokens, 28M cache, and like 400K out. For GPT5.4, high. That's like $135, in half the day, 1 of 3 instances. No crazy tool use, just lots of docs and unorganized code. GHCP charged like 70 cents for that on the old plan.
- > it's the Windows Phone of AI coding assistants, after all
It seems everything Microsoft does is like this nowadays. They just can't seem to win anymore.
- Microslop has lost their way from their ole acquisition investments and have instead hedged a bet on vibing their way into other industries.
- It's at least considerate of them to jump into the dumpster first, less of a mess to deal with.
- They should also remove Copilot code reviews from being counted as metrics in a PR.
I've seen some projects that use it and you open the PR page to be greeted by every PR having 3-20 comments, but when you go to the actual PR, there's no one except the contributor with a bunch of Copilot feedback.
It gives a false message that the PR is resonating with folks and has real activity. I wonder if GitHub did this on purpose to make engagement seem higher than it really is.
- If I had a magic wand, I would enforce an internet-wide separation of human activity metrics from bots.
I want to know how many real humans read my post, commented, shared etc.
Clankers can keep their own counts.
- Reddit is slowly dying for me for that reason. So many bot / bot like accounts that seem ... off / hidden histories. Trust level with any given comment or post now is reaching 0 fast.
It's a bummer because it's hitting a lot of users, and even valid users who don't communicate well are getting hit hard too with skeptical responses.
- Yep, allowing users to hide history has made it straightforward for bots to exist unchallenged.
Previously a quick scan of comment history would make it obvious you're looking at an LLM, now you're stuck arguing over a one off comment where they can get away with benefit of the doubt.
- The irony of Reddit's early days was that it was bootstrapped with fake accounts run by the founder.
- Reddit has always been fake, but it used to be a real person performing creative writing pretending to be a true story. Now it's spammed out slop at scale.
- Where I work that many comments could be taken as a bad thing: "i've seen too many comments finding issues or nitpicks with your PR, why aren't you doing a better job before submitting it for review??"
- scary but thank goodness github actions is highly reliable, robust to change, and has a simple-to-understand ontology.
- To the naysayers, I would point out that actions has not only one but TWO 9s of uptime. [1]
- Github has recently changed the way their status page tracks uptime in the name of "transparency"[0]. "Partial Outages" are now only worth 30% of their duration, and "Degraded Performance" is worth none, so their uptime values are now wildly inflated.
[0] https://github.blog/news-insights/company-news/bringing-more...
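To see how much that weighting can move the number, here is a minimal Python sketch. The 30% and 0% weights come from the comment above; the full weight for major outages and the incident list are assumptions invented purely for illustration, not GitHub data.

```python
# Weighted vs. unweighted uptime, following the scheme described above.
# Incident durations below are hypothetical.
from dataclasses import dataclass

WEIGHTS = {
    "major_outage": 1.0,          # assumption: full outages count in full
    "partial_outage": 0.3,        # per the comment above
    "degraded_performance": 0.0,  # per the comment above
}


@dataclass(frozen=True)
class Incident:
    kind: str
    minutes: float


def uptime_percent(incidents, period_minutes, weights=None):
    """Uptime over the period; weights=None counts every incident in full."""
    downtime = sum(
        incident.minutes * (1.0 if weights is None else weights[incident.kind])
        for incident in incidents
    )
    return 100.0 * (1.0 - downtime / period_minutes)


PERIOD_MINUTES = 30 * 24 * 60  # hypothetical 30-day month
incidents = [
    Incident("major_outage", 60),
    Incident("partial_outage", 300),
    Incident("degraded_performance", 600),
]

print(f"weighted:   {uptime_percent(incidents, PERIOD_MINUTES, WEIGHTS):.2f}%")
print(f"unweighted: {uptime_percent(incidents, PERIOD_MINUTES):.2f}%")
```

With these made-up incidents, 960 minutes of trouble count as only 150 minutes of downtime once the weights are applied.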
- Of course they hide behind corporate speak like "Bringing more transparency".
- Sometimes both nines are even in the front!
- Cursed by mighty Redmond to roam the market wasteland until death, one of the seventy some odd beleaguered CoPilot products is now being lashed like a haggard burro to the dying light of a once prominent development platform that, upon itself, were pinned the hopes and dreams of a commercial software juggernaut to capture the hearts and minds of developers all around the world.
- DEVELOPERS DEVELOPERS DEVELOPERS DEVELOPERS
- I can smell this comment.
- https://isgithubcooked.com/?services=actions
It seems that for "actions", the trailing twelve months availability is 98.67%.
Trailing 3 months is even worse :/
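For a sense of what that percentage means in hours, here is a quick Python conversion. It assumes a 365-day year and treats the availability figure as simple time-based uptime, which may not match how the site computes it.

```python
# Converting an availability percentage into hours of downtime.

HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours(availability_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime implied by an availability percentage over a period."""
    return (1.0 - availability_percent / 100.0) * period_hours


for label, pct in [("trailing 12 months", 98.67), ("two nines", 99.0), ("three nines", 99.9)]:
    hours = downtime_hours(pct)
    print(f"{label}: {pct}% -> {hours:.1f} hours ({hours / 24:.1f} days) of downtime per year")
```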
- Yesterday's ≈5h long incident with the Pull Requests page being blank is listed as 1h 47m.
My org noticed the incident at 12:19p ET, Github pushed their first update at 12:38p, and pushed that it was mitigated at 5:48p.
- …And at this point probably not one but two 9s (or possibly more) of security vulnerabilities [1]
[1] https://securitylab.github.com/resources/github-actions-prev...
- For all the hate it gets, I use them regularly with little to no complaints.
I have always found it as a pretty nice to have feature if I am already using GitHub. It’s far from perfect or robust but I can get a lot of use out of it with low to no friction.
- We had more build failures in 2025 due to Actions outages or degraded service than any other reason.
- Which is fair, but conversely we do many builds throughout the day most business days and have not had an impact where we noticed it. Could also be that we deploy frequently and have set up our builds to be as quick as possible, so any issues would likely go unnoticed.
- Yeah totally. We use GitHub + actions for backend, and self host Perforce + TeamCity for our main game codebase. We had 0 downtime in 12 months on our TC as a comparison. I know there’s a difference between running an internal service and developing a global scale platform that is abused, but as a user I don’t care about those concerns, I care about the platform being up!
- "You can't possibly hope to have as many resources caring about uptime as $provider, you should outsource to them"
"Give $provider a break, they have such crazy scale that they can't possibly hope to have great uptime"
... yet it very rapidly gets lower uptime than a service running on a desktop in the corner of the office with some backups that get restored somewhere else.
Most sysadmins will tell you tales of laptops with a decade of uptime hosting simple services that nobody cared about (IRC, ticket software) with no downtime, not even an hour, and people only discovered that fact when they decided it was too slow and it's time to migrate.. These services have become less reliable than that, and servers themselves have only gotten more reliable in that time..
(yes, I'm aware of the security liability of decades old software running, even if it's not accessible by everyone)
There's a weird doublethink going on.
- And now it has an outage https://www.githubstatus.com/incidents/dbypmw7h77l5
- As financial markets get tighter AI companies will stop subsidizing their services and charge enough money to actually make a profit.
It is time to set up local models. It is cheaper, and you already have a computer. Why keep it idle and pay someone else for their CPU?
- Because it doesn't even come close to frontier models in intelligence/speed/price. I can run my 3090 nonstop and rack up an electricity bill that costs more than a subscription and get worse results that are slower. They are ok for simple/non complex things, but that's not really what I need AI for.
- I feel the opposite. I do need AI for simple things. Complex things are usually so ill-defined that the actual bottleneck takes place in meatspace, not in my IDE.
- Well, it is currently cheaper because it is massively subsidized. That will change when subsidies stop. I don’t think it is a good argument.
- The claim was "It is cheaper", not "It will be cheaper". Until it actually _is_ cheaper, it doesn't make much sense to purchase $10k+ in hardware to run local models that are still worse than the frontier offerings.
- > Until it actually _is_ cheaper, it doesn't make much sense to purchase
Once it is cheaper, there will be more demand so it will no longer be cheaper. Buying now gets current prices (though demand is still fairly high).
- if only there was a place that was naturally cold to take advantage of airflow for cooling and cheap renewable electricity that's always on...
- are you saying aluminum smelters are going to convert to ai datacenters?
- I assume because local models are nowhere near as good. Hoping I’m wrong!
- The better your code is architected, the less powerful model you’ll need for it to make sense of it.
E.g. a well-designed deployment (infrastructure-as-code) repository doesn’t need a frontier model to be understood well-enough to create a new job / service using sibling jobs / services as templates.
And this already saves me dozens of minutes per week, although it’s not a 2x multiplier in my efficiency.
- I disagree, even though I'd love for it to be different. With models like Opus, I can give it a good architecture and expect good results. For many of the less expensive models, that is not the case, they make mistakes, you need to over specify, they get stuck in a loop, etc. As you get to the models you can realistically run locally, it gets so frustrating I'd rather be writing the code myself.
- At what point will local inference catch up to today’s cloud inference? Will it ever? If it doesn’t, does that imply a certain dead-end for the LLM inference industry?
- The issue is that local models are dumb and tend to make mistakes that look good at first glance. So any "saving" is quickly ruined by having to do an extensive review. You might as well just write things yourself.
- I use it as code scaffolding, which means in a way I’m often rewriting it. For me, writing from scratch isn’t the same amount of effort as using a code scaffolding tool.
- Won't competition likely keep prices low? At first maybe not, but sooner or later open models will catch up, then it's a completely open market for anyone to host and sell services.
- Competition won't keep prices below cost. Only subsidy by investors can do that and they won't do that forever.
- > Won't competition likely keep prices low?
If there are 100 companies you can choose from, yes.
If there are 3 oligarchs that own all options, no.
Capitalism only works when there is competition between many players. When you get fewer than a dozen players, it is too easy to raise prices to maximize profit. They do not even need to talk to each other: not starting a price war is the only logical strategy, and they follow it. That is why big-tech is so problematic.
In the past, these kinds of companies were highly regulated. Phone companies were not allowed to wiretap calls, prices were limited by law, etc. Internet providers had the same regulations applied to them. But service providers now run amok without control, abusing their position and hurting the rest of the economy. Do not expect lower prices in such an unregulated environment.
- Local models are nowhere near the performance of frontier models. Unless you can fork out like £100k to get something passable in terms of performance.
- > As financial markets get tighter ...
They never really get tight very long: the various states are way too busy flooding the world with endless money printing to kick the can of the public debt always further.
Covid financial crash? We went to new highs. 2022 tech flash crash (Meta and Netflix did -75% for example): we then went to new highs.
The only way out for governments that always spend far more than they bring in from taxpayers is to devalue the currency.
So "financial markets getting tighter": probably won't last.
- As you said yourself: Quantitative easing did not solve anything. We keep kicking the can down the road, and the problems grow exponentially every time. This approach won’t work forever. In fact, we may be past the tipping point already.
- I feel the rug under my feet moving. Is it being pulled?!
- Just you wait, this is a gentle tug to test how hard they can pull when the time comes.
- It's gonna be interesting to see how this plays out. Usually a tech rugpull like this lasts a number of years. And this sort of has, but the agentic stuff has really only caught on like wildfire in the last, I dunno, six months or so. The rugpull would be way more effective if there could be several years of getting developers addicted to this development paradigm, but alas, the VC money burned was too great to subsidize for very long.
- It's also a weird way to cede control of the market to the foreign model vendors, because I'm reasonably sure that DeepSeek et al aren't subsidising tokens to the same extent that the big 3's subscription models have been.
- I think there's "sort of" a moat for non-Chinese vendors. As much as people distrust the US right now, I think deep down inside everyone knows that the second you let a Chinese provider do inference on your codebase they're gonna suck up every bit of it. But hey, cheap tokens, right?
So you'll probably never see government customers allow that and neither will a lot of commercial customers.
- Why do we assume US providers aren't doing the same? Also all the Chinese providers are giving open weight models. Many you can run locally.
I don't see the risk. If your code is easily AI generated you don't have a moat anyways. A Chinese competitor probably won't have as easy of a time as a US one if you operate in the US.
- The US has robust IP and trademark law that allows companies some amount of chance to find a legal remedy to anyone who clones their business. China is notorious for protecting local companies from foreign IP suits.
Further, at a lot of companies, the risk has to be acceptable to shareholders and auditors. Perceived risk is often a more powerful motivator than actual risk.
- > robust IP and trademark law
lmao tell that to the artists, authors and foss contributors whose work has been cloned into the llm oracle
- I don't think that's as true as you think it is.
I mean, "Copyright Infringement" famously does not translate to Mandarin; but we have Amazon ripping off best sellers in their marketplace pretty brazenly and Apple "sherlocking" applications -- that's even where the term comes from.
The models themselves are trained on a corpus of material that was obtained with dubious legality... though I suppose some argument could be made that they're forced to bend the models because of that.
I'd be more wary of these models' terms and conditions granting the vendors a license to everything they come into contact with: nobody is reading these licenses, it seems. Copilot's old one only allowed for "being inspired" by the output, despite they themselves producing an IDE of some kind which allowed you to make complete projects from suggestions: directly breaking their own T&Cs.
- Once your code, images, etc pass into the slop machine it is owned by whoever generated it later. Obviously they would need a new logo, llc, and some ui theme tweaks. Otherwise none of these AI coder products would exist.
Also, how long do you think openai, Microsoft, Google, anthropic, etc could delay a lawsuit while you pay hundreds of thousands in legal retainers? 5 years? 10?
- > As much as people distrust the US right now
From the perspective of someone currently living in the EU... I'd say that's pretty much a wash (or even slightly tilted in China's favour) for folks outside the US.
- Fortunately there’s plenty of open weight models that are just safetensors and you have a wide variety of providers to choose from, as well as just hosting it yourself.
- Times would have been much more 'interesting' - for better or worse - had the LLM movement occurred during the zero interest-rate era.
- I wonder if it has anything to do with the war in the Middle East forcing gulf states to scale back investment
- If that is true, we’ve discovered that offering a product for $1 on the $17 yields a dramatically shorter runway but possibly more addicted users. Can’t wait for products offered at $1 on the $100.
- making you pay the actual price of a product is a rug pull now?
- Maybe, but this is also a company whose parent organization is worth a trillion+ dollars that have monopolies in multiple verticals. I know American corpos hate the idea of protecting the commons and would rather fuck it raw to fulfill their carnal urge for profit, but you know there are some public goods and services worth fulfilling for a highly productive industry.
Maybe the government should nationalize GitHub at this point as it is absolutely critical for US infrastructure and MSFT has shown to be a terrible steward for the public.
- wow, gotta say I was not expecting that response lol. So good on you for surprising me I guess.
- It is if you start by charging significantly less than the actual price to get people hooked
- It's more like a bait and switch I guess.
- Not yet... I calculated they undercharge by 50-90X soooo
- Github was already struggling with bazillions of throw-as-much-crap-on-the-wall software running in actions, and now the world is running throw-as-much-LLM-crap-on-the-wall computation, as unstoppable as the pre-LLM era. Turning compute into excrement as fast as the planet is filled with it. Excrement being "Github Copilot code review" in compute world, and no need to draw what it is in our real world.
Weird that Anthropic decided to build a Claude Code Routines toilet.
- This seems nonsensical. Why would non-actions activity consume actions budget?
They say that they’re now billing against their actual costs
- > Last month, we shared how GitHub Copilot code review runs on […] GitHub Actions using GitHub-hosted runners.
- Copilot Code Reviews are Actions workflows. Just privileged ones you can't edit the YAML for. They even litter your Actions tab list and Deployment environments.
- My guess is that they're moving to a spot where they can pitch an LLM "doing something" as an action, and copilot is their first move. I don't see it as crazy to think of a "copilot code review" in a similar way to other build actions.
But also - enterprise accounts already have budget assigned to github actions, and this allows them to start billing right away without having to actually get (or allow) businesses to evaluate the return of having copilot do code reviews.
So seems like it's a mix of immediate incentives and long term architecture. I don't like it, though. If I were an enterprise my first response would be to turn it off.
- > enterprise accounts already have budget assigned to github actions, and this allows them to start billing right away without having to actually get (or allow) businesses to evaluate the return of having copilot do code reviews
Hang on, I read this as copilot reviews will bill both actions minutes and AI credits. Did I miss something?
- I'm assuming the running of the model is consuming the tokens, and the client coordinating and orchestrating the calls to the model to perform the review is happening in an action runner, thus using action minutes.
- Agreed, especially weird since they just rolled out usage-based billing for Copilot. It would make a lot more sense to just re-use that usage instead IMHO
- It does, read the article. This feature now consumes actions credits and AI credits.
- Yeah my mistake, I wasn't very clear in my comment.
Though actually the more I think about it, I think this change actually does make more sense. In the case of the AI running on GitHub side, that does feel pretty equivalent to CI minutes. I would hope that the number of minutes they bill for is pretty minimal though, since the vast majority of that will be I/O waiting on the agent to return
- Code review ostensibly takes place inside a container runtime just like tests or other actions would. It makes sense to me.
- but consumes vastly more resources than most apps' build processes.
Done that way it obfuscates cost of the code review and I think that's on purpose
- The cost of running a container (to github) is someone else not being able to run a container.
- [dead]
- “F*ck you. Pay me.”
That’s why.
- Good thing GitHub has plenty of built-up goodwill to spend down. If they didn’t this cascade of (probably necessary but nevertheless negative for customers) changes might be the tipping point to push a lot of companies to seek other options.
- > Good thing GitHub has plenty of built-up goodwill to spend down
Do they, though? I don't know a single person who uses GitHub who actually likes it. It's far more often something like "it's fine, but I miss (GitLab|Gerrit)" or "I stopped using it for personal stuff and moved to (Codeberg|GitLab)."
The brand recognition among non-technical folks is really the strongest selling point in my eyes. And that's irrelevant to ~95% of software development.
- GitLab is getting ensh*ttified as well. Rarely a day passes when they’re not trying to somehow push their AI features on me, even though I never asked for it. Thinking about moving to managed Forgejo.
- The tell-tale just lit up on your sarcasm detector, better get it serviced.
- GitHub has already lost so much goodwill. It was one of the best places on the internet for like a decade and is now just a real pain in the ass to interact with.
- They burned through their goodwill years ago and are insistent on reminding us all that they are now exactly as shitty as Microsoft, because it’s all Microslop’s worst customer practices.
- I started moving my repos off GitHub weeks ago, but I'm still waiting for a "good" GitHub competitor to appear. GitLab sucks (if you're not a company and like unnecessary complexity), Codeberg is slow and limited (and has weird mods), Sourcehut UX is weird (and being DDoS'd), Gitea Cloud didn't even have a working login page last time I tried, BitBucket isn't the worst but it has quite a few problems (& isn't set up for public repos/search). Please can somebody start a simple, reliable hosted GitHub alternative? I'd pay for it...
- If you want collaboration for your team, then a small vm with forgejo (if you need PR) is enough. It can be behind a vpn if you do not want to bother with securing it against the whole internet.
If you want to make your repos public, you could use cgit and the like.
- is there any data on how many Actions minutes a single copilot review actually takes? the announcement doesn't mention it, and for a team doing 20+ PRs a day that number adds up fast.
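Absent published numbers, a rough sketch of how the minutes could scale. The 20 PRs per day comes from the question above; the minutes-per-review values, the one-review-per-PR default, and the 22 working days are all hypothetical placeholders, not GitHub figures.

```python
def monthly_review_minutes(
    prs_per_day: float,
    minutes_per_review: float,
    reviews_per_pr: float = 1.0,   # assumption: one review per PR
    working_days: int = 22,        # assumption: working days in a month
) -> float:
    """Estimate Actions minutes consumed per month by automated reviews."""
    return prs_per_day * reviews_per_pr * minutes_per_review * working_days


if __name__ == "__main__":
    # minutes_per_review is unknown; sweep a few hypothetical values.
    for minutes in (1, 3, 5, 10):
        total = monthly_review_minutes(prs_per_day=20, minutes_per_review=minutes)
        print(f"{minutes:>2} min/review -> {total:,.0f} Actions minutes/month")
```

Swapping in a measured duration from your own Actions tab would turn the sweep into an actual estimate.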
- Double billing for minutes and tokens is intentional obfuscation. You can't trace actual cost.
- Wonder if folks will flock to competitors that still offer fixed-price review packages, only for those companies to switch to usage-based pricing. They're next to be overrun with new customers and sit on the increasing bills.
- By fall most execs will be asking about the new spend and GitHub alternatives will be researched.
Between 27x model costs and this, plus CVE exploits and downtime, their platform is starting to feel like a questionable decision.
- Interesting, I didn't know minutes were free before.
Stopped my recurring subscription at the end of last year when it started spinning up actions for review. Which as a side effect doubled the time (or so) to do a review. Whereas before that I would open a PR, wait at most a minute or two and the review was already done.
- Unclear why this is so shocking. Sounds like they have been making migrations on their underlying systems and this better aligns with the cost to run. I would be curious how many are using their code review system.
- I'd want to see more about the failure modes. Production systems need graceful degradation more than optimal performance.
- Expect to see more of these kinds of announcements as companies need to start showing returns on their AI investments. It's hard to say how subsidized the current AI products are[1] but we're definitely getting a free lunch at VCs' expense at the moment.
[1] Ed Zitron speculates the actual prices with token based billing for heavy users will be something like 10x the subscription price, but this seems high.
- Not that I give much credence to anything Zitron says, but the amount of inference you can get on a £200 a month OpenAI or Anthropic subscription is easily an order of magnitude more than what you'd get paying the same amount at subscription rate.
Although I would also point out that OpenAI recently tripled the amount of Codex inference you get per month for £200 (and to head off the suggestion, this is distinct from their current 2x promotion on £100/month plans)
- > Not that I give much credence to anything Zitron says, but the amount of inference you can get on a £200 a month OpenAI or Anthropic subscription is easily an order of magnitude more than what you'd get paying the same amount at subscription rate.
Neither of those is how much it actually costs the company selling the service. And I have a feeling they are running at a loss here, so the play is "get everything possible using LLMs, then jack up the pricing".
- There have been plenty of studies which indicate that inference considered by itself is almost certainly quite profitable at all the frontier labs. The problem is amortizing the cost of all the expensive training runs required to train new models into the revenue stream.
- Does that mean those running the open models are highly profitable since they don't have to do any training?
- I don’t know about highly, since they have even less of a moat than Anthropic and OpenAI do. Anyone with a few hundred thousand dollars or sufficient free GPUs can compete with them. So running an open model should earn a market-rate margin.
- Yes obviously, otherwise they wouldn't be doing it; they'd just go back to mining shitcoins.
- Yeah, I'm sure the numbers are a bit inflated compared to API, but with my Claude $200/month subscription I've supposedly consumed 12,160,410,828 tokens in April for a cost of $22,733.03.
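For scale, the arithmetic on those figures, as a minimal sketch. The three constants are the self-reported numbers quoted above; nothing here is verified against actual billing, and the function names are just illustrative.

```python
# Self-reported figures from the comment above (not independently verified).
TOKENS_USED = 12_160_410_828
USAGE_COST_USD = 22_733.03
SUBSCRIPTION_USD = 200.00


def blended_price_per_million(tokens: int, cost_usd: float) -> float:
    """Average usage-rate price in USD per one million tokens."""
    return cost_usd / (tokens / 1_000_000)


def usage_to_subscription_multiple(usage_cost_usd: float, subscription_usd: float) -> float:
    """How many times larger the usage-rate bill is than the subscription fee."""
    return usage_cost_usd / subscription_usd


if __name__ == "__main__":
    price = blended_price_per_million(TOKENS_USED, USAGE_COST_USD)
    multiple = usage_to_subscription_multiple(USAGE_COST_USD, SUBSCRIPTION_USD)
    print(f"blended price: ${price:.2f} per 1M tokens")
    print(f"usage-rate bill is about {multiple:.0f}x the subscription fee")
```

The blended per-token figure mixes every token type the tool counted, so it says little about any single list price; the multiple is the more meaningful number here.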
- *more than what you'd get paying the same amount at usage rate.
- Yes, thanks. Too late to edit now, sadly.
- > 10x the subscription price, but this seems high
Inference is cheap but training is quite expensive. Plus all the money they've invested and keep investing on hardware, data centers, etc. And evidently they also need to make a profit at some point.
- > Inference is cheap
Maybe from the perspective of traditional, turn-based chat. But when you start having developers command an army of agents that work around the clock, those cheap tokens start adding up fast...
- If the unit-economics work out and they can sell $0.99 of tokens for $1.00, doesn't matter how many agents you spin up. The flat rate subscriptions can't last though.
- > If the unit-economics work out and they can sell $0.99 of tokens for $1.00
I think the margins have to be a lot higher than that in order to give investors the return they're expecting, to continue the never-ending training treadmill, and to build more and more datacenters to accommodate people basically DDOS'ing the GPUs in order to run their workloads.
Yes, in theory what you said makes sense. But the tightrope these companies have to walk is that the per-token costs still have to be low enough that developers and companies don't just say "ehhh I guess we can still do all this work the old-fashioned way" but ALSO high enough to cover the massive expenses AND astronomical returns everyone's expecting.
- VC investment isn’t about margins, it’s about finding a unicorn. It doesn’t matter if margins are negative if your product is dominant in the market as you can fiddle with the margins after the fact. You just need to be invested long enough to see everyone else fail.
- The problem with AI is that there doesn't seem to be a durable barrier to entry for a "winner take all" dynamic to work. The biggest barrier to entry seems to be the capital needed to train the models, but even free models are getting "good enough" for some uses and there's little friction to stop users from switching between models. Many frontends make this explicit by letting you pick the model you want to run inside the same environment.
If prices go up, I suspect a bunch of folks will jump to cheaper, less capable models instead of eating the added cost. The whole value proposition of AI in enterprise is around cost-cutting, so that mentality is likely to persist when choosing which model to pay for.
- I imagine the calculus changes a little bit when you've invested hundreds of billions (trillions?) of dollars in a relatively short period of time. Priority number one is probably getting that money back. I think the fact that providers are RAPIDLY cutting back/jacking up prices points to this being the case.
- Anyone have good alternatives for ci/cd on the cheap for a solo dev?
I’m blowing through my 1000 mins in days.
Thinking to either pool some free tiers or figure something out with spot instances.
Also, is it just me or is CI/CD tooling still sort of rough all around?
- Your best bet is to self host woodpecker CI.
Hetzner has cheap VPS that I host my CI on. It costs like $10/month.
Pick the cheapest region, since CI runners location doesn’t matter much.
- Yeah, I did a Hetzner runner for a bit.
But I think the issue is that my situation (solo dev, mono repo) is just not right for a dedicated instance.
With only 1-2 runners, the pipeline is slow (low parallelism) and resource constrained. And at least 50% of the time it's idle (I'm not working/sleeping).
I guess what I'm really looking for is for some kind of aggressive autoscaling, and aggressive caching.
I tried a couple of things (GHA, Dagger + Hetzner, Buildkite).
And I'm just not too sure there's going to be any out-of-the-box solution, since my priority is essentially to minimize cost and maximize efficiency. Not really a great customer for any provider.
I'm tempted to just get an agent to build something out quickly with cloudflare workers + spot instances.
I also have some other nice to have requirements:
- ts/code over config
- locally runnable and testable
- preferably no lock in
- repeatable/reproducible
- Not sure of what current prices look like but an old desktop sitting on the floor of your office might work well for you. You would need decent internet but running a single node kubernetes cluster as a GitHub action runner has worked well for others I know.
A buddy of mine runs his whole CICD setup off an old gaming desktop. They use tailscale to connect to their hosted infrastructure and set it up as a GitHub action runner.
For a solo dev this might be the way to go.
- Yeah, I’ve been thinking in this direction as well.
My wife uses my old gaming desktop for her ux design work as well.
And I was thinking of using the gpu to run some tts models.
Now to just figure out a way to run it all on windows and have it auto start when she logs in.
- Ephemeral, beefy fly.io instances?
- > And at least 50% of the time its idle (I'm not working/sleeping).
So what? It's $10 a month. Why do you need to chase 100% utilization?
And you can use that to host your website, a game server, maybe some other projects...
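To put a number on it, a back-of-the-envelope sketch. The $10/month and the 50% idle share come from this thread, as does the 1000-minute allowance; the 30-day month and the extra idle fractions in the loop are assumptions for illustration.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: 30-day month, 43,200 minutes


def busy_minutes(idle_fraction: float) -> float:
    """Minutes per month the box is not idle."""
    if not 0 <= idle_fraction < 1:
        raise ValueError("idle_fraction must be in [0, 1)")
    return MINUTES_PER_MONTH * (1 - idle_fraction)


def cost_per_busy_minute(monthly_cost_usd: float, idle_fraction: float) -> float:
    """Effective USD cost of each non-idle minute on a flat-rate box."""
    return monthly_cost_usd / busy_minutes(idle_fraction)


if __name__ == "__main__":
    included_minutes = 1000  # the hosted allowance mentioned upthread
    for idle in (0.5, 0.9, 0.99):
        minutes = busy_minutes(idle)
        cost = cost_per_busy_minute(10.0, idle)
        print(f"idle {idle:.0%}: {minutes:,.0f} busy min "
              f"({minutes / included_minutes:.1f}x the allowance), ${cost:.5f}/min")
```

Even at a much higher idle share than 50%, the flat-rate box can leave more usable minutes than the hosted allowance, which is the point about not chasing utilization.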
- I have Gitea and Gitea Runner apps running on TrueNAS on an extra mini PC in a closet. Works better than GitHub for me.
- I self-host Drone CI still. I think Harness is in the slow process of letting it rot (it still gets at least some updates, though), which is kind of a shame, but it still does just fine for my CI needs (solo usage as well).
https://docs.drone.io/server/provider/github/
Very easy to stand up, does just fine. Definitely doesn't have the "library" of prebuilt actions that GHA does, but for the most part... I consider that a plus.
Otherwise it's very similar in concept - define actions in a yaml file, run commands on an image, webhook integration with most repo providers.
I run it on some old hardware locally (k3s cluster on old machines) and it outperforms the 1000 minutes from GHA easily, and costs basically nothing but some maintenance and time.
I've been keeping my eyes open for something new in this space since Harness bought it, though - so if other folks have recommendations I'd be interested in alternatives.
- We use jayporeci.in and have never hit capacity problems, since we can choose to run it on laptops / servers / spare VMs etc.
Best decision we ever made
- There is a free tier for on-prem TeamCity.
- Host your own GitLab runner?
- [flagged]
- [flagged]
- SlopHub if anything, their parent corporation is Microslop so it fits
- [flagged]
- [flagged]
- You didn’t bring down GitHub. You brought down the thread.
There’s a difference between criticizing a company and just swapping names for insults. The former can be useful, the latter just turns the discussion into noise. If you’ve got a point about Copilot or the review feature, make it. Otherwise it’s hard to see what anyone is supposed to take away from “ShitHub” other than childish shit-posting.
- I really don't understand your point. You're saying we're turning discussions into noise while continuing to spam comments that are not even related to the topic. Hard to take this seriously as anything other than bullying because you get triggered by a word. I concede there's no point replying to you anymore, although I tried to reason about your meanness.
- [dead]