- > We've been running Code Review internally for months: on large PRs (over 1,000 lines changed), 84% get findings, averaging 7.5 issues. On small PRs under 50 lines, that drops to 31%, averaging 0.5 issues. Engineers largely agree with what it surfaces: less than 1% of findings are marked incorrect.
So the takeaway would be that 84% of heavily Claude-driven PRs are riddled with ~7.5 bug-worthy issues each.
Not a great ad for agent-based development quality.
- Interesting: "Reviews are billed on token usage and generally average $15–25, scaling with PR size and complexity."
- This cost seems wild. For comparison, GitHub Copilot Code Review costs four cents per review once you're outside the credits included with your subscription.
- Same thoughts.
For comparison, Greptile charges $30 per month for 50 reviews, with $1 per additional review.
At an average of $15–25 per review, this is way more expensive.
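A rough back-of-the-envelope comparison using the prices quoted in this thread (the 40-reviews-per-month volume is an assumed figure, and using $20 as the midpoint of Claude's $15–25 range is my own choice):

```python
# Rough monthly per-developer cost comparison. Prices are from the thread;
# the review volume and the $20 Claude midpoint are assumptions.

REVIEWS_PER_MONTH = 40  # assumption: roughly 2 PRs per working day

def copilot_cost(n: int) -> float:
    """GitHub Copilot Code Review: $0.04 per review beyond included credits."""
    return n * 0.04

def greptile_cost(n: int) -> float:
    """Greptile: $30/month covers 50 reviews, then $1 per additional review."""
    return 30.0 + max(0, n - 50) * 1.0

def claude_cost(n: int, per_review: float = 20.0) -> float:
    """Claude Code Review: token-billed, averaging $15-25; assume $20 midpoint."""
    return n * per_review

for name, fn in [("Copilot", copilot_cost),
                 ("Greptile", greptile_cost),
                 ("Claude", claude_cost)]:
    print(f"{name}: ${fn(REVIEWS_PER_MONTH):.2f}/month")
```

Even at a modest review volume, the token-billed pricing lands two to three orders of magnitude above the flat-rate competitors, which is what the "thousands per developer" reply below is reacting to.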
- Average _per review_? Insane costs, that's potentially thousands per developer. Am I missing something?
- I haven't used it so I'm just spitballing, but surely it depends on the quality of the review? If it picks up lots of issues and prevents downtime, then it could work out as worthwhile. What would it cost an engineer with deep knowledge of the codebase to do a similar job? You could spend an hour really digging into a PR, poking around, testing stuff out, etc. I'm guessing most engineers are paid more than $15–25/hr, not to mention the opportunity cost.
- At those prices I wonder if, besides catching bugs, it also reviews the design for performance inefficiencies or poor decomposition into maintainable units.
Also, the examples are weird IMO. Unless it was an edge/corner case, the authentication bug would be caught by even a smoke test. And for the ZFS encryption refactor I'd expect a statically typed language to catch type errors unless they're casting from `void*` or something. Seems like they picked examples by how important/newsworthy the areas were rather than by the technicality of the finds.
- Wait, what? So if I'm a paying Max user, I'd still have to pay more? I don't see the value. I'd rather have a repo skill that does the code review with existing Claude Max tokens.
- Does AI review of AI generated code even make sense?
- > Reviews are billed on token usage and generally average $15–25, scaling with PR size and complexity.
You've got to be completely insane to use AI coding tools at this point.
This is the subsidised cost to get users to use it, it could trivially end up ten times this amount. Plus, you've got the ultimate perverse incentive where the company that is selling you the model time to create the PRs is also selling you the review of the same PR.
- What are the implications for the dozens of code review platforms that have recently raised at sky-high valuations?
- Same as all the other companies that built on top of the API and then were obsoleted after the API provider made it a built-in feature.
https://finance.yahoo.com/news/claude-just-killed-startup-sf...
- bitter lesson applied to platforms
- I'm guessing people need to quickly realize Claude is a platform.
- Nice, but why is this not a system prompt? What's the value-add here?
- You're paying the same token rate for this as you would if it was just a system prompt. Clearly the scaffolding adds something.
(They mention their GitHub Action, which seems more like a system prompt.)
- Seems like a very small value-add. Why is this a blog post? I could do this myself.
- Does this only work with GitHub Actions? What about DevOps and GitLab?