AI-generated code contains more bugs and errors than human output
AI-generated code produces 1.7x more issues than human code
- The average AI-generated pull request has 10.83 issues compared with 6.45 for human code, report claims
- AI code quality is better in some areas, such as typos, leaving human reviewers free to focus on bigger problems
- Microsoft code patches are up, but probably so is overall output
AI-generated code is actually prone to more vulnerabilities than human-generated code, raising questions over the reliability of some tools, new data from CodeRabbit has claimed.
Pull requests made with AI tools had an average of 10.83 issues, compared with 6.45 issues in human-generated pull requests, which is ultimately leading to longer reviews and the potential for more bugs to make it through to the finished product.
Besides having 1.7x more issues in general, AI-generated pull requests also had 1.4x more critical issues and 1.7x more major issues, so they're not just minor niggles.
AI-generated code isn't as secure as you might think
Logic and correctness errors (1.75x), code quality and maintainability issues (1.64x), security flaws (1.57x) and performance problems (1.42x) were all more common in AI-generated code, with the report criticizing AI for introducing more serious bugs for human reviewers to fix.
Some of the issues AI was most likely to introduce include improper password handling, insecure object references, XSS vulnerabilities and insecure deserialization.
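To make that last category concrete, here is a minimal Python sketch (not taken from the report, with hypothetical function names) of the kind of insecure deserialization pattern reviewers are expected to catch, next to a safer alternative:

```python
import json
import pickle

# Risky pattern: unpickling untrusted bytes lets an attacker execute
# arbitrary code during deserialization - the kind of flaw flagged in reviews.
def load_profile_unsafe(raw_bytes: bytes):
    return pickle.loads(raw_bytes)  # insecure if raw_bytes comes from a user

# Safer pattern: parse a data-only format and validate the expected shape.
def load_profile_safe(raw_text: str) -> dict:
    data = json.loads(raw_text)
    if not isinstance(data, dict) or "username" not in data:
        raise ValueError("unexpected profile payload")
    return data
```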
"AI coding tools dramatically increase output, but they also introduce predictable, measurable weaknesses that organizations must actively mitigate," CodeRabbit AI Director David Loker commented.
However, it's not all bad news, with AI improving efficiency across the initial stages of code generation. AI-generated pull requests also contained 1.76x fewer spelling errors and 1.32x fewer testability issues.
So although the study does highlight some of AI's flaws, it also serves the important purpose of demonstrating how humans and AI agents could interact with each other in the future. Rather than displacing human workers, we're seeing human work shift into AI management and reviewing – computers are just handling some of the tedious tasks that slow humans down in the first place.
Although Microsoft claims to have patched 1,139 CVEs in 2025, the second-highest annual total on record, that doesn't necessarily point to worsening quality. With AI, developers are producing more code to begin with, so the proportion of flawed code may not be as bad as those figures initially suggest.
Then there's the fact that AI models, like OpenAI's GPT family, are constantly being improved to produce more accurate, less flawed results.
