Research shows that AI-generated code is remarkably insecure. Yet experts tell CyberScoop it's up to industry to figure out a way to limit the issues the technology introduces.
Software powers the world, and soon, the bulk of the work making it may be done by machines.
As generative AI tools have gotten more proficient at coding, their use in software development has exploded. Proponents say the tools have made it dramatically easier for individual entrepreneurs or companies to create the kind of slick, professional-looking websites and applications that used to be achievable only by multimillion-dollar enterprises, and at a fraction of the cost.
But in many cases, this AI-directed approach to software development appears to come with real security tradeoffs.
“Vibe coding” is loosely defined as putting one’s trust in an AI model’s ability to develop software correctly. Essentially, it means the developer “forgets that the code even exists,” leaving the bulk of the work to AI while the human focuses on more abstract, higher-level problem-solving.
Experts remain deeply concerned about the cybersecurity weaknesses inherent in vibe coding. Yet nearly everyone CyberScoop spoke with for this story agreed on one thing: regardless of their feelings on the wisdom of the practice, software that is partially or entirely generated by AI is not going anywhere.
LLM tools are easy to use and so widely dispersed that security concerns are unlikely to slow down momentum when the technology allows for users with little technical background to build entire websites or applications with a few prompts. Casey Ellis, founder and advisor for Bugcrowd, called the broad adoption of vibe coding “inevitable,” even as he acknowledges the potential security pitfalls.
“To me, that’s just sort of the march of technology, full stop,” Ellis said. “I do think it’s fundamentally a good thing because … it gives more people access to be in a position where they can build stuff and the more ideas that are off the leash, the better for everyone, right?”
The flip side, he said, is that “speed is the natural enemy of quality and security and scalability” in software development.
He also posited that software built before the advent of generative AI, in an industry where workers routinely shoulder heavy workloads and unrealistic deadlines, has not exactly proven that code written solely by humans automatically leads to more secure software.
While secure coding practices are important, the total software attack surface is a probabilistic function of the number of lines of code that exist in the world. What LLMs do really well, Ellis noted, is “help people generate lots of lines of code very quickly.”
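One rough way to see that point, using a toy model that is an illustrative assumption rather than anything Ellis or the researchers cited here published: if every line of code carries some small, independent chance of introducing a flaw, the probability that a codebase contains at least one vulnerability climbs steeply with its size.

```python
# Toy model, illustrative assumption only: each line of code independently has a
# small probability p of introducing a vulnerability, so a codebase of n lines
# has a 1 - (1 - p)^n chance of containing at least one flaw.

def prob_at_least_one_flaw(lines: int, p_per_line: float = 1e-4) -> float:
    """Probability that a codebase of `lines` lines contains at least one vulnerability."""
    return 1.0 - (1.0 - p_per_line) ** lines

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} lines -> {prob_at_least_one_flaw(n):.1%} chance of at least one flaw")
```

Under that assumption, tools that multiply how much code gets written also multiply the expected number of flaws, regardless of who, or what, wrote each line.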
Many sources CyberScoop spoke with saw a difference between using an AI coding assistant during the development process and vibe coding.
The former is fast becoming standard practice in the software development world. A 2024 GitHub survey of 2,000 coders in four countries found that 97% reported using AI coding tools in their work.
Jack Cable, who left the Cybersecurity and Infrastructure Security Agency earlier this year to form a startup called Corridor focused on adding additional security layers to AI-coded applications, told CyberScoop that he has “definitely” seen an uptick over the past year in both AI-assisted and vibe-coded projects. That’s particularly true in the startup and tech worlds, where he said people tend to be more “AI forward.”
“There’s the hobbyists who are using some of these tools to generate websites without needing to know how to code at all … and then there’s more AI-development tools like Cursor, Copilot and some others that are more what companies are adopting,” he said.
In the GitHub survey, quality of AI-generated code was cited as one of the primary benefits, alongside efficiency, simplicity and the ability to leverage unfamiliar coding languages.
Microsoft has said that at least 50,000 organizations and more than 1 million developers have used GitHub Copilot. Other tech heavyweights like OpenAI, Google and Amazon have rolled out their own coding models, while smaller companies like Cursor, Bolt, Lovable and others have filled the marketplace with lower-cost generative AI coding software.
Despite concerns about security in other quarters, there was a nearly universal expectation among coder respondents in the GitHub survey (99-100%) that AI adoption would lead to more secure software overall.
But those sentiments contrast with other research and data that have found major security problems in LLM-generated code, along with anecdotal stories of projects that relied almost entirely on AI-generated code, only to see their websites and apps quickly compromised through obvious, low-level vulnerabilities.
For one, independent security researchers have challenged studies by Microsoft and GitHub showing significant improvements in code quality. Dan Cîmpianu, a software developer and skeptic of generative AI’s coding capabilities, noted that the research appears to have tilted the scales in Copilot’s favor by basing test results on a task — writing API endpoints for a web server — that he called “one of the most boring, repetitive, uninspired, and cognitively unchallenged aspects of development.”
That is just one of several examples of what Cîmpianu characterized as significant inaccuracies and exaggerations of AI-coding proficiency in the study, including instances where graphs in the GitHub study, likely themselves LLM-generated, failed to add up to 100%.
Notably, the enthusiasm to integrate more AI tools into the workflow, and the belief that doing so will make security easier, is coming from executive leadership far more than from security practitioners, who remain deeply skeptical, according to a study Exabeam released in April.
But others have found a more muted impact. At the 2024 Black Hat cybersecurity conference, Veracode Chief Technology Officer Chris Wysopal told TechTarget that 41% of AI-generated code contained security vulnerabilities, on par with most human-generated code.
Ellis noted that the history of coding is one of perpetual technological innovation that makes the process easier and more accessible, from the early days of assembly and machine code to Grace Hopper’s development of the first compiler. Viewed through this lens, LLM-assisted coding is just the latest step in lowering the barrier to entry.
Cable said that while he doesn’t think AI coding assistants “are going to put software developers out of work,” he does believe “there is a lot of potential here to move to a future where code is more secure by default.”
“But I do think there’s real work needed to get there,” he said.
The problem is that current AI coding technologies are nowhere near ready to handle such significant software development responsibility. BaxBench, a benchmark created by a group of developers to evaluate LLM-generated code, has found that nearly all major commercial and open-source models available today are deeply unreliable when it comes to producing code that is safe and ready for deployment.
For example, 62% of the output from top LLMs was either incorrect or contained a security vulnerability, and roughly half of the generated code that did function correctly still contained exploitable flaws.
Even prompting the models with more specific security instructions, developed by “an unrealistic oracle that anticipates all security pitfalls,” had a limited impact on the problem.
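BaxBench’s actual prompts are not reproduced here, but the general shape of “security-reminder” prompting looks something like the sketch below; the wording and helper name are illustrative assumptions, not the benchmark’s methodology.

```python
# Illustrative sketch only; this is not BaxBench's prompt set or harness. It just
# shows what prepending explicit security instructions to a code-generation
# request looks like in practice.

SECURITY_PREAMBLE = (
    "You are generating production code. Validate all inputs, use parameterized "
    "SQL queries, never hard-code secrets, and avoid unsafe deserialization."
)

def build_secure_prompt(task: str) -> str:
    """Wrap a coding task with an explicit security reminder before sending it to an LLM."""
    return f"{SECURITY_PREAMBLE}\n\nTask: {task}"

print(build_secure_prompt("Write a Flask endpoint that looks up a user by email."))
```

Per the benchmark’s findings, even far more exhaustive instructions than this had only a limited effect.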
People charged with overseeing and managing LLM-generated software projects do not fare much better.
Because LLMs are trained on vast repositories of human-generated code, they tend to reproduce many of the same vulnerabilities in their own products. Ellis said one meaningful difference between vibe-coded and human-generated applications is “you get a lot of artifacts that are left around and potentially vulnerable, but also can definitely inform someone that is doing a code-informed attack.”
In other words, LLMs can sometimes architect or structure their software code in ways that most humans wouldn’t, leaving them open to new kinds of attacks.
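A hypothetical example, not drawn from any specific vibe-coded project, of both failure modes: a lookup endpoint that builds SQL by string interpolation, a pattern LLMs frequently reproduce because it is so common in training data, plus a leftover debug route of the kind Ellis describes as an artifact that helps an attacker map the application.

```python
# Hypothetical illustration, not taken from any real project or model output.
import sqlite3
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/user")
def get_user():
    email = request.args.get("email", "")
    conn = sqlite3.connect("app.db")
    # VULNERABLE: building SQL with string interpolation allows injection.
    # The safe form is a parameterized query:
    #   conn.execute("SELECT id, name FROM users WHERE email = ?", (email,))
    row = conn.execute(f"SELECT id, name FROM users WHERE email = '{email}'").fetchone()
    return jsonify({"user": row})

# Leftover "artifact": a debug route that dumps configuration, the kind of
# forgotten scaffolding that tells a code-reading attacker how the app is built.
@app.route("/debug/config")
def debug_config():
    return jsonify({key: str(value) for key, value in app.config.items()})
```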
A hackathon in Poland this past April hosted 40 teams that built agentic AI software, which was then evaluated for security flaws in its code and workflows. The vast majority (80%) shipped their finished applications without adding any security protections beyond what was already built into the LLM guardrails. Some teams declined to implement even simple OpenAI guardrail agents that could spot potential vulnerabilities or weaknesses, because doing so made the LLM less accurate and blocked too many actions.
This may be “leading many teams to intentionally deprioritize security in favor of smoother user experience or faster prototyping,” wrote Dorian Granoša and Ante Gojsalić of SplxAI, which sponsored the event and provided its agentic radar tool to evaluate the teams’ proposals.
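The guardrail idea the organizers describe can be sketched as a pre-execution check: before an agent runs a proposed action, a second model call vets it and risky actions are blocked. The sketch below is an assumption-laden illustration; the model name, prompt, and blocking policy are placeholders, not OpenAI’s guardrail agents or SplxAI’s agentic radar.

```python
# Minimal pre-execution guardrail sketch. The model choice, prompt, and policy
# are illustrative assumptions, not any vendor's actual implementation.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def action_looks_safe(proposed_action: str) -> bool:
    """Ask a classifier model whether an agent's proposed action appears risky."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name, for illustration
        messages=[
            {"role": "system",
             "content": "Answer only SAFE or UNSAFE. Flag actions that delete data, "
                        "exfiltrate secrets, disable security controls, or run "
                        "untrusted shell commands."},
            {"role": "user", "content": proposed_action},
        ],
    )
    return resp.choices[0].message.content.strip().upper().startswith("SAFE")

proposed_action = "DROP TABLE users;"
if action_looks_safe(proposed_action):
    print("executing:", proposed_action)
else:
    print("blocked by guardrail:", proposed_action)
```

Even a check this small shows the tradeoff the hackathon teams cited: every extra model call adds latency, and an over-aggressive classifier blocks legitimate actions.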
Proponents of a vibe-coded future often argue that, when used right, AI tools can significantly enhance the programming experience, including on security. The concern is that the number of careful users is probably far smaller than the number of people who will use these tools without a thought for security.
Cable believes the ultimate solution to a world filled with both AI- and human-generated bugs is to update software development and security tools for this new reality. He argues that existing security tools will not keep pace with AI-generated code, and that many human software developers lack the training needed to write secure code on their own.
“This is something that I think these development tools really ought to prioritize as well, and make sure that as they are generating code, [they are] assuming that the user isn’t very technical, doesn’t have the resources to catch bugs and adding more security guardrails in place,” he said.
Source: CyberScoop
Source Link: https://cyberscoop.com/vibe-coding-ai-cybersecurity-llm/