Snyk has released new data suggesting that GitHub Copilot may be unintentionally introducing security problems into customers’ code.
Snyk explained in a blog post on Thursday that generative AI-powered coding assistants such as GitHub Copilot, which use large language models to provide code completions to development teams, only mimic patterns they have learned from training data and have a limited grasp of the software they work on.
“It’s important to recognize that generative AI coding assistants, like Copilot, don’t understand code semantics and, as a result, cannot judge it,” Randall Degges, head of developer relations and community at Snyk, wrote in the blog post. “Essentially, the tool mimics code that it previously saw during its training.”
Because of this, as developers use these tools more and more frequently, the tools can replicate security flaws from open-source projects and from customers’ existing codebases. Degges said that although coding assistants can greatly benefit developers by saving time and boosting productivity, they carry considerable risk.
“Put simply, when Copilot suggests code, it may inadvertently replicate existing security vulnerabilities and bad practices present in neighboring files,” he said. “This can lead to insecure coding practices and open the door to a range of security vulnerabilities.”
The blog post included examples of how GitHub Copilot’s neighboring tabs functionality draws context from files open in the developer’s integrated development environment. When the Snyk researchers asked GitHub Copilot to generate SQL queries based on user input, it initially returned well-written, secure code.
The researchers then added an insecure code snippet to a neighboring tab and reran the same request for a new SQL query. This time, GitHub Copilot’s suggestion mirrored the insecure code. “We’ve just gone from one SQL injection in our project to two because Copilot has used our vulnerable code as context to learn from,” Degges wrote.
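Snyk’s post walks through the scenario with project-specific code; the contrast it describes maps roughly to the sketch below. The example uses Python’s built-in sqlite3 module and a hypothetical users table purely for illustration: the first function uses a parameterized query, while the second builds the SQL string from raw user input, the kind of pattern Copilot can pick up from a vulnerable file open in a neighboring tab.

```python
import sqlite3

# Hypothetical in-memory database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice'), ('bob')")

def find_user_safe(name: str):
    # Parameterized query: the driver handles escaping, so user input
    # cannot change the structure of the SQL statement.
    return conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()

def find_user_vulnerable(name: str):
    # String interpolation: user input becomes part of the SQL text.
    # A crafted value can rewrite the query (classic SQL injection).
    return conn.execute(f"SELECT * FROM users WHERE name = '{name}'").fetchall()

print(find_user_safe("alice"))              # [(1, 'alice')]
print(find_user_vulnerable("' OR '1'='1"))  # returns every row in the table
```

A human reviewer or a SAST tool would flag the second function; Copilot, working only from statistical patterns in nearby files, treats both styles as equally valid completions.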
On the company website, GitHub Copilot is advertised as a solution that helps “improve code quality and security.” Last week, GitHub introduced an updated vulnerability detection method for the AI coding assistant aimed at improving the security of the tool’s code recommendations.
According to Degges’ blog post, the more secure a customer’s existing codebase is, the less likely GitHub Copilot is to suggest vulnerable code. Conversely, the tool can make a customer’s software even less secure by amplifying security debt that already exists in it.
Snyk advised development teams to put SAST rules and guidelines in place to find and fix such problems, and to manually review the code produced by tools like GitHub Copilot.
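A full SAST product such as Snyk Code performs data-flow analysis across a project, but a toy guardrail in the spirit of that recommendation could be as simple as a script run in CI that fails the build when it sees SQL statements built from formatted strings. The script below is only an illustration of the idea, not Snyk’s tooling; the pattern it looks for is the same one from the Copilot example above.

```python
import re
import sys
from pathlib import Path

# Toy check only: real SAST tools use full data-flow analysis, not a regex.
# Flags execute() calls whose SQL is built with f-strings, concatenation,
# or %-formatting instead of parameterized placeholders.
SUSPICIOUS_SQL = re.compile(r"""execute\(\s*(f["']|["'].*["']\s*(\+|%))""")

def scan(paths):
    findings = []
    for path in paths:
        lines = Path(path).read_text(errors="ignore").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if SUSPICIOUS_SQL.search(line):
                findings.append(f"{path}:{lineno}: SQL built from a formatted string")
    return findings

if __name__ == "__main__":
    issues = scan(sys.argv[1:])
    print("\n".join(issues) or "no findings")
    sys.exit(1 if issues else 0)  # nonzero exit fails the CI gate
```

Gating merges on a check like this, alongside manual review, catches a replicated injection before it lands, whether a human or an assistant wrote it.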
In an interview with TechTarget Editorial, Degges said he uses GitHub Copilot himself and finds it a useful tool. However, because such tools lack context, which is crucial for secure software development, he recommended that organizations implement reviews and controls for AI-generated code. “AI coding assistants are amazing, but they have the exact same problems normal human developers have,” he said.
Degges added that, based on his observations, most developers are likely unaware of how easily AI coding assistants can duplicate security flaws from open-source projects and from users’ own codebases.
“The truth is that large language models today and the AI explosion of the last year and a half are built around generative AI, and in those scenarios, the responses are based on a statistical model,” he stated. “In this instance, the real code is not known at all. Everything relies on probability.”
TechTarget Editorial received the following statement from a GitHub representative:
Everyone is accountable for security, and GitHub encourages outside research examining the effects of AI tools, such as GitHub Copilot, on software development. We work hard to support our communities in creating and using safe and secure code, from providing free tools like Dependabot, to mandating 2FA for all GitHub contributors, to providing AI and security overview capabilities to GitHub Advanced Security users.
Teams must use safeguards at several stages of the software development life cycle (SDLC) to release secure software, from code reviews by skilled engineers to in-editor assistive technologies like GitHub Copilot. It is important to find, fix, and where possible mitigate a vulnerability with code analysis tools like GitHub Advanced Security before it is released into production. Teams cannot and should not rely on a single tool to ensure the security of their software, regardless of the technology employed.
GitHub Copilot uses a range of security techniques in the code editor to remove sensitive data from code, block insecure coding practices, and identify weak spots in partial code segments. More specifically, GitHub Copilot uses an AI-based vulnerability prevention system that blocks insecure coding patterns in real time to make its suggestions more secure. Our approach targets the most common vulnerable coding patterns, such as hardcoded credentials, SQL injections, and path injections. Paired with GitHub Advanced Security’s code scanning, secret scanning, and dependency management features, this gives developers an end-to-end software security experience.
Developers should always use good judgment and care when writing code, whether they are copying and pasting from a nearby project file, writing code by hand, or evaluating a GitHub Copilot suggestion. Our tests have shown that GitHub Copilot suggests code that is as good as or better than what the typical developer writes. We cannot, however, guarantee that every suggestion is free of errors; like any programmer, GitHub Copilot may occasionally suggest insecure code. Linting, code scanning, IP scanning, and the other security measures you use with the code your engineers write should be applied to Copilot’s suggestions as well.