Google said it discovered a zero-day vulnerability in the open-source SQLite database engine using a Large Language Model (LLM)-assisted framework called Big Sleep (formerly Project Naptime).
The tech giant described the development as the “first real-world vulnerability” discovered using an artificial intelligence (AI) agent.
“We believe this is the first public example of an AI agent finding a previously unknown exploitable memory-safety issue in widely used real-world software,” the Big Sleep team said in a blog post shared with The Hacker News.
The vulnerability in question is a stack buffer underflow in SQLite, which occurs when a piece of software references a memory location prior to the beginning of the memory buffer, resulting in a crash or arbitrary code execution.
“This typically occurs when a pointer or its index is decremented to a position before the buffer, when pointer arithmetic results in a position before the beginning of the valid memory location, or when a negative index is used,” according to a Common Weakness Enumeration (CWE) description of the bug class.
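To make the bug class concrete, the following is a minimal, hypothetical Python sketch (using ctypes; it is not SQLite’s code) of the negative-index case the CWE description mentions, where the computed address lands before the start of a buffer:

```python
import ctypes

# Hypothetical illustration of a buffer underflow, not SQLite's actual code:
# a negative index moves the effective address before the start of the
# buffer, so the read lands in whatever memory happens to sit below it.
buf = (ctypes.c_ubyte * 16)()   # a 16-byte buffer, zero-initialized
index = -4                      # negative index, as in the CWE description

# Equivalent to buf[index] in C with index < 0: the computed address is
# four bytes before the buffer's first element.
addr = ctypes.addressof(buf) + index
value = ctypes.c_ubyte.from_address(addr).value  # out-of-bounds read

print(f"byte read from buf[{index}]: {value:#04x}")
```

In native code, the same access is undefined behavior: depending on what lies below the buffer, it can leak or corrupt adjacent data, crash the process, or, in the worst case, be leveraged for arbitrary code execution.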
Following responsible disclosure, the flaw was addressed in early October 2024. It’s worth noting that the flaw was discovered in the library’s development branch, meaning it was flagged before it made it into an official release.
Project Naptime was first detailed by Google in June 2024 as a technical framework for improving automated vulnerability discovery approaches. It has since evolved into Big Sleep as part of a broader collaboration between Google Project Zero and Google DeepMind.
The idea behind Big Sleep is to use an AI agent to simulate human behavior when identifying and demonstrating security vulnerabilities, taking advantage of an LLM’s code comprehension and reasoning abilities.
This entails using a suite of specialized tools that allow the agent to navigate the target codebase, run Python scripts in a sandboxed environment to generate inputs for fuzzing, debug the program, and observe the results.
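Google has not published Big Sleep’s actual tooling, but as a rough, hypothetical sketch of the input-generation step, a sandboxed Python script along these lines could assemble randomized SQL statements and feed them to SQLite, treating parser errors as noise and watching for hard crashes (all names here are illustrative):

```python
import random
import sqlite3

def random_sql() -> str:
    """Assemble a randomized, loosely SQL-shaped statement (hypothetical generator)."""
    templates = [
        "SELECT {e} FROM {t} WHERE {e2};",
        "INSERT INTO {t} VALUES ({e});",
        "CREATE INDEX IF NOT EXISTS i1 ON {t}({e});",
    ]
    exprs = ["x", "-1", "NULL", "x || x", "abs(x)", "rowid"]
    template = random.choice(templates)
    return template.format(t=random.choice(["t1", "t2"]),
                           e=random.choice(exprs),
                           e2=random.choice(exprs))

con = sqlite3.connect(":memory:")
con.executescript("CREATE TABLE t1(x); CREATE TABLE t2(x);")

for _ in range(1000):
    stmt = random_sql()
    try:
        con.execute(stmt)
    except sqlite3.Error:
        pass  # syntax/semantic errors are expected noise; a hard crash is the signal
```

Per Google’s description, the agent goes beyond a blind loop like this: it writes and revises such scripts itself, and it also debugs the program and inspects the results of each run rather than merely observing whether a crash occurred.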
“We believe that this work has enormous defensive potential. Finding vulnerabilities in software before it’s even released means that attackers have no scope to compete: vulnerabilities are fixed before attackers have a chance to exploit them,” Google said.
The company, however, also emphasized that these are still experimental results, adding that “the Big Sleep team’s position is that, at present, a target-specific fuzzer would likely be at least as effective (at finding vulnerabilities).”