Cybersecurity researchers have discovered that large language models (LLMs) can be used to generate new variants of malicious JavaScript code at scale in ways that better evade detection.
“Although LLMs struggle to create malware from scratch, criminals can easily use them to rewrite or obfuscate existing malware, making it harder to detect,” Palo Alto Networks Unit 42 researchers said in a new analysis. “Criminals can prompt LLMs to perform transformations that are much more natural-looking, which makes detecting this malware more challenging.”
With enough transformations over time, this approach can have the advantage of degrading the performance of malware classification systems, tricking them into believing that a piece of malicious code is actually benign.
While LLM providers are increasingly building guardrails to prevent them from going off the rails and producing unintended output, bad actors have been touting tools like WormGPT as a way to automate the process of crafting convincing phishing emails tailored to prospective targets and even to create novel malware.
Back in October 2024, OpenAI disclosed that it had blocked more than 20 operations and deceptive networks that attempted to use its platform for reconnaissance, vulnerability research, scripting support, and debugging.
Unit 42 said it harnessed the capabilities of LLMs to iteratively rewrite existing malware samples with the aim of evading detection by machine learning (ML) models such as Innocent Until Proven Guilty (IUPG) and PhishingJS, effectively paving the way for the creation of 10,000 novel JavaScript variants without altering the functionality.
The adversarial machine learning technique is designed to transform the malware using a variety of methods, namely variable renaming, string splitting, junk code insertion, removal of unnecessary whitespace, and a complete reimplementation of the code, every time it is fed into the system as input.
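To make those transformations concrete, here is a minimal Python sketch that applies a few of them at the string level to a toy JavaScript snippet. The helper functions and the transformation set are illustrative assumptions, not Unit 42's tooling; a real pipeline would delegate the rewriting to an LLM rather than to regexes.

```python
# Illustrative source-level transformations of the kind described above,
# applied to a JavaScript snippet held as a Python string.
import random
import re

def rename_variables(js: str) -> str:
    """Rename declared variables to more natural-looking identifiers."""
    names = re.findall(r"\b(?:var|let|const)\s+(\w+)", js)
    for old in set(names):
        new = random.choice(["payload", "result", "tempValue"]) + str(random.randint(1, 99))
        js = re.sub(rf"\b{old}\b", new, js)
    return js

def split_strings(js: str) -> str:
    """Split string literals into concatenated halves."""
    def splitter(m):
        s = m.group(1)
        if len(s) < 4:
            return m.group(0)
        mid = len(s) // 2
        return f'"{s[:mid]}" + "{s[mid:]}"'
    return re.sub(r'"([^"\\]+)"', splitter, js)

def insert_junk_code(js: str) -> str:
    """Append dead code that never affects behavior."""
    return js + "\nif (false) { console.log('unreachable'); }"

def strip_whitespace(js: str) -> str:
    """Collapse redundant spaces and tabs."""
    return re.sub(r"[ \t]+", " ", js).strip()

sample = 'var secret = "hello world"; console.log(secret);'
for transform in (rename_variables, split_strings, insert_junk_code, strip_whitespace):
    sample = transform(sample)
print(sample)  # same behavior, different surface form
```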
“The end result is a new variant of the malicious JavaScript that maintains the same behavior as the original script, while almost always having a much lower malicious score,” the company said, adding that the greedy algorithm flipped its own malware classifier model’s verdict from malicious to benign 88% of the time.
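A minimal sketch of such a greedy loop, assuming hypothetical llm_rewrite() and classifier_score() helpers (an LLM call that returns a behavior-preserving rewrite, and an ML classifier that scores how malicious a script looks), could look like this; it is not Unit 42's actual implementation:

```python
# Greedy evasion loop (sketch): accept a candidate rewrite only if it
# lowers the classifier's malicious score, and stop once the score drops
# below the threshold at which the model would call the sample benign.
def greedy_evasion(js_source: str,
                   llm_rewrite,          # hypothetical: LLM-based rewriter
                   classifier_score,     # hypothetical: higher = more malicious
                   max_steps: int = 20,
                   benign_threshold: float = 0.5) -> str:
    best = js_source
    best_score = classifier_score(best)
    for _ in range(max_steps):
        candidate = llm_rewrite(best)            # behavior-preserving rewrite
        score = classifier_score(candidate)
        if score < best_score:                   # greedy: keep only improvements
            best, best_score = candidate, score
        if best_score < benign_threshold:        # classifier now says "benign"
            break
    return best
```

Because only score-lowering candidates are kept, the loop steadily drives a fixed classifier's verdict toward benign, which is what makes the greedy strategy effective.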
Worse, such rewritten JavaScript artifacts also evade detection by other malware scanners when uploaded to the VirusTotal platform.
Another important advantage offered by LLM-based obfuscation is that its many rewrites look much more natural than those achieved by libraries such as obfuscator.io, the latter of which are easier to reliably detect and fingerprint owing to the manner in which they introduce changes to the source code.
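As a rough illustration of why tool-based obfuscation is easier to fingerprint, obfuscator.io output typically contains telltale artifacts such as hex-style identifiers (for example _0x1a2b) and string-array lookups. The patterns below are assumptions about that typical output, not a production detector, and they vary with the tool's configuration:

```python
# Toy fingerprint check for tool-style obfuscation artifacts.
import re

OBFUSCATOR_HINTS = [
    re.compile(r"_0x[0-9a-f]{4,}"),   # hex-style identifiers commonly emitted
    re.compile(r"parseInt\(_0x"),     # idiom seen in some string-array rotation stubs
]

def looks_tool_obfuscated(js: str) -> bool:
    return any(pattern.search(js) for pattern in OBFUSCATOR_HINTS)

print(looks_tool_obfuscated("var _0x3f2a = ['log'];"))       # True
print(looks_tool_obfuscated("const total = items.length;"))  # False
```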
“The scale of new malicious code variants could increase with the help of generative AI,” Unit 42 said. “However, we can use the same tactics to rewrite malicious code to help generate training data that can improve the robustness of ML models.”
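The defensive flip side Unit 42 alludes to can be sketched as straightforward data augmentation. In the sketch below, generate_llm_variants() and train_classifier() are hypothetical placeholders for an LLM-driven rewriter and any ML training routine:

```python
# Harden a classifier by folding LLM-rewritten variants back into training data,
# labeling every behavior-preserving rewrite of a malicious sample as malicious.
def harden_classifier(malicious_samples, benign_samples,
                      generate_llm_variants, train_classifier,
                      variants_per_sample: int = 10):
    augmented = list(malicious_samples)
    for js in malicious_samples:
        augmented.extend(generate_llm_variants(js, n=variants_per_sample))
    labels = [1] * len(augmented) + [0] * len(benign_samples)
    return train_classifier(augmented + list(benign_samples), labels)
```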
The disclosure came as a team of researchers from North Carolina State University devised a side-channel attack called TPUXtract to carry out model-stealing attacks against Google Edge Tensor Processing Units (TPUs) with 99.91% accuracy. It could then be used to facilitate intellectual property theft or follow-on cyber attacks.
“Specifically, we show a hyperparameter stealing attack that can extract all layer configurations, including the layer type, number of nodes, kernel/filter sizes, number of filters, strides, padding, and activation function,” the researchers said. “Most notably, our attack is the first comprehensive attack that can extract previously unseen models.”
The black-box attack, at its core, captures electromagnetic signals emitted by the TPU while neural network inference is underway, a consequence of the computational intensity involved in running on-device ML models, and uses them to infer the model’s hyperparameters. However, it hinges on the adversary having physical access to the target device, not to mention possessing expensive equipment to probe and acquire the traces.
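At a very high level, this kind of extraction boils down to matching the electromagnetic trace captured for each layer against traces expected for candidate layer configurations. The sketch below uses a simple correlation score over synthetic traces; the signal model, candidate set, and metric are all assumptions, and TPUXtract's actual online template generation is considerably more sophisticated:

```python
# Toy template-matching sketch: pick the candidate layer configuration whose
# reference trace best correlates with the captured electromagnetic trace.
import numpy as np

def best_matching_config(captured_trace: np.ndarray, candidates: dict) -> str:
    """candidates maps a config description, e.g. 'conv 3x3, 64 filters, stride 1',
    to a reference trace of the same length as captured_trace."""
    scores = {config: np.corrcoef(captured_trace, reference)[0, 1]
              for config, reference in candidates.items()}
    return max(scores, key=scores.get)

# Synthetic traces standing in for real EM captures.
rng = np.random.default_rng(0)
reference_a = rng.normal(size=1000)
reference_b = rng.normal(size=1000)
captured = reference_a + 0.1 * rng.normal(size=1000)   # noisy copy of A
print(best_matching_config(captured, {
    "conv 3x3, 64 filters, stride 1": reference_a,
    "dense, 128 nodes, relu": reference_b,
}))
```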
“Because we stole the architecture and layer details, we were able to recreate the high-level features of the AI,” said Aydin Aysu, one of the authors of the study. “We then used that information to recreate a functional AI model, or a very close surrogate of that model.”