A fixed security vulnerability in the OpenAI ChatGPT app for macOS could have made it possible for attackers to install long-lived spyware in the memory of an artificial intelligence (AI) tool.
Methodology, duplicate SpAIwarecan be abused to facilitate “continuous hijacking of any information entered by the user or responses received by ChatGPT, including any future chat sessions,” security researcher Johan Rehberger said.
The problem, at its core, is abusing the named feature memorywhich OpenAI introduced earlier this February before rolling it out to ChatGPT Free, Plus, Team, and Enterprise users earlier this month.
Essentially, it allows ChatGPT to remember certain things in chats to save users from having to repeat the same information over and over again. Users also have the option to instruct the program to forget something.
“ChatGPT memories evolve with your interactions and are not tied to specific conversations,” says OpenAI. “Deleting a chat does not erase its memories; you must delete the memory itself.’
The attack technique is also based on preliminary findings which involve the use indirect operative injection manipulate memories to remember false information or even malicious instructions, achieving a form of persistence that persists between conversations.
“Because the malicious instructions are stored in ChatGPT’s memory, all new conversations will contain the attacker’s instructions and continuously send all chat messages and replies to the attacker,” Rehberger said.
“So the data theft vulnerability has become much more dangerous because it now occurs in chats.”
In a hypothetical attack scenario, a user could be tricked into visiting a malicious site or downloading a mined document, which is then analyzed using ChatGPT to refresh the memory.
A website or document may contain instructions to secretly send all future conversations to an adversary-controlled server, which can then be received by an attacker at the other end after a single chat session.
After responsible disclosure, OpenAI resolved the issue with ChatGPT version 1.2024.247 by closing the exfiltration vector.
“ChatGPT users should regularly review the memories the system stores about them for suspicious or incorrect memories and clean them up,” Rehberger said.
“This chain of attacks was very interesting to put together and demonstrates the dangers of automatically adding long-term storage to a system, both in terms of disinformation/fraud and in terms of continuous communication with attacker-controlled servers.”
The disclosure comes after a team of researchers discovered a new AI jailbreak technique, codenamed MathPrompt, that uses advanced Large Language Models (LLM) capabilities in symbolic mathematics to bypass their security mechanisms.
“MathPrompt uses a two-step process: it first transforms malicious natural language prompts into symbolic math problems and then presents these mathematically encoded prompts to the target LLM,” the researchers said. noted.
A study based on 13 state-of-the-art LLMs found that models respond with harmful outcomes an average of 73.6% of the time when they receive mathematically coded cues, as opposed to about 1% with unmodified harmful cues.
It also follows Microsoft debuting a new correction feature that, as the name suggests, allows you to correct AI results when inaccuracies (that is, hallucinations) are revealed..
“Building on our existing Groundedness Detection feature, this ground-breaking capability enables Azure AI Content Safety to both identify and correct hallucinations in real-time before users of generative AI applications experience them,” said the tech giant. said.