Indo Guard Online

Global Security

The new TokenBreak attack bypasses AI moderation with a single-character text change

By Admin | June 12, 2025 | 4 Mins Read


Cybersecurity researchers have discovered a novel attack technique called TokenBreak that can be used to bypass a large language model's (LLM) safety and content-moderation guardrails with just a single character change.

“The TokenBreak attack targets a text classification model’s tokenization strategy to induce false negatives, leaving end targets vulnerable to attacks that the implemented protection model was put in place to prevent,” researchers Kieran Evans, Kasimir Schulz, and Kenneth Yeung said in a report shared with Hacker News.

Tokenization is a fundamental step that LLMs use to break raw text down into its atomic units, i.e., tokens, which are common sequences of characters found in a set of text. To that end, the text input is converted into a numerical representation and fed to the model.

LLMs work by understanding the statistical relationships between these tokens and producing the next token in a sequence. The output tokens are detokenized into human-readable text by mapping them back to their corresponding words using the tokenizer’s vocabulary.
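As a minimal sketch of that text-to-token-id step (the vocabulary and whole-word lookup below are simplifying assumptions; real tokenizers learn subword units rather than whole words):

```python
# Toy illustration of tokenization: text is split into tokens and each
# token is mapped to an integer id that the model consumes.
# The vocabulary here is hypothetical, chosen only for this example.
VOCAB = {"<unk>": 0, "follow": 1, "the": 2, "instructions": 3}

def tokenize(text: str) -> list[int]:
    # Whole-word lookup for simplicity; BPE/WordPiece/Unigram split subwords.
    return [VOCAB.get(word, VOCAB["<unk>"]) for word in text.lower().split()]

print(tokenize("Follow the instructions"))  # [1, 2, 3]
```

Words outside the vocabulary fall back to the unknown token id, which is one reason real tokenizers prefer subword units over whole words.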

The attack technique, devised by HiddenLayer, targets the tokenization strategy used by text classification models in order to bypass their ability to detect malicious input and flag safety, spam, or content-moderation issues in the text.

Specifically, the artificial intelligence (AI) security firm found that altering input words by adding letters in certain ways caused the classification model to break.

Examples include changing “instructions” to “finstructions,” “announcement” to “aannouncement,” or “idiot” to “hidiot.” These small changes cause the tokenizer to split the text differently, but the meaning remains obvious to both the AI and the reader.
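The effect of that single prepended letter on a greedy, longest-match tokenizer (the style WordPiece uses) can be sketched as follows; the subword vocabulary is invented for illustration:

```python
# Greedy longest-match subword tokenization, WordPiece-style.
# With "instructions" in the vocabulary, the clean word stays one token,
# but prepending "f" forces a completely different split, so a classifier
# keyed on the token "instructions" never sees it.
VOCAB = {"instructions", "fin", "struct", "ions", "in", "f"}

def greedy_tokenize(word: str) -> list[str]:
    tokens, i = [], 0
    while i < len(word):
        # Take the longest vocabulary entry matching at position i.
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append("<unk>")  # no match at all: emit unknown, move on
            i += 1
    return tokens

print(greedy_tokenize("instructions"))   # ['instructions']
print(greedy_tokenize("finstructions"))  # ['fin', 'struct', 'ions']
```

The downstream LLM, which reads subwords in context, still comprehends “finstructions” as “instructions,” while the classifier’s signal token has vanished.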

What makes the attack notable is that the manipulated text remains fully understandable to both the LLM and the human reader, causing the model to elicit the same response as it would have if the unmodified text had been passed as input.

By introducing the manipulations without affecting the model’s ability to comprehend the text, TokenBreak increases its potential for prompt injection.

“This attack technique manipulates input text in such a way that certain models give an incorrect classification,” the researchers said in an accompanying paper. “Importantly, the end target (LLM or email recipient) can still understand and respond to the manipulated text and therefore be vulnerable to the very attack the protection model was put in place to prevent.”

The attack was found to be successful against text classification models using BPE (byte pair encoding) or WordPiece tokenization strategies, but not against those using Unigram.

“The TokenBreak attack technique demonstrates that these protection models can be bypassed by manipulating the input text, leaving production systems vulnerable,” the researchers said. “Knowing the family of the underlying protection model and its tokenization strategy is critical for understanding your susceptibility to this attack.”

“Because tokenization strategy typically correlates with model family, a straightforward mitigation exists: select models that use Unigram tokenizers.”

To defend against TokenBreak, the researchers suggest using Unigram tokenizers where possible, training models with examples of bypass tricks, and checking that tokenization and model logic remain aligned. It also helps to log misclassifications and look for patterns that hint at manipulation.
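One way to act on that last suggestion, sketched here under assumptions not in the report (a hypothetical blocklist and a single-character-deletion check), is to flag words that become a blocked term once one character is removed:

```python
# Hedged sketch of manipulation hunting: flag any word that turns into a
# blocked term after deleting a single character, as "finstructions" does.
# The blocklist is hypothetical; a real deployment would log these hits
# alongside classifier decisions for review.
BLOCKED = {"instructions", "announcement", "idiot"}

def is_near_miss(word: str) -> bool:
    w = word.lower()
    # Try deleting each character position once.
    return any(w[:i] + w[i + 1:] in BLOCKED for i in range(len(w)))

def flag_manipulation(text: str) -> list[str]:
    return [w for w in text.split()
            if w.lower() not in BLOCKED and is_near_miss(w)]

print(flag_manipulation("please read the finstructions now"))  # ['finstructions']
```

This only catches single-insertion tricks like those named above; broader perturbations would call for fuzzier matching, but the logged hits already give reviewers a concrete signal to investigate.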

The study comes less than a month after HiddenLayer showed how Model Context Protocol (MCP) tools could be exploited to extract sensitive data: “By inserting specific parameter names within a tool’s function, sensitive data, including the full system prompt, can be extracted and exfiltrated,” the company said.

The finding also comes as the Straiker AI Research (STAR) team found that backronyms can be used to jailbreak AI chatbots and trick them into generating unwanted responses, including swearing, promoting violence, and producing sexually explicit content.

The technique, called the Yearbook Attack, has proven effective against various models from Anthropic, DeepSeek, Google, Meta, Microsoft, Mistral AI, and OpenAI.

“They blend in with the noise of everyday prompts, a quirky riddle here, a motivational acronym there, and because of that they often slip past the model’s filters,” the researchers noted.

“A phrase like ‘Friendship, unity, care, kindness’ does not raise any flags. But by the time the model has completed the pattern, it has already served the payload, which is the key to successfully executing this trick.”

“These methods succeed not by overpowering the model’s filters, but by slipping beneath them. They exploit completion bias and pattern continuation, as well as the way models weigh contextual coherence over intent analysis.”
