
DeepMind reveals flaw in AI memories

A critical vulnerability in AI: 'extractable memorization' identified in ChatGPT

DeepMind has discovered a vulnerability in OpenAI's ChatGPT that can expose sensitive information memorized during training. By asking the model to repeat specific words indefinitely, researchers induced the AI to reveal personal data, NSFW content, and more. OpenAI has already taken steps to address the issue.
This pill is also available in Italian.

A recent discovery by DeepMind sheds light on a critical vulnerability in OpenAI's ChatGPT, called 'extractable memorization'. This flaw allows the language model to reveal portions of the data it was trained on, potentially exposing sensitive information. Through continuous repetition of otherwise innocuous words, researchers induced the AI to disclose segments accidentally memorized during its training, underlining a significant risk to user privacy.

Detailed analysis of ChatGPT behavior

The DeepMind researchers used an ingenious strategy: pestering the model with a single keyword repeated incessantly, for example "poem". ChatGPT's responses, initially on topic, eventually broke down and began emitting passages from its training data. To verify the leaks, the researchers assembled AUXDataSet, a corpus of nearly 10 terabytes of training data, which helped them identify exact matches between the model's outputs and known training texts.
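The verification step described above can be sketched in miniature. The real AUXDataSet pipeline operated over terabytes of text (the paper used efficient indexing structures for that scale); the toy version below, with invented example strings, just hashes fixed-length word windows of a small corpus and flags any window of the model's output that reappears verbatim:

```python
# Toy sketch of the matching step: flag model output that reproduces
# long verbatim spans from a reference corpus. The corpus and "leaked"
# output below are illustrative, not from the actual experiment.

def ngram_windows(text, n):
    """Yield every n-word window of `text` as a tuple."""
    words = text.split()
    for i in range(len(words) - n + 1):
        yield tuple(words[i:i + n])

def verbatim_matches(output, corpus_docs, n=8):
    """Return the n-word windows of `output` found verbatim in the corpus."""
    corpus_index = set()
    for doc in corpus_docs:
        corpus_index.update(ngram_windows(doc, n))
    return [w for w in ngram_windows(output, n) if w in corpus_index]

corpus = ["the quick brown fox jumps over the lazy dog near the river bank"]
leaked = "poem poem poem the quick brown fox jumps over the lazy dog near it"
print(verbatim_matches(leaked, corpus, n=8))
```

The window length matters: long windows (the paper's analyses used spans of dozens of tokens) make accidental overlap with common phrasing unlikely, so a hit is strong evidence of memorization rather than coincidence.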

Implications for data privacy and security

This security gap has serious consequences: 17% of the 15,000 sequences tested revealed personally identifiable information, pointing to a dangerous potential for abuse of confidential data. Among the outputs examined were excerpts from literary works, full-length poems, and not-safe-for-work (NSFW) content, even though the latter should be blocked from user interactions by the system's own safety rules.
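To put the reported rate in concrete terms, a back-of-the-envelope calculation from the figures above:

```python
# 17% of the 15,000 tested sequences contained personally
# identifiable information (figures as reported in the article).
tested_sequences = 15_000
pii_rate = 0.17
pii_sequences = int(tested_sequences * pii_rate)
print(pii_sequences)  # 2550 sequences exposing personal data
```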

OpenAI takes action against detected weaknesses

Following DeepMind's report to OpenAI on August 30, changes appear to have been made to mitigate the vulnerability in the affected systems. ChatGPT now shows a reduced willingness to repeat words indefinitely and issues clearer warnings about potential content-policy violations. The AI community faces a pressing need to overhaul its security practices, and this finding fuels critical evaluation of ethical alignment and privacy safeguards in AI models.


12/11/2023 09:47

Marco Verro

Last pills

Cloudflare repels the most powerful DDoS attack ever recorded
Advanced defense and global collaboration to tackle new challenges of DDoS attacks

Silent threats: the zero-click flaw that compromises RDP servers
Hidden risks in remote work: how to protect RDP servers from invisible attacks

Discovery of vulnerability in Secure Boot threatens device security
Flaw in the Secure Boot system requires urgent updates to prevent invisible intrusions

North Korean cyberattacks and laptop farming: threats to smart working
Adapting to new digital threats of remote work to protect vital data and infrastructures
