GNU social JP
GNU social JP is a Japanese GNU social server.

Conversation

Notices

  1. pettter (pettter@mastodon.acc.umu.se)'s status on Thursday, 30-Nov-2023 19:18:21 JST
    in reply to
    • Alexandre Dulaunoy

    @adulau Of course not. The responsible thing, which Google notably didn't do, would be to say "there's a way to extract PII and verbatim data from LLMs, including ChatGPT" back in August.

    In conversation Thursday, 30-Nov-2023 19:18:21 JST from mastodon.acc.umu.se
    • Alexandre Dulaunoy (adulau@infosec.exchange)'s status on Thursday, 30-Nov-2023 19:18:22 JST

      Extracting Training Data from ChatGPT

      I’m wondering if OpenAI requested a CVE for the disclosure of this vulnerability.

      #llm #llms #openai #vulnerability #chatgpt

      🔗 https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html

      🔗 https://arxiv.org/abs/2311.17035

      In conversation Thursday, 30-Nov-2023 19:18:22 JST

      Attachments

      1. Extracting Training Data from ChatGPT
      2. Scalable Extraction of Training Data from (Production) Language Models
        This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.

GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.