GNU social JP
  • FAQ
  • Login
GNU social JPは日本のGNU socialサーバーです。
Usage/ToS/admin/test/Pleroma FE
  • Public

    • Public
    • Network
    • Groups
    • Featured
    • Popular
    • People

Embed Notice

HTML Code

Corresponding Notice

  1. Embed this notice
    emenel (emenel@post.lurk.org)'s status on Thursday, 21-Mar-2024 14:07:03 JSTemenelemenel
    • Hugging Face

    If you had code on GitHub at any point it looks like it might be included in a large dataset called “The Stack” — If you want your code removed from this massive “ai” training data go here:

    https://huggingface.co/spaces/bigcode/in-the-stack

    I found two of my old Github repos in there. Both were deleted last year and both were private. This is a serious breach of trust by Github and @huggingface.

    Remove all your code from Github.

    CONSENT IS NOT OPT-OUT.

    Edit — thanks for all the replies. More context here: https://hachyderm.io/@joeyh/112105744123363587

    Also the repos i found of mine i’m sure were private, but even if they were public at some point, for a brief time, in the past that isn’t my consent to use them for purposes beyond their intent.

    ---
    Edit 2 -- I see this made it to HN, which is a level of attention I do not want nor appreciate....

    For all those wondering about the private repo issue -- No, I am not 100% sure that these ancient repos weren't at some point public for a split second before I changed it. I do know that they were never meant for this and that one of them didn't even contain any code.

    If my accidentally making a repo public for a moment just so happened to overlap with this scraping, then I guess that's possible. But it in no way invalidates the issues, and the anger that i feel about it.

    In conversationabout a year ago from post.lurk.orgpermalink

    Attachments

    1. No result found on File_thumbnail lookup.
      see shy jo (@joeyh@hachyderm.io)
      from see shy jo
      I am disappointed in Software Heritage. They made this statement on using their archive as an AI training dataset: https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code/?ref=openml.fyi These seem like good principles. But they are not actually sufficient to respect our work. And the third is too weak, and appears to be providing a figleaf for extractive behavior.
  • Help
  • About
  • FAQ
  • TOS
  • Privacy
  • Source
  • Version
  • Contact

GNU social JP is a social network, courtesy of GNU social JP管理人. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

Creative Commons Attribution 3.0 All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.