GNU social JP
GNU social JP is a Japanese GNU social server.
    talby (talby@techhub.social)'s status on Monday, 29-Jan-2024 14:20:29 JST
    in reply to Paul Cantrell

    @inthehands I think LLMs mostly use https://commoncrawl.org/ rather than crawling the web themselves. The Internet Archive's Wayback Machine uses Common Crawl as a source and has Wookieepedia so I think it's likely in there already.

    In conversation, Monday, 29-Jan-2024 14:20:29 JST, from techhub.social

    Attachments

    1. Common Crawl - Open Repository of Web Crawl Data
      We build and maintain an open repository of web crawl data that can be accessed and analyzed by anyone.
GNU social JP is a social network, courtesy of the GNU social JP administrator. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.

All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.