Embed Notice

HTML Code

<blockquote style="position: relative; padding-left: 55px;"><section><a href="https://mastodon.social/users/reedmideke/statuses/110850327950536299">Reed Mideke (reedmideke@mastodon.social)'s status on Wednesday, 12-Jun-2024 20:51:52 JST</a><a href="https://mastodon.social/@reedmideke" title="reedmideke@mastodon.social"><img src="https://gnusocial.jp/avatar/108668-48-20230320145057.webp" width="48" height="48" alt="Reed Mideke" style="position: absolute; left: 0; top: 0;">Reed Mideke</a><div><a href="https://mastodon.social/@reedmideke/110843993869541439" rel="in-reply-to">in reply to</a></div></section><article><p>So @Toke@helvede.net points out (<a href="https://mastodon.social/@Toke@helvede.net/110848880977610283" rel="nofollow noreferrer">https://mastodon.social/@Toke@helvede.net/110848880977610283</a>) that <a href="https://mastodon.social/tags/OpenAI" rel="tag">#OpenAI</a> does claim to have unique user agent and honor robots.txt when scraping text for <a href="https://mastodon.social/tags/ChatGPT" rel="tag">#ChatGPT</a> <a href="https://mastodon.social/tags/AI" rel="tag">#AI</a> training. 
Not clear whether this is the only or even primary way publicly accessible web content gets into their training set though <a href="https://platform.openai.com/docs/gptbot" rel="nofollow noreferrer">https://platform.openai.com/docs/gptbot</a></p></article><footer><a rel="bookmark" href="https://gnusocial.jp/conversation/3230223#notice-6377199">In conversation</a><time datetime="2024-06-12T20:51:52+09:00" title="Wednesday, 12-Jun-2024 20:51:52 JST">about 6 months ago</time> <span>from <span><a href="https://mastodon.social/@reedmideke/110850327950536299" rel="external" title="Sent from mastodon.social via ActivityPub">mastodon.social</a></span></span><a href="https://mastodon.social/@reedmideke/110850327950536299">permalink</a><h4>Attachments</h4><ol><li><article><header><div>No result found on File_thumbnail lookup.</div><h5><a href="https://platform.openai.com/docs/gptbot">OpenAI Platform</a></h5><div></div></header><div>Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.</div><footer></footer></article></li><li><article><header><div>Domain not in remote thumbnail source whitelist: files.mastodon.social</div><h5><a href="https://mastodon.social/@Toke@helvede.net/110848880977610283">Mastodon</a></h5><div></div></header><div>The original server operated by the Mastodon gGmbH non-profit</div><footer></footer></article></li></ol></footer></blockquote>

Corresponding Notice

Reed Mideke (reedmideke@mastodon.social)'s status on Wednesday, 12-Jun-2024 20:51:52 JST
So @Toke@helvede.net points out (https://mastodon.social/@Toke@helvede.net/110848880977610283) that #OpenAI does claim to have unique user agent and honor robots.txt when scraping text for #ChatGPT #AI training. Not clear whether this is the only or even primary way publicly accessible web content gets into their training set though https://platform.openai.com/docs/gptbot
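
As a rough illustration of what "honoring robots.txt" means in practice, the sketch below uses Python's standard `urllib.robotparser` to evaluate a hypothetical robots.txt that disallows the `GPTBot` user-agent token. The rule set, the example.com URL, and the helper name `is_allowed` are assumptions made for illustration. It only shows how a compliant crawler would be expected to interpret such rules, and says nothing about how any particular crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only the GPTBot user-agent token.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file contents as an iterable of lines,
    # so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Compare the blocked token with an arbitrary, made-up crawler name.
    for agent in ("GPTBot", "SomeOtherBot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/post/1")
        print(f"{agent}: {'allowed' if allowed else 'disallowed'}")
```

Because the rules name only `GPTBot`, any other crawler falls through to the default of being allowed. A rule like this affects only crawlers that identify themselves with that token and choose to respect the file.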
Attachments
1. OpenAI Platform
  Explore developer resources, tutorials, API docs, and dynamic examples to get the most out of OpenAI's platform.
2. Mastodon
  The original server operated by the Mastodon gGmbH non-profit
