Embed Notice

HTML Code

<blockquote style="position: relative; padding-left: 55px;"><section><a href="https://fedi.simonwillison.net/users/simon/statuses/110318869081377079">Simon Willison (simon@fedi.simonwillison.net)'s status on Sunday, 07-May-2023 03:35:25 JST</a><a href="https://fedi.simonwillison.net/@simon" title="simon@fedi.simonwillison.net"><img src="https://gnusocial.jp/avatar/18625-48-20221108165340.webp" width="48" height="48" alt="Simon Willison" style="position: absolute; left: 0; top: 0;">Simon Willison</a><div><a href="https://social.coop/@resing/110318805896248959" rel="in-reply-to">in reply to</a><ul><li><li><a href="https://gnusocial.jp/user/117226" title="jimgar@fosstodon.org">Jim Gardner</a></li></ul></div></section><article><p><a href="https://social.coop/@resing">@resing</a> <a href="https://fosstodon.org/@jimgar">@jimgar</a> I'm not convinced it's possible to train a usable LLM without including copyrighted material in the raw pretraining data</p><p>As such, I personally think it's a necessary evil to avoid a monopoly on LLM technology belonging to organizations that are willing to train against crawler data</p></article><footer><a rel="bookmark" href="https://gnusocial.jp/conversation/1439221#notice-2934270">In conversation</a><time datetime="2023-05-07T03:35:25+09:00" title="Sunday, 07-May-2023 03:35:25 JST">Sunday, 07-May-2023 03:35:25 JST</time> <span>from <span><a href="https://fedi.simonwillison.net/@simon/110318869081377079" rel="external" title="Sent from fedi.simonwillison.net via ActivityPub">fedi.simonwillison.net</a></span></span><a href="https://fedi.simonwillison.net/@simon/110318869081377079">permalink</a></footer></blockquote>

Corresponding Notice

Simon Willison (simon@fedi.simonwillison.net)'s status on Sunday, 07-May-2023 03:35:25 JST
in reply to
- Tom Resing
- Jim Gardner
@resing @jimgar I'm not convinced it's possible to train a usable LLM without including copyrighted material in the raw pretraining data
As such, I personally think it's a necessary evil to avoid a monopoly on LLM technology belonging to organizations that are willing to train against crawler data
In conversation Sunday, 07-May-2023 03:35:25 JST from fedi.simonwillison.net permalink
