Anil Dash (anildash@me.dm), Thursday, 12-Dec-2024 05:06:43 JST:
It is such a stupid and obvious market failure that nobody has made a consumer LLM product that is 1. trained on consensually acquired material, 2. powered by renewable energy, and 3. genuinely open about its weights and models. Just achieving these things and being creator-friendly would be massive.

Conversation
mcc (mcc@mastodon.social), Thursday, 12-Dec-2024 05:49:13 JST:
@anildash I was actually thinking, as an art project, of getting a solar panel and doing this with a collection of CC0 content. I decided not to pursue this after seeing how things developed with OpenAI, on the grounds that if a true-open model existed, the proponents of closed/stolen models would point to my open model and go "see? AI doesn't have to be based on stolen content!" then continue using the stolen content.
Cassandra Granade 🏳️⚧️ repeated this.
mcc (mcc@mastodon.social), Thursday, 12-Dec-2024 05:49:13 JST:
@anildash Put a different way, I think one reason this doesn't exist is that the presence of stolen material in LLM models is not a flaw but the primary attraction. Copyright laundering is the core product.
If the users did not want to do copyright laundering, the product might not even need the machine learning model at all; in that world, a simple tag system might be adequate. The purpose the model serves in the system is to randomize the inputs enough to disguise the sources.
Seth :rebel: :fist_raised: ⁂ repeated this.
Cassandra Granade 🏳️⚧️ (xgranade@wandering.shop), Thursday, 12-Dec-2024 05:53:10 JST:
@mcc @anildash As an absolute layperson, it appears that there's this weird legal situation where if you cause harm to so very many people that it's impossible to tell exactly who is hurt by your actions, you basically get away with it, because no one can prove in court that they *in particular* were hurt.
LLMs appear to be popular with capital owners largely on the basis that they can efficiently exploit this hack.
Axomamma (axomamma@mastodon.online), Thursday, 12-Dec-2024 10:33:47 JST:
@anildash I'm sorry, AI blows chunks except in very, very narrow use categories, NONE of which include search engines used by the general public. AI is an information wrecking ball.
Anil Dash (anildash@me.dm), Thursday, 12-Dec-2024 10:33:47 JST:
@Axomamma I’m not suggesting a search engine?
Anil Dash (anildash@me.dm), Thursday, 12-Dec-2024 10:35:51 JST:
@mcc I think about this a lot. The “then they’ll use it to justify the bad thing” argument. But they do that *anyway*, and we end up without the ethical thing. Like… we’re on Mastodon. You know who literally forked it to make a fascist network. They would have done that anyway! But this is still a thing of value.
B O (piyuv@techhub.social), Thursday, 12-Dec-2024 10:36:15 JST:
@anildash If that model existed, it’d be worse than GPT-1; you need an insane amount of data for good performance.
Anil Dash (anildash@me.dm), Thursday, 12-Dec-2024 10:36:15 JST:
@piyuv Depends what it’s for. I’m not sure that’s correct.
Anil Dash (anildash@me.dm), Thursday, 12-Dec-2024 10:36:58 JST:
@smn I don’t believe that’s true. I believe it might enable purpose-specific smaller models that are useful.
Justin Fitzsimmons (smn@l3ib.org), Thursday, 12-Dec-2024 10:36:59 JST:
@anildash The corporate models that spend billions of dollars to harness a country's worth of power and boil the oceans to train on unethically sourced data result in a service that isn't appropriate for deployment as anything more sophisticated than a toy (despite how people are actually using them). An "organic" version would perform even worse than the corpo ones that are already unpopular and failing.
Dr Kim Foale (kim@social.gfsc.studio), Thursday, 12-Dec-2024 13:05:16 JST:
@mcc @anildash I think it's very easy to argue that people who licensed their work under CC0 (or any CC license, for that matter) did not actively consent to having their work used to train LLMs, a technology that didn't exist (at least in the mainstream) at the time of licensing.
Axomamma (axomamma@mastodon.online), Thursday, 12-Dec-2024 13:52:44 JST:
@anildash I hate AI with the heat of a thousand suns. It comes up as an issue for me primarily with searches. Does it matter whether you suggested search engines? It's a blight.
Anil Dash (anildash@me.dm), Thursday, 12-Dec-2024 13:52:44 JST:
@Axomamma You should try arguing with whoever is advocating for that, then? Like, it's good you're self-aware that you're not engaging in this conversation at an intellectual level.