GNU social JP
GNU social JP is a GNU social server in Japan.

Conversation

Notices

  1. Peter (peter@thepit.social)'s status on Thursday, 29-May-2025 20:18:35 JST

    this is interesting, but i don't quite agree. i don't think this is model collapse, per se. i believe when you do "search" with an LLM, what you are actually doing is RAG: they are not constantly re-training their models on the online content they added to their index over the last 48hrs, they are querying their vectorized index of that content with your vectorized search terms, dumping that context into the LLM, and returning a long, chatty result. https://www.theregister.com/2025/05/27/opinion_column_ai_model_collapse/

    In conversation about a month ago from thepit.social

    Attachments


    1. https://cdn.masto.host/thepitsocial/media_attachments/files/114/584/562/657/394/953/original/87dda4279aec43d3.png
    2. AI model collapse is not what we paid for
       Opinion: Prediction: General-purpose AI could start getting worse
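
    A rough sketch, in Python, of the retrieval-augmented generation (RAG) flow this notice describes. All names here (embed, generate, the tiny in-memory index) are hypothetical placeholders rather than any search vendor's actual pipeline; the point is that freshly crawled pages are embedded into a vector index and retrieved at query time, then injected into the LLM prompt, instead of being trained into the model.

    import numpy as np

    def embed(text):
        # Placeholder embedding: a real system would call an embedding model here.
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        v = rng.standard_normal(64)
        return v / np.linalg.norm(v)

    def generate(prompt):
        # Placeholder for the LLM completion call.
        return "(long, chatty answer grounded in the retrieved context)"

    # Vectorized index of pages crawled over the last 48 hours: (text, embedding).
    index = [(doc, embed(doc)) for doc in (
        "Page crawled yesterday discussing model collapse.",
        "Forum thread crawled this morning about retrieval-augmented search.",
    )]

    def rag_search(query, k=2):
        q = embed(query)
        # Rank indexed documents by cosine similarity to the query embedding.
        ranked = sorted(index, key=lambda item: float(q @ item[1]), reverse=True)
        context = "\n".join(doc for doc, _ in ranked[:k])
        # Dump the retrieved context into the prompt; the model's weights never change.
        return generate(f"Answer using this context:\n{context}\n\nQuestion: {query}")

    print(rag_search("is general-purpose AI getting worse?"))

    Nothing in this loop retrains the model; any apparent freshness comes entirely from what was added to the index, which is why the behaviour described above is distinct from model collapse.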
    • Peter (peter@thepit.social)'s status on Thursday, 29-May-2025 20:18:13 JST

      people try to draw comparisons to the printing press or whatever, but the **scale** of what computers are able to produce is insane. millions of GPUs, each one capable of producing millions of words of language per day given a steady supply of electricity. i'll be shocked if internet search is usable at all within 18 months.

      In conversation about a month ago
    • Peter (peter@thepit.social)'s status on Thursday, 29-May-2025 20:18:16 JST

      this is what's at risk with LLM-generated content, and we badly need some kind of guidelines and a project to archive original, human-produced knowledge before it's too late and it becomes impossible to extract it from an ocean of random language.

      In conversation about a month ago
      Aral Balkan repeated this.
    • Peter (peter@thepit.social)'s status on Thursday, 29-May-2025 20:18:21 JST

      if you have a library of 10,000 precious books containing thousands of years of human knowledge, and that library burns down, it's a tragedy. and it's the **exact same effect** if, instead of burning them, you mix those 10,000 books into a sea of 1,000,000,000 books that look exactly like them but contain fabricated content with no traceable record of who created them or why.

      In conversation about a month ago
    • Peter (peter@thepit.social)'s status on Thursday, 29-May-2025 20:18:24 JST

      when biologists cloned sheep in the 1990s, it was a red alert for bio-ethics, and some pretty strict rules were set for what you can do with a human genome, because everyone knew that once some lab-created horror made it out into the population, there was no way to fix it.

      i feel like we are at a similar moment with the ecosystem of human knowledge and no one is talking about it like the civilizational emergency it is.

      In conversation about a month ago
      Aral Balkan repeated this.
    • Peter (peter@thepit.social)'s status on Thursday, 29-May-2025 20:18:28 JST

      the original task of internet search engines was to help users find a needle in a haystack. that was a solvable problem, and they were very good at it **because the haystack was finite**, but now LLMs are generating an infinite haystack of slop and deliberate misinformation.

      In conversation about a month ago
    • Peter (peter@thepit.social)'s status on Thursday, 29-May-2025 20:18:32 JST

      what we're seeing is actually far worse. it's a general **epistemological collapse**. the open web is getting filled up with garbage LLM content to the point that it is becoming difficult to find useful results, regardless of whether it's an LLM search or a regular search.

      In conversation about a month ago
