Conversation

Aral Balkan (aral@mastodon.ar.al)'s status on Tuesday, 25-Mar-2025 20:18:42 JST Aral Balkan
in reply to
- Ed Summers
- Christian Peach
@edsu @chpietsch Was it naïve or a brilliant way to avoid regulation? Remember “do not track?” Ditto. I think naïve is thinking that the assholes in Big Tech don’t know exactly what they’re doing when they seek to avoid accountability. But hey, at this point, they have the world’s most lethal military behind them so I guess accountability is moot.

In conversation about a year ago from mastodon.ar.al
- Ed Summers (edsu@social.coop)'s status on Tuesday, 25-Mar-2025 20:18:44 JST Ed Summers
  in reply to
  - Christian Peach
  @chpietsch yes, I guess you could look at it as naive. In many ways robots.txt was naive too. But one aspect to this is that we need ways for rights holders to assert their wishes, so that courts in jurisdictions that care (e.g. the EU) can use them as evidence. And there needs to be more nuance than what robots.txt provides:
  https://mailarchive.ietf.org/arch/msg/ai-control/EJ-84k8Zzh21vY1dHPZvDeYOLes/
  
  In conversation about a year ago
- Christian Peach (chpietsch@fedifreu.de)'s status on Tuesday, 25-Mar-2025 20:18:45 JST Christian Peach
  in reply to
  - Ed Summers
  - BASE Search
  @edsu
  My experience with @base and other web services run by Bielefeld University Library is in line with @gluejar's.
  The IETF sound naive when they claim that “[r]ight now, AI vendors use a confusing array of non-standard signals in the robots.txt file (defined by RFC 9309) and elsewhere to guide their crawling and training decisions” when in reality many of them ignore whatever signals a website sends them. They even plunder the shadow libraries.
  
  In conversation about a year ago
- Ed Summers (edsu@social.coop)'s status on Tuesday, 25-Mar-2025 20:18:46 JST Ed Summers
  
  GenAI bots are pushing websites into a corner that imperils open access, and perhaps worse, the web's historical record. From @gluejar:
  https://go-to-hellman.blogspot.com/2025/03/ai-bots-are-destroying-open-access.html
  Assuming that the web will continue to evolve instead of getting crushed underfoot, there is some interesting work going on over at the IETF about how to build on the now aged robots.txt protocol to allow rights holders to express how their content can be used online:
  https://www.ietf.org/blog/aipref-wg/
  
  In conversation about a year ago
