@Scofisticated You mean, if a random user does it? My personal take is that there will always be individuals and groups who'll do things someone doesn't like, but all they can do is access what is already public.
They can do it using any software or platform. If they're serious about it, they can develop something from scratch, connect to a few ActivityPub relays, and data mine public posts. They don't exactly need to use existing platforms like Threads or Bluesky or Mastodon or GoToSocial, to mention a few.
There were incidents in the past (almost) 17 years of the Fediverse where tools/services were made to do something similar, especially for searching/discovery of _public_ posts. So, yes, it doesn't have to be Threads. People who want to data mine public content without the consent of the creator can do it without using existing platforms and software.
That brings us to the question: is it worth it to blanket ban users from Threads because of some random individuals who "might" data mine your content from that platform, when they can do it with any other software or create their own?
I'm not defending Threads, just putting things into perspective. 🙂
A good example: if you're bridging to Bluesky, all posts created in the ATmosphere network are public. Last week, there was already an issue with someone who collected and posted a million or so post records. The user who did it was banned, and then someone else did the same, with an even larger sample.
I'm not defending them, but I think it's a good example of two things: public is public, and there will always be people who'll do things we don't like, such as not asking us for permission. I'm not saying we should be passive, but within the frame of "should Threads be blocked because of this possibility", is it really worth doing a blanket ban / instance block when the thing we're worried about can be done outside of that platform? 🙂
Again, I'm neither defending anyone nor dismissing anyone's concern. 🖖🏽