I've wanted to write something deep and meaningful about this, but I haven't found the words, so here it is extempo:
Saw the headline, #StackOverflow is going to charge AI/LLM companies to use their database https://www.wired.com/story/stack-overflow-will-charge-ai-giants-for-training-data/
I'm not OK with this because I gave my answers on Stack Overflow under a Creative Commons license to help as many folks as possible. I wasn't compensated for that labor except through Stack Overflow building and maintaining a nice, effective repository of knowledge.
Today, "helping as many coders as possible" means giving my Q&A contributions to folks training LLMs. Stack Overflow charging for that feels like rent seeking.
If there were a way for me to mark my answers as "OK for LLM training", I'd do that, for example by re-releasing my contributions into the public domain (i.e., under "CC0" instead of the Creative Commons Attribution-ShareAlike license that Stack Overflow contributions default to).
I don't expect this to be controversial. I can see folks being upset that LLMs are trained on their open-source code (I personally have released all my open source software into the public domain, but I'm incredibly privileged to not need funding from open source). But Q&A content seems different: the intention was to provide uncompensated help to the person asking the question and to future visitors.