Hey, in case their transphobia wasn't enough for you, @swheritage is yoinking all the code on GitHub -- regardless of license -- to train a generative AI that plagiarizes code.
No matter how many times they say "ethical", it isn't.
@arborelia @swheritage Interesting! They appear to do license checks for this one. The repo I have which doesn't have an open source license is not included there.
However, they do not operate with a list of allowed licenses - they've got a repo listed that uses a completely custom license (to prevent people from doing stupid shit with it) in there.
Their training data may not comply with the license.
This also makes me almost regret not putting hello.jpg on github, as I rehosted some code there that originally included it.
@swheritage To find out if they have appropriated your code, you can check "Am I in The Stack?": https://huggingface.co/datasets/bigcode/the-stack-v2
However, _do not believe their supposed opt-out_. I mean, sure, submit an opt-out if you want, but I know how they operate -- they'll just keep doing whatever they want and never process any takedowns unless the law makes them.
@arborelia @swheritage Clearly, they aren't talking to their IP lawyer enough.
@ryanc @swheritage Also, no language model is capable of obeying an attribution clause, which is in almost every license.
@ryanc @swheritage I'm already seeing in their list of opt-out GitHub issues that they've included some people's code that is "all rights reserved", and some people's GPL code.
https://github.com/ryancdotorg/goatsefloppy
The copyright/license notice there is
"Written by 2004-2005 kometbomb (and some other people, thanks to them) Feel free to treat like your own kids. Sicko."
which would make any competent lawyer scream
@arborelia @swheritage Is there content I can post to github that is illegal in France which also won't get me banned from github? 🤔
@arborelia @swheritage Also, this is associated with a university, which should have ethics people who are very risk averse...
@arborelia @swheritage Also, lest anyone think I don't really care about the copyright infringement...
GitHub's terms of service don't require that I allow copies of my code to be hosted there, only forks (which aren't really copies), and I've DMCA'd copies before.
https://github.com/github/dmca/blob/master/2021/08/2021-08-03-brainflyer.md
"To the best of our knowledge, all files contained in the dataset are licensed with one of the permissive licenses (see list in Licensing information) _or no license_."
Emphasis mine.
What the cinnamon toast fuck?