@Haijo7@snac.haijo.eu @radmin@limepeeps.perchinup.top To clarify the issue with machine learning models is not the training data.
It is the way the output is presented that is the problem. I've seen examples of Microsoft's Copilot outputting already existing GPL'd code line for line, including comments.
If you were to use the code that Copilot outputted in a project with a license not compatible with the GPL, you could be sued for copyright infringement.
Neural networks that aid in programming should be configurable in such a way that the user should be able to select the dataset. For example if you're working on a GPL project, the neural network should only return code that is compatible with the GPL.
And most importantly the neural network should list the copyright notices of all the people's code that were used to generate your answer. So you can give proper attribution if you decide to use it in your project.
Embed Notice
HTML Code
Corresponding Notice
- Embed this notice
SuperDicq (superdicq@minidisc.tokyo)'s status on Saturday, 02-Mar-2024 01:04:15 JST SuperDicq