@sofia yes. Calling for a lockdown of training data based on copyright law would be a terrible thing for open-source models, imo. There's also a lot of research going into generated datasets, and careful curation alone leads to much better training results.