I am disappointed in Software Heritage.
They made this statement on using their archive as an AI training dataset: https://www.softwareheritage.org/2023/10/19/swh-statement-on-llm-for-code/?ref=openml.fyi
These seem like good principles, but they are not actually sufficient to respect our work. The third is too weak, and appears to provide a fig leaf for extractive behavior.