"The Times article says that the company exhausted supplies of useful data in 2021, and discussed transcribing YouTube videos, podcasts, and audiobooks after blowing through other resources."
https://www.theverge.com/2024/4/6/24122915/openai-youtube-transcripts-gpt-4-training-data-google