“The files within this data set are not scripts, exactly. Rather, they are subtitles taken from a website called http://opensubtitles.org. Users of the site typically extract subtitles from DVDs, Blu-ray discs, and internet streams using optical-character-recognition (OCR) software.”
“The Hollywood AI Database,” The Atlantic: https://www.theatlantic.com/technology/archive/2024/11/opensubtitles-ai-data-set/680650/