Screencap from linked article, reading "OpenAI said the 250,000-word casebook used for the study was more than twice the length of text that its GPT-4o model can process at once. Anthropic said the study had limited usefulness because it did not compare the A.I. with human performance. Google said its model accuracy had improved since the study was conducted."
https://cdn.masto.host/daircommunitysocial/media_attachments/files/114/643/154/566/951/555/original/5393da3d7474f6ac.png