@thomasfuchs VisionEncoderDecoder models are cool, especially the OCR-free ones. But that's less the LLM stochastic-parrot bit and more, well, classical.