Interesting, there's experimental *local* alt text generation in Firefox Nightly https://hacks.mozilla.org/2024/05/experimenting-with-local-alt-text-generation-in-firefox-nightly/
In short: an image-to-text model (ViT image encoder plus DistilGPT-2 text decoder) running in Transformers.js, trained with supervised learning on an updated Flickr30k dataset.
The copy used is also interesting: "Alt text (alternative text) helps when people can't see the image or when it doesn't load."