@kaia This looks very nice and I have always wanted to use it. Just a shame all my notes turn into link dumps and become an ADHD mess, so I just gave up on them and throw links into Wallabag and tag them. Not what I think you were looking for answer-wise, but my two cents all the same.
@lain Yeah, like I mentioned before, it is an ongoing thing, and most LLMs are just not cut out for it since in most cases it is extracting Japanese text. It was not until I started playing with the Qwen2.5-VL models that I thought this could actually work to some degree.
@sun Though in reality this is just a lame Python GUI front end for interacting with a Qwen2.5 vision model over an API, with folder watching. But I just like the concept and thought it might be fun to try and play Doki Doki Pretty League.
@quad A tool I am working on that uses a low-powered LLM (running locally or via an API) to extract the Japanese text from an image and translate it. The main feature is that you can set a directory for it to watch, and any time a new image appears there it automatically sends it off to the LLM to be processed. So if you are playing something in #RetroArch or another emulator, you can have this side by side and press the screenshot button whenever you need a translation. I think I posted a video of an earlier version a week or so ago here: https://melonbread.dev/notice/AscmWuE6CCbWpvsSZM
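For anyone curious what that watch-and-translate loop roughly looks like, here is a minimal sketch, not the tool itself: it assumes the watchdog library for folder watching and an OpenAI-compatible chat/completions endpoint (llama.cpp server, vLLM, etc.) serving a Qwen2.5-VL model; the URL, model name, and prompt are placeholders.

```python
# Sketch of the flow described above: watch a screenshot folder and send each
# new image to a local vision model for Japanese OCR + translation.
# The endpoint URL, model name, and prompt below are assumptions for illustration.
import base64
import time
from pathlib import Path

import requests
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

API_URL = "http://localhost:8000/v1/chat/completions"  # assumed local OpenAI-compatible server
MODEL = "qwen2.5-vl-7b-instruct"                        # assumed model name
WATCH_DIR = Path("~/screenshots").expanduser()          # e.g. your emulator's screenshot dir


def translate_image(path: Path) -> str:
    """Send one screenshot to the vision model and return its reply."""
    b64 = base64.b64encode(path.read_bytes()).decode()
    payload = {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Extract any Japanese text in this image and translate it to English."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }
    resp = requests.post(API_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


class ScreenshotHandler(FileSystemEventHandler):
    """React to new image files dropped into the watched directory."""

    def on_created(self, event):
        if event.is_directory or not event.src_path.lower().endswith((".png", ".jpg")):
            return
        time.sleep(0.5)  # give the emulator a moment to finish writing the file
        print(translate_image(Path(event.src_path)))


if __name__ == "__main__":
    observer = Observer()
    observer.schedule(ScreenshotHandler(), str(WATCH_DIR), recursive=False)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()
```

The actual tool wraps this in a GUI, but the core idea is the same: the emulator's screenshot hotkey becomes the "translate this" button.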
Still working on a few things, but I will share the repo soon once I get them sorted out.
Just a lonely boy living on his own instance (now with SO). Natural screw-up who often cannot make a proper post. Normally I just post about whatever my ADHD hyperfocus is at the moment. I deal in Random Access Memory & Random Access Memory Accessories :grans-perspective: Alt account (when I break my instance): @rain@shitposter.club