I wonder if RetroArch could be configured to send a screenshot of a video game, have it converted into a machine-readable map of the items in the screenshot and the player's position, and then let the player choose an item to navigate to from there? Just a random idea. I don't know how much the GPT-4 vision API shrinks or downscales the screenshot, though.
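
To make the "choose an item to navigate to" part a bit more concrete, here's a rough sketch of what could happen after the screenshot has been turned into data. Everything in it is made up for illustration: the `Entity` class, the pixel coordinates, and the helper names `direction_to` and `list_targets` are my own assumptions, not anything RetroArch or a vision API actually provides. It just assumes you somehow end up with a name and an (x, y) position for the player and for each item.

```python
from dataclasses import dataclass
import math


@dataclass
class Entity:
    """Hypothetical record for one thing found in the screenshot."""
    name: str
    x: int  # pixel column in the screenshot (0 = left edge)
    y: int  # pixel row in the screenshot (0 = top edge)


def direction_to(player: Entity, target: Entity) -> str:
    """Describe roughly which way the target lies from the player."""
    dx = target.x - player.x
    dy = target.y - player.y
    if dx == 0 and dy == 0:
        return "here"
    horiz = "right" if dx > 0 else "left"
    vert = "down" if dy > 0 else "up"  # screen y grows downward
    # Treat the move as purely horizontal or vertical when one axis dominates.
    if abs(dx) >= 2 * abs(dy):
        return horiz
    if abs(dy) >= 2 * abs(dx):
        return vert
    return f"{vert}-{horiz}"


def list_targets(player: Entity, items: list[Entity]) -> list[tuple[str, int, str]]:
    """Return (name, distance in pixels, direction) for each item, nearest first."""
    rows = [
        (item.name,
         round(math.hypot(item.x - player.x, item.y - player.y)),
         direction_to(player, item))
        for item in items
    ]
    return sorted(rows, key=lambda row: row[1])


if __name__ == "__main__":
    # Made-up scene data standing in for whatever the image step would return.
    player = Entity("player", 100, 100)
    items = [Entity("key", 200, 100), Entity("chest", 100, 40), Entity("door", 160, 160)]
    for name, dist, direction in list_targets(player, items):
        print(f"{name}: {dist}px {direction}")
```

The player would then pick a name from that list, and something else would have to turn the direction into actual button presses. That part is left out here, since it depends entirely on the game.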