Notices by Daniel (ergo42@mastodon.gamedev.place)

Daniel (ergo42@mastodon.gamedev.place)'s status on Monday, 12-Aug-2024 03:28:14 JST Daniel
@eaton I guess I was thinking that to "reach in" you could process the inter-layer values of a select few of the LLM's layers by masking or compressing them down to a smaller matrix. Do that at a few points along the LLM, and use those as the input to a lower-parameter network you train (after the LLM is trained) to predict the LLM's time to respond.
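One way that could look, as a minimal PyTorch-style sketch: a small probe network reads a few tapped hidden layers of a frozen LLM and predicts response time. The layer indices, pooling, and sizes here are assumptions for illustration, not a specific recipe.

```python
# Sketch: tap a few of the LLM's intermediate layers, compress each to a small
# vector, and train a separate low-parameter probe (the LLM stays frozen) to
# predict the LLM's time to respond. Layer indices and sizes are illustrative.
import torch
import torch.nn as nn

class ResponseTimeProbe(nn.Module):
    def __init__(self, hidden_size=4096, tap_layers=(8, 16, 24), compressed=64):
        super().__init__()
        self.tap_layers = tap_layers
        # One small projection per tapped layer ("compressing down to a smaller matrix").
        self.compress = nn.ModuleList(
            [nn.Linear(hidden_size, compressed) for _ in tap_layers]
        )
        # Tiny head mapping the concatenated taps to a single prediction.
        self.head = nn.Sequential(
            nn.Linear(compressed * len(tap_layers), 128),
            nn.ReLU(),
            nn.Linear(128, 1),  # e.g. predicted seconds (or tokens) until the reply finishes
        )

    def forward(self, hidden_states):
        # hidden_states: per-layer tensors of shape [batch, seq_len, hidden_size],
        # e.g. the hidden-states tuple a transformer returns when asked for them.
        taps = []
        for proj, idx in zip(self.compress, self.tap_layers):
            pooled = hidden_states[idx].mean(dim=1)  # mean-pool over tokens
            taps.append(proj(pooled))
        return self.head(torch.cat(taps, dim=-1))

# Training would happen after the LLM is trained: log real response times,
# run the frozen LLM to collect hidden states, and fit only this probe.
```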
Daniel (ergo42@mastodon.gamedev.place)'s status on Monday, 12-Aug-2024 03:25:20 JST Daniel
@irwin Exactly! Context matters. But maybe it's not as simple as I assume. How do you define confidence against truth when the system doesn't understand reality vs. fiction?
... the only neural networks I've made amount to a kid playing in a sandbox next to the construction of a Dyson sphere. I can imagine the goal and that the dirt goes here, but the science of AI is moving so fast I couldn't even tell you why modern LLMs are so functional.
Daniel (ergo42@mastodon.gamedev.place)'s status on Monday, 12-Aug-2024 03:17:05 JST Daniel
@eaton Well said. Admittedly, my experience with training and designing neural networks is limited, but wouldn't it be only slightly more computation to create a high-level meta neural network that reaches into the LLM and estimates some attributes of the current LLM context by attaching to specific neuron outputs?
The attributes could be level of confidence through time, expected response time, etc. Then just create a UI program from the realtime output of that meta neural net.
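A rough sketch of how that "attaching to specific neuron outputs" could work in practice, assuming PyTorch forward hooks: the hooked module names, shapes, and the probe are illustrative assumptions, not any particular model's API.

```python
# Sketch: forward hooks capture a few chosen layer outputs while the frozen LLM
# runs, a small "meta" probe maps them to a scalar (confidence, expected response
# time, ...), and a callback pushes that value to a UI in realtime.
import torch
import torch.nn as nn

class MetaMonitor:
    def __init__(self, llm, layer_names, probe, on_update):
        self.layer_names = layer_names
        self.probe = probe          # tiny net trained after the LLM is trained
        self.on_update = on_update  # e.g. updates a progress bar / status widget
        self.captured = {}
        for name, module in llm.named_modules():
            if name in layer_names:
                module.register_forward_hook(self._make_hook(name))

    def _make_hook(self, name):
        def hook(module, inputs, output):
            # Assumes output is [batch, seq_len, hidden]; keep the last token only.
            self.captured[name] = output[:, -1, :].detach()
            if len(self.captured) == len(self.layer_names):
                feats = torch.cat([self.captured[n] for n in self.layer_names], dim=-1)
                with torch.no_grad():
                    # Assumes a single sequence per call, so the output is one scalar.
                    self.on_update(self.probe(feats).item())
                self.captured.clear()
        return hook
```

A UI program would then just subscribe to on_update and show the live readout while the model generates.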
Daniel (ergo42@mastodon.gamedev.place)'s status on Tuesday, 06-Feb-2024 15:20:23 JST Daniel
@geerlingguy I had that often at work. Past Daniel was very helpful today when I realized the work was already done. So joyful.