Embed Notice

HTML Code

<blockquote style="position: relative; padding-left: 55px;"><section><a href="https://mastodon.gamedev.place/users/Ergo42/statuses/112944781722254402">Daniel (ergo42@mastodon.gamedev.place)'s status on Monday, 12-Aug-2024 03:28:14 JST</a><a href="https://mastodon.gamedev.place/@Ergo42" title="ergo42@mastodon.gamedev.place"><img src="https://gnusocial.jp/avatar/222186-48-20240206062023.webp" width="48" height="48" alt="Daniel" style="position: absolute; left: 0; top: 0;">Daniel</a><div><a href="https://gnusocial.jp/notice/6886946" rel="in-reply-to">in reply to</a><ul><li></ul></div></section><article><p><a href="https://phire.place/@eaton">@eaton</a><br>I guess I was thinking that to "reach in" you could just process the intermediate-layer values of a select few of the LLM's layers by masking them or compressing them down to a smaller matrix size. Do that at a few points along the LLM, and use the results as the input to a lower-parameter network that you train (after the LLM is trained) to predict the LLM's time to respond.</p></article><footer><a rel="bookmark" href="https://gnusocial.jp/conversation/3504733#notice-6887008">In conversation</a><time datetime="2024-08-12T03:28:14+09:00" title="Monday, 12-Aug-2024 03:28:14 JST">Monday, 12-Aug-2024 03:28:14 JST</time> <span>from <span><a href="https://mastodon.gamedev.place/@Ergo42/112944781722254402" rel="external" title="Sent from mastodon.gamedev.place via ActivityPub">mastodon.gamedev.place</a></span></span><a href="https://mastodon.gamedev.place/@Ergo42/112944781722254402">permalink</a></footer></blockquote>

Corresponding Notice

Daniel (ergo42@mastodon.gamedev.place)'s status on Monday, 12-Aug-2024 03:28:14 JST
in reply to Eaton
@eaton
I guess I was thinking that to "reach in" you could just process the intermediate-layer values of a select few of the LLM's layers by masking them or compressing them down to a smaller matrix size. Do that at a few points along the LLM, and use the results as the input to a lower-parameter network that you train (after the LLM is trained) to predict the LLM's time to respond.
In conversation, Monday, 12-Aug-2024 03:28:14 JST, from mastodon.gamedev.place
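
A minimal sketch of the idea in the post, under loud assumptions: the "LLM" here is just random per-layer activation vectors, the compression is simple average pooling (one of many possible choices; the post also mentions masking), the small network is a linear probe trained with SGD, and the "time to respond" target is synthetic. The names `compress`, `probe_features`, and `LinearProbe` are hypothetical and not taken from any library.

```python
import random

def compress(hidden, out_dim):
    """Average-pool one layer's activation vector down to out_dim values.
    Assumes len(hidden) is a multiple of out_dim."""
    chunk = len(hidden) // out_dim
    return [sum(hidden[i * chunk:(i + 1) * chunk]) / chunk for i in range(out_dim)]

def probe_features(layer_activations, tap_layers, out_dim):
    """Concatenate compressed activations from a few selected layers."""
    feats = []
    for idx in tap_layers:
        feats.extend(compress(layer_activations[idx], out_dim))
    return feats

class LinearProbe:
    """Tiny regressor trained with plain SGD; the (frozen) LLM is never updated."""
    def __init__(self, n_features, lr=0.05):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def fit(self, xs, ys, epochs=200):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                err = self.predict(x) - y
                self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
                self.b -= self.lr * err

def mse(probe, xs, ys):
    return sum((probe.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Synthetic stand-in data: random "activations" for a 6-layer model of width 16.
rng = random.Random(0)
N_LAYERS, WIDTH = 6, 16
TAP_LAYERS, OUT_DIM = [1, 2, 4], 4  # tap three layers, 4 values per layer

xs, ys = [], []
for _ in range(50):
    acts = [[rng.uniform(-1, 1) for _ in range(WIDTH)] for _ in range(N_LAYERS)]
    xs.append(probe_features(acts, TAP_LAYERS, OUT_DIM))
    # Made-up "seconds to respond" that depends on layer 2's mean activation.
    ys.append(2.0 + 3.0 * sum(acts[2]) / WIDTH)

probe = LinearProbe(n_features=len(TAP_LAYERS) * OUT_DIM)
mse_before = mse(probe, xs, ys)
probe.fit(xs, ys)
mse_after = mse(probe, xs, ys)
print(f"MSE before training: {mse_before:.4f}, after: {mse_after:.6f}")
```

In a real setup the activations would be captured from the already-trained model during inference (for example with forward hooks in PyTorch), and the target would be measured response time rather than a formula.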
