@Moon @thendrix @waifu yeah, what happens there is that your original input scrolls out of the context window.
you have to take active measures to deal with that. what measures those are is an open question.
mistral wants you to do some "sliding window" context: every 4k tokens it crunches the context into a new set of tokens that is supposed to somehow carry some of the earlier context. so it still maintains the same size of context window, but it buffers itself with custom tokens to try to remember recent history.
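a toy sketch of that "crunch the old half, keep the recent half" idea, assuming a tiny 8-token window and a stand-in summarizer (real systems compress with the model itself, not like this):

```python
WINDOW = 8  # toy stand-in for the real 4k-token window

def crunch(tokens):
    # toy "summary": keep only the first and last old token.
    # a real system would produce learned compression tokens here.
    return tokens[:1] + tokens[-1:]

class SlidingContext:
    def __init__(self):
        self.buffer = []

    def push(self, token):
        self.buffer.append(token)
        if len(self.buffer) > WINDOW:
            # window overflowed: compress the oldest half into
            # summary tokens, keep the recent half verbatim
            old, recent = self.buffer[:WINDOW // 2], self.buffer[WINDOW // 2:]
            self.buffer = crunch(old) + recent

ctx = SlidingContext()
for t in range(20):
    ctx.push(f"t{t}")

print(len(ctx.buffer))  # stays bounded near WINDOW instead of growing to 20
```

the point is just the shape of it: the buffer never grows without bound, but older history only survives as lossy summary tokens.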
ggml folk have some other experimental fuzzy/infinite context feature that was merged. i tried to find the PR but it took too long.
llamaindex et al. instead do some fingerprinting of output context, store those sentences and such into a vector database, and do top-k retrieval: whenever you input new text they grab the most relevant stored sentences as 'hints' and stuff them into the prompt.
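a minimal sketch of that top-k retrieval loop, with a bag-of-words counter standing in for a real embedding model (llamaindex actually uses learned sentence embeddings and a proper vector store; everything here is illustrative):

```python
import math
from collections import Counter

def embed(text):
    # toy "embedding": a bag-of-words count vector,
    # NOT what a real embedding model produces
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    def __init__(self):
        self.entries = []  # (embedding, sentence) pairs

    def add(self, sentence):
        self.entries.append((embed(sentence), sentence))

    def top_k(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [s for _, s in ranked[:k]]

store = VectorStore()
store.add("the cat sat on the mat")
store.add("gradient descent minimizes a loss function")
store.add("the cat chased the mouse")

# retrieve the k most relevant stored sentences and stuff
# them into the prompt as hints
hints = store.top_k("where did the cat sit", k=2)
prompt = "\n".join(hints) + "\nUser: where did the cat sit?"
```

same trade-off as before: nothing ever leaves the window by force, but the model only sees whatever the retriever happened to rank highly.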