@gentoobro @hazlin well, transformers have awful performance numbers. mamba networks are subquadratic, but the tradeoff is that if they don't admit a fact into their state window, they can't recall it at any later point, since the whole thing with state space models is they have to evaluate what to carry forward and what to drop.
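(rough idea of the "carry forward or drop" part, as a toy sketch. this is not the real mamba scan, just a made-up gated recurrence with invented names and shapes to show why anything not admitted into the state is unrecoverable later, and why the pass over the sequence is linear rather than quadratic.)

```python
# Toy selective state-space recurrence (Mamba-flavored gating).
# All names, shapes, and parameters here are illustrative assumptions,
# not the actual Mamba implementation.
import numpy as np

def selective_ssm(x, W_gate, W_in, decay):
    """
    x:      (seq_len, d) input sequence
    W_gate: (d, d) projection deciding how much of each token to admit
    W_in:   (d, d) input projection
    decay:  (d,) per-channel forgetting factor in (0, 1)

    One pass over the sequence: linear in seq_len, unlike the
    quadratic attention matrix of a transformer.
    """
    seq_len, d = x.shape
    h = np.zeros(d)
    outputs = np.zeros((seq_len, d))
    for t in range(seq_len):
        # input-dependent gate: the model decides, per token,
        # what to carry forward and what to drop
        gate = 1.0 / (1.0 + np.exp(-(x[t] @ W_gate)))  # sigmoid
        update = x[t] @ W_in
        # anything not admitted into the state here is gone for good;
        # later tokens cannot attend back to it
        h = decay * h + gate * update
        outputs[t] = h
    return outputs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, seq_len = 8, 32
    out = selective_ssm(
        rng.normal(size=(seq_len, d)),
        rng.normal(size=(d, d)) * 0.1,
        rng.normal(size=(d, d)) * 0.1,
        np.full(d, 0.9),
    )
    print(out.shape)  # (32, 8)
```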
but none of this addresses the artificial hippocampus element, which is what's needed to actually turn random ML shit into an AI :blobcatdunno:
i had some theories.