iced depresso (icedquinn@blob.cat)'s status on Wednesday, 22-May-2024 14:31:23 JST
@gentoobro @hazlin well, transformers have awful performance numbers. mamba networks are subquadratic. the catch is that if they don't admit a fact into their state window, they can't recall it later, since the whole thing with state space models is they have to decide what to carry forward and what to drop. a toy sketch of that carry-forward/drop update is below.
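(a minimal toy sketch of that idea, assuming a simple gated recurrence; the names selective_scan, W_keep, W_write are made up for illustration and this is not mamba's actual parameterization)

```python
import numpy as np

# Toy selective recurrence: a fixed-size state h is all that gets carried
# forward. Whatever the input-dependent "keep" gate squeezes out is gone
# and can't be recovered at later steps. Cost is linear in sequence length,
# unlike attention's quadratic lookback over all past tokens.

def selective_scan(xs, W_keep, W_write, h_dim):
    h = np.zeros(h_dim)
    outputs = []
    for x in xs:
        keep = 1.0 / (1.0 + np.exp(-(W_keep @ x)))   # per-channel gate in [0, 1]: what to carry forward
        write = np.tanh(W_write @ x)                  # candidate content from the current token
        h = keep * h + (1.0 - keep) * write           # decide what to keep vs drop; no lookback
        outputs.append(h.copy())
    return np.stack(outputs)

# usage: state stays fixed-size no matter how long the sequence gets
rng = np.random.default_rng(0)
seq_len, x_dim, h_dim = 16, 8, 4
xs = rng.normal(size=(seq_len, x_dim))
ys = selective_scan(xs, rng.normal(size=(h_dim, x_dim)), rng.normal(size=(h_dim, x_dim)), h_dim)
print(ys.shape)  # (16, 4)
```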
but none of this addresses the artificial hippocampus element, which is what's needed to actually turn random ML shit into an AI :blobcatdunno:
i had some theories.