Conversation
iced depresso (icedquinn@blob.cat)'s status on Saturday, 22-Jun-2024 06:32:50 JST
@tero it might just be working because most networks are too dense anyway. Stanford's butterfly matrices capture more or less every standard transformation matrix. Then there's the work on sparsity: Numenta's complementary sparsity, or Top-KAST. GMDH grew smaller networks incrementally. Numenta's cortical models used to note how they could subset engrams and still mostly work.
Tero Keski-Valkama (tero@rukii.net)'s status on Saturday, 22-Jun-2024 06:32:52 JST
I read this article and, oh my god, are people really using PCA to reduce the dimensionality of #LLM embeddings? I don't have a more polite way of saying it: that is pure stupidity.
No, these embeddings do not have principal dimensions! They span practically all the dimensions. Your dataset will just create an illusion that some dimensions are correlated when in reality they aren't.
Using PCA just shows people don't understand what these embeddings are.
Furthermore, people are using embeddings that are far too long. With more than about 1k dimensions, all pairwise distances become approximately equal and rounding errors start to dominate.
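A minimal numpy sketch of the concentration effect being alluded to here, using random unit vectors as a stand-in for real embedding vectors (an illustrative assumption, not data from the article): as the dimension grows, all pairwise distances bunch around the same value.

```python
# Illustration: pairwise distances between random unit vectors concentrate
# as the dimension grows, so "near" and "far" neighbours become hard to separate.
import numpy as np

rng = np.random.default_rng(0)

for dim in (8, 64, 1024, 4096):
    # Sample random points and normalize them to the unit sphere,
    # roughly mimicking L2-normalized embedding vectors.
    x = rng.standard_normal((200, dim))
    x /= np.linalg.norm(x, axis=1, keepdims=True)

    # All pairwise distances between distinct points.
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    d = d[np.triu_indices_from(d, k=1)]

    # The relative spread of distances shrinks as the dimension grows.
    print(f"dim={dim:5d}  mean={d.mean():.3f}  "
          f"relative spread={(d.max() - d.min()) / d.mean():.3f}")
```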
They compare their method with learning-to-hash methods and all kinds of misinformed methods, which probably also use overly long embedding vectors.
Separately, they tested 8-bit quantization of their thousand-dimensional embedding vectors and found it performs better. I could have told them this beforehand: it's roughly equivalent to dimensionality reduction with a random projection matrix. And this works better than PCA, because LLM embeddings are holographic. Reducing the dimensionality with a random projection is analogous to decreasing the resolution, which is analogous to quantization.
But it works better if you have a supervised training set for ranking queries against results.
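A minimal sketch of dimensionality reduction with a Gaussian random projection matrix, the technique the quantization result is being compared to above. The dimensions and the synthetic data are illustrative assumptions, not numbers from the article.

```python
# Johnson-Lindenstrauss-style random projection of embedding vectors.
import numpy as np

rng = np.random.default_rng(0)

def random_projection(embeddings: np.ndarray, out_dim: int) -> np.ndarray:
    """Project (n, d) embeddings down to (n, out_dim) with a Gaussian random matrix."""
    d = embeddings.shape[1]
    # Scaling by 1/sqrt(out_dim) approximately preserves pairwise distances.
    proj = rng.standard_normal((d, out_dim)) / np.sqrt(out_dim)
    return embeddings @ proj

# Example: 1536-dimensional synthetic embeddings reduced to 256 dimensions.
emb = rng.standard_normal((1000, 1536))
reduced = random_projection(emb, 256)

# Pairwise distances are roughly preserved (within a modest relative error).
i, j = 3, 7
print(np.linalg.norm(emb[i] - emb[j]), np.linalg.norm(reduced[i] - reduced[j]))
```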
And in any case you don't want to vector-search queries against documents the way everyone still keeps doing; you want to build oranges-to-oranges indices: generate example queries for each document and match query embeddings to example-query embeddings. Oranges to oranges.
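A hedged sketch of that oranges-to-oranges indexing idea: embed generated example queries per document and search against those instead of document embeddings. Here generate_example_queries and embed are hypothetical placeholders for an LLM prompt and an embedding model, not functions from any particular library.

```python
import numpy as np

def generate_example_queries(document: str) -> list[str]:
    # Placeholder: in practice, ask an LLM for questions the document answers.
    raise NotImplementedError

def embed(texts: list[str]) -> np.ndarray:
    # Placeholder: call your embedding model; returns an (n, d) array,
    # ideally L2-normalized so dot products are cosine similarities.
    raise NotImplementedError

def build_index(documents: list[str]) -> tuple[np.ndarray, np.ndarray]:
    """Return (example-query embeddings, mapping from each row back to a document id)."""
    vectors, doc_ids = [], []
    for doc_id, doc in enumerate(documents):
        queries = generate_example_queries(doc)
        vectors.append(embed(queries))
        doc_ids.extend([doc_id] * len(queries))
    return np.vstack(vectors), np.array(doc_ids)

def search(query: str, index_vectors: np.ndarray, doc_ids: np.ndarray, k: int = 5):
    """Match the query embedding to example-query embeddings, not to document embeddings."""
    q = embed([query])[0]
    scores = index_vectors @ q
    top = np.argsort(-scores)[:k]
    return [(int(doc_ids[i]), float(scores[i])) for i in top]
```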