Conversation
Notices
-
snacks (snacks@netzsphaere.xyz)'s status on Sunday, 01-Feb-2026 23:44:16 JST
snacks
@meso retrieval augmented generation. You use vectorization to categorize parts of your text and can then draw up the closest matches and feed just those to an llm -
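The last step snacks describes, feeding only the closest matches to an LLM, can be sketched as plain prompt assembly. This is a minimal illustration, not any particular tool's API: `build_prompt` and the instruction wording are hypothetical, and the retrieved chunks are assumed to come from whatever search step runs before it.

```python
def build_prompt(question, retrieved_chunks):
    """Put only the retrieved chunks in front of the question.

    `retrieved_chunks` is assumed to be the short list of best matches
    returned by the retrieval step, not the whole document.
    """
    # Number the excerpts so the model's answer can point back at them.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(retrieved_chunks, start=1)
    )
    return (
        "Answer the question using only the numbered excerpts below. "
        "If they do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The returned string would then be sent to whichever LLM you use; that call is left out because it depends on the provider.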
meso (meso@new.asbestos.cafe)'s status on Sunday, 01-Feb-2026 23:44:17 JST
meso
@snacks rag? -
meso (meso@new.asbestos.cafe)'s status on Sunday, 01-Feb-2026 23:44:19 JST
meso
@snacks is there an AI model you can feed large documents to and ask it questions about them. like i wanna be able to ask questions about the exact nature of the traffic laws here because it's very hard to read through that shit -
snacks (snacks@netzsphaere.xyz)'s status on Sunday, 01-Feb-2026 23:44:19 JST
snacks
@meso you'll prob need some kind of rag setup if you want to query over your entire traffic law tbh. Maybe there's some rag in a box thing but i'm not aware of any -
meso (meso@new.asbestos.cafe)'s status on Sunday, 01-Feb-2026 23:48:02 JST
meso
@snacks you couldnt? -
snacks (snacks@netzsphaere.xyz)'s status on Sunday, 01-Feb-2026 23:48:02 JST
snacks
@meso it's production code at my company lmao -
snacks (snacks@netzsphaere.xyz)'s status on Sunday, 01-Feb-2026 23:48:04 JST
snacks
@meso if i could i'd give you the rag tool i made for my finals -
snacks (snacks@netzsphaere.xyz)'s status on Sunday, 01-Feb-2026 23:52:31 JST
snacks
@meso vectorization is performed by a separate specialised ai model -
roko's basilisk (vii@dsmc.space)'s status on Sunday, 01-Feb-2026 23:53:22 JST
roko's basilisk
@snacks @meso it might be more than you need but https://github.com/HKUDS/RAG-Anything a place to start digging -
snacks (snacks@netzsphaere.xyz)'s status on Sunday, 01-Feb-2026 23:55:34 JST
snacks
@meso then you run your query through the same embedding model and get the closest matches in your db. combining both embeddings and text search usually gives the best results, i think pgvector even has a good example of how to combine them -
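A rough sketch of the query side: rank stored chunks by vector similarity and by keyword overlap, then merge the two rankings. Reciprocal rank fusion is used here as one common way to combine them; that choice is an assumption, not necessarily what snacks used. The names `hybrid_search` and `keyword_score` and the constant `k=60` are illustrative, the keyword scorer is a crude stand-in for real full-text search, and `query_vec` is assumed to come from the same embedding model that produced the stored vectors.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def keyword_score(query, text):
    """Crude text-search stand-in: number of distinct query terms in the chunk."""
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return len(terms & words)


def hybrid_search(query, query_vec, rows, k=60, top_n=3):
    """rows is a list of (text, vector) pairs.

    Ranks rows by vector similarity and by keyword overlap separately,
    then fuses both rankings with reciprocal rank fusion.
    """
    ids = range(len(rows))
    by_vec = sorted(ids, key=lambda i: cosine(query_vec, rows[i][1]), reverse=True)
    by_kw = sorted(ids, key=lambda i: keyword_score(query, rows[i][0]), reverse=True)

    fused = {}
    for ranking in (by_vec, by_kw):
        for rank, i in enumerate(ranking, start=1):
            # A row near the top of either ranking gets a larger share.
            fused[i] = fused.get(i, 0.0) + 1.0 / (k + rank)

    best = sorted(fused, key=fused.get, reverse=True)[:top_n]
    return [rows[i][0] for i in best]
```

In a real database both rankings would come from queries (a vector distance ordering and a full-text search ordering) rather than Python sorts; only the fusion step would look similar.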
snacks (snacks@netzsphaere.xyz)'s status on Sunday, 01-Feb-2026 23:55:36 JST
snacks
@meso it's not that hard to implement yourself tbh, most of my time was spent wrestling file formats and shitty microsoft webservers.
Just figure out a way to cut your documents into small enough chunks with as much meaning as possible intact, run it through an embedding model and save the result into a database that can handle querying vectors with like 1000 dimensions -
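The ingestion steps above (chunk, embed, store) can be sketched roughly as follows. Everything here is a toy under stated assumptions: `chunk_text` splits on word counts with overlap, whereas a real chunker would more likely follow sections or paragraphs to keep meaning intact; `toy_embed` is a hashed bag-of-words stand-in for a real embedding model; and `build_index` keeps rows in a Python list where a real setup would write them to a vector-capable database. All names and sizes are hypothetical.

```python
import math
import re
import zlib

DIMS = 64  # toy size; snacks mentions roughly 1000 dimensions for a real model


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (assumes overlap < max_words).

    The overlap means a sentence cut at one chunk boundary still appears
    whole in the neighbouring chunk.
    """
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text):
    """Stand-in for an embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIMS
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(token.encode()) % DIMS] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(text):
    """Return (chunk, vector) pairs; stands in for inserting rows into a database."""
    return [(chunk, toy_embed(chunk)) for chunk in chunk_text(text)]
```

Swapping `toy_embed` for a call to a real embedding model is the only change needed for the vectors to carry actual meaning; the chunking and storage shape stay the same.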
protoss (nigger@detroitriotcity.com)'s status on Monday, 02-Feb-2026 00:03:04 JST
protoss
@snacks @meso the lion doesn't respect IP or NDAs -
𝅙𝅙𝅙𝅙𝅙𝅙𝅙𝅙 (sally@freesoftwareextremist.com)'s status on Monday, 02-Feb-2026 00:29:55 JST
𝅙𝅙𝅙𝅙𝅙𝅙𝅙𝅙
@meso @snacks
> like i wanna be able to ask questions about the exact nature of the traffic laws here because it's very hard to read through that shit
Just learn to read.