The main limitation of SQLite is that it locks up when you try to write too much to it at once. This makes it "not a great use-case for social media". Not even due to your posts, but because of all the likes and emoji reactions you guys do. I still haven't hit any of those limits in my own experience. But writing to multiple database files could help it a lot, I bet.
@Moon The main problem I'm trying to solve is the home feed. So you would query events from all _users_ you follow. That means you would have to shard the databases by the author of the user rather than event ID... 🤔 Maybe that's it. But then what if you need the event by its ID?!
@ademan I wouldn't use Postgres again after how much of a pain it was in Pleroma. The better question is why not just use LMDB if you're going to go through all this trouble. They're equally hard, but at this rate LMDB would be easier.
Do you know what the motivation for this over just-using-postgres was? I’m confident you can beat postgres if you’re careful and thoughtful, but I’m also reasonably confident rolling your own system like this is at minimum going to be brittle, and probably have at least as many performance pitfalls at the end of the day.
@alex I don't quite understand, the event ids are all 256 bits, right? so can't you just take 100 servers or whatever and break up the keyspace into 100 sections and when you get a new event you just send it to that shard and query it the same way? the ids are cryptographic so they should scatter roughly equally across the nodes.
I wish I was this smart, but it's going to take some time before I can figure out how to make range queries do what I want and be able to manage all that shit.
It's a pure KV store but the keys are guaranteed to be in alphabetical order, so you have to build indexes and stuff manually around that concept and do "range queries" that seek all keys from a start til and end value.
@alex@ademan I have to admit this is more interesting technically than activitypub json-ld fucking shit and just jamming json into a relational database and dumbly replicating.
Cluster of sqlite databases sounds like they have enough partitions that there's no need to parallelize any single db and sqlite is just more light weight so they can run more of them on the same machine.
@Moon@alex@ademan The biggest selling point of Nostr to a developer is its simplicity. If you can open a websocket, send commands, and receive and parse responses in JSON format, then the rest is just protocol-ism and storing things as efficiently as possible.
@alex I think that is why u need kafka or sth like that before DB. There is no way to optimize it for peaks... currently I am working on the same problem and wasted few weeks for scaling optimizing, tests and the solution is one add this element to process data. I am also going to tokenize some data to process faster...
@alex Another advantage is that streaming module is separate and can be extended, optimised and also scaled... twitter used this as eventbus or sth like that...