φ (fiore@brain.worm.pink)'s status on Tuesday, 25-Feb-2025 15:53:38 JST φ
it's either that, or i should just learn sql. but since i have no vices and i am perfect,
RE: https://brain.worm.pink/objects/fcd9f101-3ee0-4e73-a16d-d3d9dc74f80e
snacks (snacks@netzsphaere.xyz)'s status on Tuesday, 25-Feb-2025 15:53:37 JST snacks
@fiore sql doesn't help you with pleroma
snacks (snacks@netzsphaere.xyz)'s status on Tuesday, 25-Feb-2025 15:57:52 JST snacks
@fiore postgres, but almost everything is stored in json, not in tables and rows
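As a minimal sketch of what that looks like, assuming Pleroma's usual layout of an objects table whose data column holds each ActivityPub object as jsonb (the table and column names here are that assumption):

    -- pull a few Note objects out of the jsonb column;
    -- ->> extracts a field as text
    SELECT data->>'id' AS ap_id, data->>'type' AS type
    FROM objects
    WHERE data->>'type' = 'Note'
    LIMIT 10;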
φ (fiore@brain.worm.pink)'s status on Tuesday, 25-Feb-2025 15:57:53 JST φ
@snacks what db does pleroma use
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 15:59:00 JST Theory_Of_Everything
@snacks @fiore :chirumiru_cirno_dance:
the efficiency of such a schema is widely disputed
✙ dcc :pedomustdie: :phear_slackware: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:03:50 JST ✙ dcc :pedomustdie: :phear_slackware:
@kirby @snacks @fiore You just need to know json and postgres....... WAIT A SECOND
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:05:12 JST Phantasm
@snacks @fiore It makes sense compared to normalizing every object to 30 columns every time it comes in, or needs to go out. The big downside is the huge size of the tables and indexes needed for it to work quickly. With indexes, the performance is about the same as a normal DB schema.
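A sketch of the kind of index that makes this workable, assuming the same objects(data jsonb) layout as above; jsonb_path_ops builds a GIN index that serves containment (@>) queries:

    -- hypothetical index; Pleroma ships its own set, this only shows the shape
    CREATE INDEX objects_data_index ON objects USING gin (data jsonb_path_ops);

    -- containment queries like this can then use the index
    -- instead of scanning the whole table
    SELECT count(*) FROM objects WHERE data @> '{"type": "Note"}';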
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:05:27 JST Theory_Of_Everything
@dcc @fiore @snacks but like
you have to go through every single row of the database, unmarshal the json, look at the contents of the json to find what you're looking for if you're making a query......... that's hardly efficient
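Illustratively, that is what the planner falls back to when no index covers the jsonb field; a sketch, again assuming an objects(data jsonb) table and a made-up actor URL:

    -- nothing covers data->>'actor', so every row is visited
    EXPLAIN SELECT * FROM objects
    WHERE data->>'actor' = 'https://example.com/users/alice';
    -- typical plan (abridged):
    --   Seq Scan on objects
    --     Filter: ((data ->> 'actor'::text) = 'https://example.com/users/alice'::text)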
✙ dcc :pedomustdie: :phear_slackware: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:05:54 JST ✙ dcc :pedomustdie: :phear_slackware:
@kirby @fiore @snacks You have to spit it out some way, maybe i could think of a better way myself (my vps biz is just going to use sqlite lol)
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:08:11 JST Theory_Of_Everything
@dcc @fiore @snacks you could just have a traditional database schema set up, you don't have to store json inside the database like pleroma does
snacks (snacks@netzsphaere.xyz)'s status on Tuesday, 25-Feb-2025 16:08:15 JST snacks
@phnt @fiore i didn't even say anything about whether that's good or not, just that knowing sql won't get you far
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:11:45 JST Phantasm
@kirby @dcc @fiore @snacks The JSON for Pleroma is stored in a binary format (jsonb); there's no need to parse it once it is in the DB. The overhead is at insertion time. Once it's in the DB, Postgres has its own representation of it that it can understand without re-parsing it all the time. It does not store whitespace in the JSON itself, and it does not store duplicate keys.
To skim over the details, it's basically stored in its already-parsed representation. To simplify even more, imagine Postgres ran jq on the JSON and stored jq's internal representation of it.
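This is easy to check from psql: the json type keeps the input text verbatim, while jsonb stores the parsed form.

    SELECT '{"a": 1,  "a": 2}'::json;   -- {"a": 1,  "a": 2}  (text kept as-is)
    SELECT '{"a": 1,  "a": 2}'::jsonb;  -- {"a": 2}           (whitespace and the duplicate key are gone)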
✙ dcc :pedomustdie: :phear_slackware: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:13:00 JST ✙ dcc :pedomustdie: :phear_slackware:
@phnt @kirby @fiore @snacks Btw do you know why kirby's dumps don't seem to have 20 gb of data? (activities and notifications)
✙ dcc :pedomustdie: :phear_slackware: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:15:11 JST ✙ dcc :pedomustdie: :phear_slackware:
@kirby @fiore @phnt @snacks Separate data and schema
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:15:12 JST Theory_Of_Everything
@dcc @phnt @fiore @snacks oh yeah we've been having this problem where we have a 40 gb data dump but only 20 gb of data ends up inside postgres, and there seem to be no activities inside the imported database when we start up pleroma
i also had this issue with fre but that was solved by making a custom-format postgres dump and waiting 2 hours for the entire thing to import into another machine, so i thought nothing of it. but we tried that here, and after 1 or 2 days of importing the entire thing we ended up with the same problem
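A sketch of that custom-format dump/restore flow, with the database name pleroma assumed:

    # custom format (-Fc) is compressed and is what pg_restore consumes
    pg_dump --format=custom --dbname=pleroma --file=pleroma.dump
    # restore into a freshly created database
    pg_restore --dbname=pleroma pleroma.dump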
✙ dcc :pedomustdie: :phear_slackware: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:16:36 JST ✙ dcc :pedomustdie: :phear_slackware:
@phnt @kirby @fiore @snacks This time (the second time), with separate schema and data (the data is the right size), it's still only 20gb in postgres.
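A sketch of that split, again assuming a database named pleroma (the p_data.sql name matches the import error quoted later in the thread):

    # dump DDL and rows separately
    pg_dump --schema-only --dbname=pleroma --file=p_schema.sql
    pg_dump --data-only   --dbname=pleroma --file=p_data.sql
    # restore in the same order
    psql --dbname=pleroma -f p_schema.sql
    psql --dbname=pleroma -f p_data.sql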
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:16:37 JST Phantasm
@dcc @kirby @fiore @snacks The dump of a Pleroma DB without indexes is around 50% of the full size in Postgres. I don't know how big the original DB is. And considering the really long time it took to restore, I also suspect an incorrect dump. Perhaps with triggers not disabled.
✙ dcc :pedomustdie: :phear_slackware: (dcc@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:19:03 JST ✙ dcc :pedomustdie: :phear_slackware:
@phnt @kirby @fiore @snacks You see what's weird
screenshot_25_February_24_23-17-16.png
screenshot_25_February_24_23-18-45.png
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:19:04 JST Phantasm
@dcc @kirby @fiore @snacks Are indexes built properly? Did psql spit out any errors when ingesting?
I don't see a simple reason as to why it would be so small after an import. Even FluffyTail is bigger than 20GB after 1.5 years.
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:24:26 JST Phantasm
> Did psql spit out any errors when ingesting?
I suspect this, if psql was not run with -v ON_ERROR_STOP=1. It could have thrown an error, skipped a big table (likely activities or objects), and continued the import. There should be zero errors coming from psql; the only exception is psql complaining about a different version of Postgres being used for the dump than what is currently running on the importing side. In that case, the complaint can be ignored and the import run without -v ON_ERROR_STOP=1, but there still shouldn't be any other warnings or errors.
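A sketch of the safe invocation, assuming a plain-SQL dump file like the p_data.sql above:

    # abort at the first error instead of silently skipping a table
    psql -v ON_ERROR_STOP=1 --dbname=pleroma -f p_data.sql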
Phantasm (phnt@fluffytail.org)'s status on Tuesday, 25-Feb-2025 16:26:19 JST Phantasm
@kirby @dcc @fiore @snacks That's a big fail, as the foreign key cannot find an activity with that id in the activities table. At least that's what I think is happening. The activities table might be incomplete after an import.
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:26:20 JST Theory_Of_Everything
@phnt @dcc @fiore @snacks there was this one error during the import process but this is likely being spit out cause i fucked with the database by bookmarking an object and then deleting the object from the db last year
psql:p_data.sql:28793676: ERROR: insert or update on table "bookmarks" violates foreign key constraint "bookmarks_activity_id_fkey"
DETAIL: Key (activity_id)=(00000189-c415-0f21-5936-3173ee4a0000) is not present in table "activities".
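A hypothetical query for checking exactly that kind of orphan in the source DB, assuming bookmarks.activity_id is the foreign key into activities.id:

    -- bookmarks whose referenced activity no longer exists
    SELECT b.id, b.activity_id
    FROM bookmarks b
    LEFT JOIN activities a ON a.id = b.activity_id
    WHERE a.id IS NULL;

Deleting any rows it returns before dumping would let the foreign key constraint import cleanly.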
snacks (snacks@netzsphaere.xyz)'s status on Tuesday, 25-Feb-2025 16:26:59 JST snacks
@kirby @dcc @fiore @phnt idk if that's actually the case, but it might prevent shit from being imported since fk dependencies aren't met.
Don't take my word for it tho, i know jackshit about the pleroma db apart from the json thing
Theory_Of_Everything (kirby@annihilation.social)'s status on Tuesday, 25-Feb-2025 16:32:59 JST Theory_Of_Everything
@graf @dcc @fiore @phnt @snacks i have the original database on hand still
anime graf mays ?️? (graf@poa.st)'s status on Tuesday, 25-Feb-2025 16:32:59 JST anime graf mays ?️?
@kirby @dcc @fiore @phnt @snacks https://annihilation.social/objects/3ed8dd6a-3446-462a-9eea-2a1c9e8d4a63
was there any error spit out here? we had to restore poast from a backup once after gleason added editing to rebased, and it took 13 hours (and I had a whole night's sleep) before I realized you could run it with --jobs
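A sketch of the parallel restore graf is describing; --jobs only works with custom- or directory-format dumps:

    # run several restore workers in parallel
    pg_restore --jobs=4 --dbname=pleroma pleroma.dump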
anime graf mays ?️? (graf@poa.st)'s status on Tuesday, 25-Feb-2025 16:33:00 JST anime graf mays ?️?
@phnt @dcc @kirby @fiore @snacks yeah he's 100% missing half his dump. kirby do you have access to the original data or just this backup?
翠星石 (suiseiseki@freesoftwareextremist.com)'s status on Tuesday, 25-Feb-2025 20:37:13 JST 翠星石
@snacks @fiore >almost everything is stored in json, not in tables and rows
>Have database.
>Don't use it as you should use databases - just JSON-encode everything, killing the performance.