Conversation
-
@mint @lina how big are your databases + how many years have they been active? (curious)
-
I'm about 100gb at a little over two years. I really want to strip it back to nothing, and let it grow for another two years. I was wondering how to do that though, e.g. trim the db down to a week of posts and vacuum full? Not sure that would work though.
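A read-only way to size that decision up first (standard Postgres catalog views, nothing Pleroma-specific; a sketch, not a prescription):
-- Total database size, then the ten largest tables including their indexes.
SELECT pg_size_pretty(pg_database_size(current_database()));
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;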
-
@iamtakingiteasy @lina @mint wtf I thought my instance was too fat now I learn it is not fat enough
-
@lina @cassidyclown @mint Currently pleroma DB is 130 gigs, with daily growth averaging at about 200+ megabytes. The size is also somewhat inflated by FTS indices. Have been running since February of 2021 for ~2+ years now and have survived domain migration.
-
@cassidyclown @mint i do? i'm humbled, not gonna lie, but i just have fun on eientei
cc @iamtakingiteasy
-
@lina @mint oshit I thought you were the admin sorry you have admin energy
-
@cassidyclown @mint i dunno how big the database is or how long the instance has been running :ryukofuckedup:
-
Does the prune feature affect the result of the vacuum full? Rough math would say pruning to 30 days vs. 90 days results in a database one-third the size after vacuum full.
-
The vacuum command got my db down from like 45gb to 15gb
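For reference, a minimal sketch of that in psql. VACUUM FULL rewrites tables into new files and takes an ACCESS EXCLUSIVE lock, so expect the instance to be unusable while it runs:
-- In a stock Pleroma schema, most of the space is usually in these two tables:
VACUUM FULL VERBOSE activities;
VACUUM FULL VERBOSE objects;
-- Or the whole database at once:
VACUUM FULL;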
-
@Humpleupagus @cassidyclown @iamtakingiteasy @lina @mint You should do repack, it's better than vacuuming and your instance stays online.
-
Doesn't repack still need a full table lock, but just does one table at a time?
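If that's pg_repack: it rebuilds tables mostly online, holding short exclusive locks only while setting up and swapping the rebuilt files in. A minimal setup sketch, assuming the pg_repack package matching your Postgres version is installed:
-- The extension must exist in the target database; the rebuild itself is
-- driven by the pg_repack command-line tool rather than by SQL.
CREATE EXTENSION IF NOT EXISTS pg_repack;
-- then, from a shell: pg_repack --table activities --table objects <dbname>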
-
Does pruning actually remove the remote activities etc?
-
@Humpleupagus @cassidyclown @lina @mint You can partition the table by a condition (e.g. < some date) and move that partition to a larger drive as a sort of cold-but-still-transparently-accessible storage. Relevant if your main DB storage is on SSD and you want to keep historical statuses on HDD. Otherwise you can indeed just outright remove the remote activities/related objects you're not interested in (the ones that do not reference any of your local activities/objects/users).
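A minimal sketch of the tablespace half of that, assuming an HDD mounted at /mnt/hdd and a hypothetical date-based partition named activities_old (stock Pleroma tables are not partitioned, so the partitioning itself would be a separate migration):
-- Tablespace on the HDD mount (the directory must exist and be owned by postgres):
CREATE TABLESPACE cold_storage LOCATION '/mnt/hdd/pg_cold';
-- Move the historical partition onto it; it stays transparently queryable.
ALTER TABLE activities_old SET TABLESPACE cold_storage;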
-
Would you mind giving me an example of a SQL delete command that would remove all activities older than 30 days? It would be appreciated. I've never used that feature, but I think seeing it would help me research it and get an idea of how it works before I do it. I don't want to fuck my DB up like a noob.
-
@Humpleupagus @cassidyclown @lina @mint Removing them with SQL DELETE marks the rows as dead, and a subsequent VACUUM FULL rebuilds the tables without those dead tuples. But it is a blocking operation and can take considerable time.
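A quick read-only check of how much dead space a DELETE has left behind, using a standard statistics view:
-- Dead tuples are what VACUUM FULL reclaims when it rewrites the table.
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;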
-
i didnt specify a prune 30 or a prune 90 i just did the db vacuum.. (im sure i copy pasted it off graf) and the damn thing bought me enough space to give me time to figure out how i was going to up the storage.
digital ocean bundles it... if i was smart i would have given storage its own mount point as thats cheaper to upgrade....
add this to one of a hundred regrets i have experienced in my life
-
Can't you stop the server and database, rsync the postgres directory to remote file system storage, rename the local postgres directory (temp backup just in case), mount the remote file system at a directory with the same name/permissions as the original postgres directory, and turn it all back on?
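If you go that route, the directory to copy can be confirmed from psql first; the server must be fully stopped before copying, or the copied files will be inconsistent:
-- Prints the data directory of the running cluster.
SHOW data_directory;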
-
I jack it... like a lot.
-
That is quite the collection of furry porn
-
It's not a backup technically. It's an "oh shit. I can get this back up by just renaming a directory." Saves the need to dump. You can get the server up immediately after mount... I think. If it comes up, you can delete the renamed directory.
-
@Humpleupagus @cassidyclown @iamtakingiteasy @lina @zero @mint If you want to back up a db, just dump it lol.
-
@Humpleupagus @cassidyclown @iamtakingiteasy @lina @zero @mint I mean yea you could copy your postgres folder to another drive
-
There's 99 ways to skin a cat.
-
Just use your trunk to suck it out of your ass. It's not perpetual motion because we all still feed you more shit.
-
I'm trying. Trust me.
-
I remember that. I recently removed all media not accessed in the last 60 days. I moved it at first (just in case) and eventually deleted it. I had no problems.
-
i watched matty switch the media directory to AWS and every NC user lost their profile pic and banner pic and had to reupload their stuff if they wanted it to federate. this was maybe 6 months ago
-
You didn’t change the pointer though
-
No. And I figured that by using access date, not creation date, active users' pfps, etc., would be preserved.
-
Thank you. ❤️
-
@Humpleupagus @cassidyclown @lina @mint Aside from the 30-day interval and direct references, you may also want to consider the follow relationships of local users:
with
  follows as (
    -- AP ids of the accounts local users follow
    select distinct ap_id
    from following_relationships
    join users on users.id = following_relationships.following_id
  ),
  ids as (
    ...  -- same select/from as in the full query below
    join objects on
      ...  -- same join condition, domain and date checks as below, plus:
      not (
        ... or
        exists (
          select 1 from follows
          where activities.data::text like '%' || ap_id || '%'
             or objects.data::text like '%' || ap_id || '%'
        )
      )
  )
  ...  -- same delete CTEs as in the full query below
But it would be very slow. Checking specific fields (e.g. actor/announces/from/to/cc/bcc) and using array operators can improve performance, but it can still be a lot of relations to process for each status.
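A hedged sketch of that field-level variant, reusing the follows CTE from above. It is a read-only count purely to illustrate the jsonb operators (assuming stock Pleroma's jsonb data columns); an actual prune would combine this with the domain and date checks elsewhere in the thread:
with follows as (
  select distinct users.ap_id
  from following_relationships
  join users on users.id = following_relationships.following_id
)
-- Activities whose actor is not followed and whose to/cc lists contain no
-- followed actor; ->> and ? avoid casting whole rows to text.
select count(*)
from activities
where activities.data->>'actor' not in (select ap_id from follows)
  and not exists (
    select 1 from follows
    where activities.data->'to' ? follows.ap_id
       or activities.data->'cc' ? follows.ap_id
  );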
-
@Humpleupagus @cassidyclown @lina @mint
Something along the lines of
with
  ids as (
    select
      activities.id as activity_id,
      objects.id as object_id
    from activities
    join objects on
      objects.data->>'id' = associated_object_id(activities.data) and
      not (
        -- keep anything that references the local domain...
        activities.data::text like '%eveningzoo.club%' or
        objects.data::text like '%eveningzoo.club%' or
        -- ...or was touched within the last 30 days
        greatest(activities.inserted_at, activities.updated_at,
                 objects.inserted_at, objects.updated_at) >= now() - '30 days'::interval
      )
  ),
  del_activities as (
    delete from activities where id in (select activity_id from ids)
  )
delete from objects where id in (select object_id from ids);
But set up a DB and instance copy first to test it thoroughly without affecting your live instance.
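A safer first step is a dry run with the same predicate, counting instead of deleting (same assumptions as the query above, with your own domain in place of eveningzoo.club):
with ids as (
  select activities.id as activity_id, objects.id as object_id
  from activities
  join objects on
    objects.data->>'id' = associated_object_id(activities.data) and
    not (
      activities.data::text like '%eveningzoo.club%' or
      objects.data::text like '%eveningzoo.club%' or
      greatest(activities.inserted_at, activities.updated_at,
               objects.inserted_at, objects.updated_at) >= now() - '30 days'::interval
    )
)
-- Nothing is deleted here; sanity-check the count before running the delete.
select count(*) from ids;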