@nukie literally where the fuck did this dude go, he switched his xmpp pfp to hello kitty or some shit. I'm scared that he actually went trans for real this time
@Tij@nukie Actually, I'm tired of waiting and I want to see how it performs now, so I merged the develop branch and changed the Oban dependency to be pulled from git; I'll update the instances later. Also @feld, apparently adding the Lazarus plugin broke accessing the DB config: adminfe just shows an empty settings dropdown, while nu-PleromaFE throws "TypeError: n.tuple is undefined". This doesn't happen after switching to the develop branch, and I don't see anything out of place in /api/pleroma/admin/config aside from that. Nothing too critical, I just had to change config.exs instead to reduce the liability of a hostile actor on one of the instances. Screenshot_20240825_195527.png
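For reference, a minimal sketch of what pulling Oban from git instead of Hex can look like in mix.exs; the repo URL and branch here are assumptions for illustration, not the exact change made on these instances.

```elixir
# mix.exs of a hypothetical app -- sketch of swapping the Hex package for a git checkout.
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [app: :my_app, version: "0.1.0", elixir: "~> 1.13", deps: deps()]
  end

  defp deps do
    [
      # URL and branch are assumptions; pin a :ref instead of a branch if you
      # want reproducible builds.
      {:oban, git: "https://github.com/oban-bg/oban.git", branch: "main"}
    ]
  end
end
```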
@mint@feld I updated to the latest Oban from git yesterday and it survived a DB repack, which is an improvement, I guess. Other actions that would previously crash Pleroma (related to the 502 gateway issue; that issue is a special case of this) no longer seem to do so.
Today Husky crapped out on me, probably because it couldn't authenticate while API requests were being dropped by db_connection. Increasing queue_target and queue_interval did help with that, but it might have other side effects.
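For context, those two knobs live in the Ecto repo config (db_connection's documented defaults are 50 ms and 1000 ms). A minimal sketch with made-up values, not the ones actually used here:

```elixir
# config/config.exs -- hypothetical values, for illustration only.
import Config

config :pleroma, Pleroma.Repo,
  # queue_target: threshold (ms) for time spent waiting in the pool queue; if
  # waits stay above it for a whole queue_interval (ms), db_connection starts
  # dropping new requests with "request was dropped from queue" errors.
  # Raising both trades dropped requests for longer waits under load.
  queue_target: 200,
  queue_interval: 2_000
```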
It's too early to tell if the newer Oban helps with the stalling federation. At least the performance isn't worse. @nukie@Tij
@phnt@feld@nukie@Tij >It's too early to tell if the newer Oban helps with the stalling federation
The description of the commit is a fairly close match to the symptoms we observed (Oban unaliving itself after receiving too many DB timeouts), so we'll see.
@phnt@Tij@feld@nukie Oh, it crashed in its entirety, but got restarted by the healthcheck script. Nothing in the logs except hundreds of DBConnection.ConnectionErrors, though there's also a bunch of "ERROR 57014 (query_canceled)" and "unknown registry: Pleroma.Web.StreamerRegistry" messages.
I have no idea what the supervisor tree in Pleroma looks like, but my theory is that after enough db_connection errors, the failures slowly propagate upward and eventually reach Pleroma's own supervisor. The maximum number of restarts is set to 3 in the default config, and once that is exceeded, the supervisor exits and init restarts it; see the sketch below.
There's a somewhat rare case where the Pleroma application shuts down completely, but the OS process itself still exists and therefore doesn't get restarted by init. That's the issue I talked about.
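A minimal, hypothetical illustration of that restart-intensity mechanism, not Pleroma's actual supervision tree: once a child exceeds max_restarts within max_seconds (OTP defaults: 3 in 5 seconds), the supervisor itself exits, and the failure keeps propagating upward until the application terminates.

```elixir
# Hypothetical demo, not Pleroma code. CrashyChild stands in for whatever
# process keeps dying on db_connection errors.
defmodule CrashyChild do
  use GenServer

  def start_link(_arg), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok), do: {:ok, %{}}
end

defmodule DemoSupervisor do
  use Supervisor

  def start_link(_arg), do: Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    # More than 3 child crashes within 5 seconds and this supervisor gives up
    # and exits too, pushing the failure one level up the tree.
    Supervisor.init([CrashyChild], strategy: :one_for_one, max_restarts: 3, max_seconds: 5)
  end
end

# Killing the child in a tight loop exhausts the restart budget:
# for _ <- 1..5, do: Process.exit(Process.whereis(CrashyChild), :kill)
```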
@phnt@nukie@Tij@mint if we can figure out what gets caught in the fast crash loop, we can change the way it starts that service to prevent that from crippling the app
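One hedged sketch of what "changing the way it starts that service" could look like, assuming the culprit is identified: wrap it in its own supervisor with a larger restart budget (or a :temporary restart strategy) so a fast crash loop exhausts that subtree's intensity instead of the top-level supervisor's. All names below are hypothetical.

```elixir
# Hypothetical sketch, not actual Pleroma code. SuspectWorker stands in for
# whichever child turns out to be crash-looping.
defmodule SuspectWorker do
  use GenServer

  def start_link(_arg), do: GenServer.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok), do: {:ok, %{}}
end

defmodule IsolatedSupervisor do
  use Supervisor

  def start_link(_arg), do: Supervisor.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    # A much bigger restart budget than OTP's 3-in-5-seconds default; if this
    # subtree still gives up, only it goes down rather than the whole app.
    Supervisor.init([SuspectWorker], strategy: :one_for_one, max_restarts: 50, max_seconds: 60)
  end
end
```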
It's very hard to tell, because even loading the FE sometimes floods the logs with DBConnection errors. Currently I have no way of even somewhat reliably triggering the crash.
There's one log in one of the other threads from a case that at least partially crippled Pleroma into not listening on any ports without causing a restart. https://fluffytail.org/notice/Al1dQDXk8Erhmg31sW
I'll look through my logs tomorrow for a proper log where Pleroma exited completely.
@feld Sorry for the delay. I've looked through my logs, and the last time Pleroma shut down and restarted was 11 days ago (no crash or stalled federation since then), when I was still running Oban 2.13.6.
The logs are mostly the same as the ones in the other thread linked above: lots of "connection not available and request was dropped from queue after X ms" errors, the occasional "connection closed by the pool, possibly due to a timeout..." message from db_connection, and even rarer "cancelling statement due to user request" messages from postgrex.
The db_connection errors always come in big batches, usually when disk iowait increases, which isn't under my control. They're not caused by Postgres autovacuum, as that runs much more frequently.