Embed Notice

HTML Code

<blockquote style="position: relative; padding-left: 55px;"><section><a href="https://fluffytail.org/objects/71630088-96d2-4df3-836a-9d1ec26d4c0d">Phantasm (phnt@fluffytail.org)'s status on Friday, 16-Aug-2024 23:24:15 JST</a><a href="https://fluffytail.org/users/phnt" title="phnt@fluffytail.org"><img src="https://gnusocial.jp/avatar/158739-48-20230808204705.webp" width="48" height="48" alt="Phantasm" style="position: absolute; left: 0; top: 0;">Phantasm</a><div><a href="https://gnusocial.jp/notice/6912247" rel="in-reply-to">in reply to</a><ul><li><a href="https://gnusocial.jp/user/4193" title="feld@bikeshed.party">feld</a></li><li><a href="https://gnusocial.jp/user/158739" title="phnt@fluffytail.org">Phantasm</a></li><li><a href="https://gnusocial.jp/user/276057" title="phnt@annihilation.social">Phantasm</a></li></ul></div></section>
<article><p><a href="https://bikeshed.party/users/feld">@feld</a> <a href="https://ryona.agency/users/mint">@mint</a> <a href="https://annihilation.social/users/phnt">@phnt</a> Pleroma crashed again about 1 minute after I made a post. The federator_incoming queue had 0 available jobs and a few retryable ones. federator_outgoing had 7 failed jobs and zero available or executing.</p>
<p>Same as last time: out of nowhere, a one-minute jump in disk backlog, disk busy time, and Pleroma DB locks. There were almost no DB timeouts before that.</p>
<p>Before the crash, a lot of "(DBConnection.ConnectionError) connection not available and request was dropped from queue after &lt;some number&gt;ms. This means requests are coming in and your connection pool cannot serve them fast enough." errors showed up in the logs. Pleroma used at most 12 DB connections. The number of connections and the pool size are from the default config; only connect_timeout under :pleroma, :connections_pool was increased, to 10s from the default 5s. The timeout under :pleroma, Pleroma.Repo was also increased, to 30s.</p>
<p>The Netdata screenshots are from the same time. Ignore the time difference: the server is UTC-4 (US ET) and Netdata is UTC+2 (CEST).</p>
<br><a href="https://upload.fluffytail.org/media/6c5d8ebbe3294ea50fb57d83c1122139b16e13f666acc26bff03423a525240a7.txt?name=pleroma-crash_20240816.txt">pleroma-crash_20240816.txt</a><br><a href="https://upload.fluffytail.org/media/8291bec66b09da3ab32f89e7bda2359f1bee367965116fabfefc0870c89e6c3a.txt?name=postgres-crash_20240816.txt">postgres-crash_20240816.txt</a><br><a href="https://upload.fluffytail.org/media/b3b3dbe728d025af3124231a1d4efb0f23bfa5c6a81e7ae4c69ee4a7e0fd9f60.png?name=image.png"></a><br><a href="https://upload.fluffytail.org/media/f432a1f85c6eb2a31db3ed641b09577b546546ff32ac532b679a9d1035e4286c.png?name=image.png"></a></article>
<footer><a rel="bookmark" href="https://gnusocial.jp/conversation/3498084#notice-6928254">In conversation</a><time datetime="2024-08-16T23:24:15+09:00" title="Friday, 16-Aug-2024 23:24:15 JST">about 8 months ago</time> <span>from <span><a href="https://fluffytail.org/objects/71630088-96d2-4df3-836a-9d1ec26d4c0d" rel="external" title="Sent from fluffytail.org via ActivityPub">fluffytail.org</a></span></span><a href="https://fluffytail.org/objects/71630088-96d2-4df3-836a-9d1ec26d4c0d">permalink</a><h4>Attachments</h4><ol>
<li><label><a rel="external" href="https://gnusocial.jp/attachment/3049284">pleroma-crash_20240816.txt</a></label><br><div>Invalid filename.</div></li>
<li><label><a rel="external" href="https://gnusocial.jp/attachment/3049285">postgres-crash_20240816.txt</a></label><br><div>Invalid filename.</div></li>
<li><label><a rel="external" href="https://gnusocial.jp/attachment/3049286">Untitled attachment</a></label><br><a href="https://upload.fluffytail.org/media/b3b3dbe728d025af3124231a1d4efb0f23bfa5c6a81e7ae4c69ee4a7e0fd9f60.png?name=image.png" rel="external">https://upload.fluffytail.org/media/b3b3dbe728d025af3124231a1d4efb0f23bfa5c6a81e7ae4c69ee4a7e0fd9f60.png?name=image.png</a></li>
<li><label><a rel="external" href="https://gnusocial.jp/attachment/3049287">Untitled attachment</a></label><br><a href="https://upload.fluffytail.org/media/f432a1f85c6eb2a31db3ed641b09577b546546ff32ac532b679a9d1035e4286c.png?name=image.png" rel="external">https://upload.fluffytail.org/media/f432a1f85c6eb2a31db3ed641b09577b546546ff32ac532b679a9d1035e4286c.png?name=image.png</a></li>
</ol></footer></blockquote>

Corresponding Notice

Phantasm (phnt@fluffytail.org)'s status on Friday, 16-Aug-2024 23:24:15 JST
in reply to
@feld @mint @phnt Pleroma crashed again about 1 minute after I made a post. The federator_incoming queue had 0 available jobs and a few retryable ones. federator_outgoing had 7 failed jobs and zero available or executing.
Same as last time: out of nowhere, a one-minute jump in disk backlog, disk busy time, and Pleroma DB locks. There were almost no DB timeouts before that.
Before the crash, a lot of "(DBConnection.ConnectionError) connection not available and request was dropped from queue after <some number>ms. This means requests are coming in and your connection pool cannot serve them fast enough." errors showed up in the logs. Pleroma used at most 12 DB connections. The number of connections and the pool size are from the default config; only connect_timeout under :pleroma, :connections_pool was increased, to 10s from the default 5s. The timeout under :pleroma, Pleroma.Repo was also increased, to 30s.
The Netdata screenshots are from the same time. Ignore the time difference: the server is UTC-4 (US ET) and Netdata is UTC+2 (CEST).
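A rough illustration of why a short disk stall can surface as that pool error: a fixed-size pool can complete at most pool_size / average_query_time queries per second, so once queries slow down, new requests wait in the queue longer. This is only back-of-envelope arithmetic, not how DBConnection decides to drop requests. The 12 is the peak connection count from the post; the 20 ms and 2000 ms query times and the 50 requests per second are made-up numbers, not measurements from this server, and the function names are hypothetical.

```python
def max_throughput(pool_size: int, avg_query_ms: float) -> float:
    """Upper bound on queries per second that a fixed-size pool can complete."""
    if pool_size <= 0 or avg_query_ms <= 0:
        raise ValueError("pool_size and avg_query_ms must be positive")
    return pool_size * 1000 / avg_query_ms


def is_saturated(arrival_per_s: float, pool_size: int, avg_query_ms: float) -> bool:
    """True when requests arrive faster than the pool can finish them."""
    return arrival_per_s > max_throughput(pool_size, avg_query_ms)


POOL_SIZE = 12          # peak DB connections reported in the post
ARRIVAL_PER_S = 50      # assumed request rate, for illustration only

# Assumed query durations: 20 ms when the disk is healthy, 2000 ms during a stall.
for label, query_ms in (("healthy disk", 20), ("stalled disk", 2000)):
    capacity = max_throughput(POOL_SIZE, query_ms)
    saturated = is_saturated(ARRIVAL_PER_S, POOL_SIZE, query_ms)
    print(f"{label}: capacity {capacity:.0f} queries/s, saturated: {saturated}")
```

With these assumed numbers the same pool goes from ample headroom to saturated purely because each query holds its connection longer, which would match the pattern of errors appearing only during the disk backlog spike.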
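To line up log timestamps with the Netdata screenshots, the two offsets above differ by 6 hours. A minimal sketch of the conversion, assuming fixed offsets of UTC-4 and UTC+2 as stated in the post (no daylight-saving handling); the helper name is hypothetical and the sample time is simply this status's own timestamp, not the crash time:

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets taken from the post; these labels are for display only.
SERVER_TZ = timezone(timedelta(hours=-4), "EDT")
NETDATA_TZ = timezone(timedelta(hours=2), "CEST")
JST = timezone(timedelta(hours=9), "JST")


def server_to_netdata(ts: datetime) -> datetime:
    """Convert a server-log timestamp to the time shown in Netdata.

    Naive timestamps are assumed to be in server time (UTC-4).
    """
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=SERVER_TZ)
    return ts.astimezone(NETDATA_TZ)


# Sample: the timestamp of this status, 16-Aug-2024 23:24:15 JST.
status_time = datetime(2024, 8, 16, 23, 24, 15, tzinfo=JST)
print(status_time.astimezone(SERVER_TZ).isoformat())  # 2024-08-16T10:24:15-04:00
print(server_to_netdata(status_time).isoformat())     # 2024-08-16T16:24:15+02:00
```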

pleroma-crash_20240816.txt
postgres-crash_20240816.txt
