Also filtered loma.ml and bsky.brid.gy (might bring loma back if anyone misses it and its federation stabilizes, but I saw almost 400 "failed to re-fetch" errors in the logs since 12:00, and over 2000 for kemono :aaa: )
@nimda It looks like newsmast generates reposts roughly once per minute; it is basically a relay. When an ActivityPub server receives a repost, it attempts to retrieve the original post, and most if not all of the original posts are located at pubeurope.com. The "Politics" channel is likely followed by many servers, so pubeurope is constantly under stress and becomes unresponsive from time to time. When it can't handle a request, it returns HTTP 502, which triggers a retry in Mitra.
I think the queue congestion is caused by a combination of a high repost rate, slow server responses, and 502s. You can try decreasing federation.fetcher_timeout; 10-20 seconds is okay for most cases (but might not be enough for Tor and I2P). Filtering unresponsive servers like pubeurope is also a good idea.
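In the config that would look something like this (the value here is just an example; pick whatever fits your instance):

```yaml
federation:
  # Timeout in seconds for fetching remote objects; 10-20 s per the advice
  # above, but possibly too short for Tor/I2P-only instances.
  fetcher_timeout: 15
```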
Jul 02 01:40:36 wizard.casa mitra[1433501]: 2025-07-02T01:40:36 mitra_activitypub::queues [WARN] failed to process activity (HTTP status server error (502 Bad Gateway) for url (https://pubeurope.com/users/fr/statuses/114780187162115427)) (attempt #1): {"@context":"https://www.w3.org/ns/activitystreams","actor":"https://newsmast.community/users/politics","cc":["https://pubeurope.com/users/fr","https://www.w3.org/ns/activitystreams#Public"],"id":"https://newsmast.community/users/politics/statuses/114780189123638405/activity","object":"https://pubeurope.com/users/fr/statuses/114780187162115427","published":"2025-07-01T21:51:34Z","to":["https://newsmast.community/users/politics/followers"],"type":"Announce"}
I'm not sure how newsmast works; maybe I should've just filtered pubeurope.com instead? I see newsmast also cc's other instances that throw failures, like threads.net. I'll remove the filter and see what happens.
@silverpill Thanks, I looked up what the cc field was yesterday and that makes sense now. I was thinking of making some kind of auto-filter script that monitors queue length and, once it hits a threshold, looks at the logs and filters the problem instances. It could even poll a filtered instance every 5-10 min after a 502 and remove the filter when it comes back up (rough sketch below). Something even better might be possible in Mitra itself: when we hit a 502, move all queued activities targeting that server to a secondary queue that does the slow poll until it's working again, then move everything back to the main queue? Just some thoughts; it might be too much to maintain in Mitra itself.
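Very rough sketch of what I mean, simplified to key on the 502 rate in the journal rather than the actual queue length. `filter_host()`/`unfilter_host()` are placeholders for however you actually filter an instance, and the `mitra` unit name and thresholds are assumptions about my setup:

```python
#!/usr/bin/env python3
# Sketch: watch the Mitra journal for repeated 502 fetch failures, filter the
# offending host, and unfilter it once it responds again.
import re
import subprocess
import time
import urllib.error
import urllib.request

ERROR_THRESHOLD = 50            # 502s per scan window before a host gets filtered
SCAN_WINDOW = "10 minutes ago"  # journalctl --since window
POLL_INTERVAL = 300             # re-check filtered hosts every 5 minutes

# Matches the failing URL in log lines like the pubeurope one quoted above.
FAIL_RE = re.compile(r"502 Bad Gateway\) for url \(https://([^/)]+)")

filtered: dict[str, float] = {}  # host -> time it was filtered


def recent_502_counts() -> dict[str, int]:
    """Count 502 fetch failures per host in the recent Mitra journal."""
    out = subprocess.run(
        ["journalctl", "-u", "mitra", "--since", SCAN_WINDOW, "-o", "cat"],
        capture_output=True, text=True, check=False,
    ).stdout
    counts: dict[str, int] = {}
    for match in FAIL_RE.finditer(out):
        counts[match.group(1)] = counts.get(match.group(1), 0) + 1
    return counts


def host_is_up(host: str) -> bool:
    """Treat a host as recovered once its root URL stops returning 5xx errors."""
    try:
        urllib.request.urlopen(f"https://{host}/", timeout=10)
        return True
    except urllib.error.HTTPError as err:
        return err.code < 500  # a 4xx means the server is at least responding
    except Exception:
        return False


def filter_host(host: str) -> None:
    print(f"would filter {host}")    # placeholder for the actual filter mechanism


def unfilter_host(host: str) -> None:
    print(f"would unfilter {host}")  # placeholder


while True:
    for host, count in recent_502_counts().items():
        if count >= ERROR_THRESHOLD and host not in filtered:
            filter_host(host)
            filtered[host] = time.time()
    for host in list(filtered):
        if host_is_up(host):
            unfilter_host(host)
            del filtered[host]
    time.sleep(POLL_INTERVAL)
```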
@nimda A secondary queue for slow servers could help, but I think a more comprehensive solution is needed.
Currently all incoming activities are processed sequentially: there is a single worker that takes jobs from the queue. We can add more workers, and thus parallelize processing (= horizontal scaling).
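To illustrate the idea (a toy model, not Mitra's actual code): several workers pulling from the same queue let independent jobs proceed while one worker waits on a slow remote server.

```python
# Toy model of single-worker vs multi-worker queue processing: N worker tasks
# pull jobs from one shared queue, so a slow remote fetch only stalls the
# worker handling it instead of the whole queue.
import asyncio
import random


async def process_activity(activity: str) -> None:
    # Stand-in for handling one incoming activity, e.g. fetching the reposted
    # object from a remote server that may be slow or overloaded.
    await asyncio.sleep(random.uniform(0.1, 2.0))


async def worker(name: str, queue: asyncio.Queue) -> None:
    while True:
        activity = await queue.get()
        try:
            await process_activity(activity)
            print(f"{name} processed {activity}")
        finally:
            queue.task_done()


async def main(num_workers: int = 4) -> None:
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(20):
        queue.put_nowait(f"activity-{i}")
    # num_workers = 1 behaves like the current sequential setup; raising it
    # parallelizes processing of independent jobs.
    tasks = [asyncio.create_task(worker(f"worker-{i}", queue)) for i in range(num_workers)]
    await queue.join()
    for task in tasks:
        task.cancel()


asyncio.run(main())
```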