Another important Pleroma security post: @alex and @graf found ANOTHER injection bug, and this one was probably used for the attack. I think that single-user instances are probably not affected, but I wouldn't want to risk it. Move your media and proxy to a subdomain as alex initially recommended; it's not complicated, takes 15 minutes, and eliminates this whole class of bugs.
Fix is being worked on, but just do the media/proxy thing now so you'll never have to worry about this again.
@lain if I just changed media to use subdomain media.shpposter.club, which maps to the same server, without configuring any proxy stuff, doesn't that do the trick already? all media you'll see on the tl will have a different domain
@lain@lain.com i think i'm alright with my single user instance, i never enabled the media proxy and local uploads were set up with object storage from the very start
@crunklord420@mint@lain@graf You're misunderstanding the problem. The issue is that the MIME detection is too good: we get back application/javascript instead of text/plain for uploaded JS files, which allows script injection without needing to bypass CSP in any way.
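To make the mitigation concrete (a toy sketch, not Pleroma's actual code; all names here are made up): the safe move is to never serve a user upload with an executable Content-Type, no matter what the sniffer detects.

```python
# Hedged sketch: force user uploads onto an allowlist of inert MIME types,
# so a sniffer that "correctly" detects application/javascript can never
# cause the server to serve an executable type. Names are illustrative.
SAFE_TYPES = {
    "image/png", "image/jpeg", "image/gif", "image/webp",
    "video/mp4", "video/webm", "audio/mpeg", "audio/ogg",
    "application/pdf",
}

def response_content_type(detected: str) -> str:
    """Return the detected type only if it is inert; otherwise degrade
    to application/octet-stream, which browsers won't execute."""
    return detected if detected in SAFE_TYPES else "application/octet-stream"
```

The point is the same one the subdomain advice makes from a different angle: treat uploads as data, never as code.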
@mint@alex@lain@graf lmao "displaying images on a website is too complicated and it keeps injecting javascript so just host all your images on a different domain instead".
I wrote my own MIME detection crap in Rust in a few hours, optional ffmpeg integration for codec detection. Apparently no Elixir dev can do this.
@lain@alex@graf
> Move your media and proxy to a subdomain
Yeah, I'm not doing that. There are six mirrors across different networks, all of which would need to have subdomains configured somehow, even the one that is a plain IPv6 address without a domain (move it to a different port like I did with bloat?). Old media would still dangle in the same dir unless you introduce more overhead by putting in redirects. Speaking of media, here's my setup:
> mediaproxy is disabled as it doesn't play well with upstream proxies; the state of HTTP adapters in Erlang/Elixir is abysmal and you all know it
> nginx serves media directly from Pleroma's upload dir, adding a sandbox CSP by itself and bypassing Cowboy, Oban and other shit
> since nginx doesn't analyze file contents, it sends the MIME type corresponding to the extension, so you can't load a js file uploaded as txt because it'll be text/plain or octet-stream (don't remember if that's also the default Pleroma behavior or not)
> as for .js uploads themselves, they all return 403; that was one of the first things I did after the initial hack
So far I don't see how it can be exploited if there's no way to access any scripts that aren't part of the frontend, due to the basic 403, CORS/CSP block on subdomain or otherwise.
@Moon@alex@lain@graf@mint I don't care, this is insane. No one is saying move UGC text to a different domain. No, it's the _SAFE_ stuff that should be moved to a different domain. Images are literally the easy part; sanitizing text is the hard part, but it's a solved problem if you use someone else's library.
I'm tired of seeing webshitters pretend like they're real devs while working in soy languages with soy frameworks and in the end they're very proud of themselves for displaying text and images on a webpage poorly, in under 4GB of per-tab RAM usage.
Take your L and don't tell me to move shit to a different domain. Don't pretend like you wrote this down anywhere in the docs, because you didn't, no, it's just retroactive cope.
@crunklord420@alex@lain@graf@mint it is exceptionally hard to move text to a separate domain and have buttons and things be located proximally to the content, otherwise I would agree with you. also one of the exploits is in fact because SVGs can have links to scripts (SVG was a mistake)
@crunklord420@alex@graf@lain@mint I agree btw that nobody (afaik) said to do this before now. Well, I did, but I didn't take my own advice for mediaproxy, only because I thought it would be harder than it was, because I didn't see the option in the docs (maybe that is because I am stupid though).
@swastika@alex@lain@graf@mint@Moon bro you don't know shit about code, don't even pretend. Rust is bad because of cargo and because it's slow? Elixir is a high-abstraction, bytecode, garbage-collected, virtual-machined spook language.
@n-2-l@lain@graf@crunklord420@mint@Moon I'm bridging two entire decentralized social media protocols on a single thread of TypeScript code in Deno, serving about 4GB of data per day, and the whole VM including my code and OS are consuming about 400MB of RAM and maybe 10% of the CPU.
@alex@lain@graf@crunklord420@mint@Moon every single JavaScript engine is slow as shit. Node.js is written in C and uses libuv underneath. It doesn't matter if the runtime is written in Rust or whatever, because the JS that runs on top of it requires gigabytes of RAM to do anything, thanks to the many layers of garbage that are tangled together.
@alex@lain@graf@crunklord420@mint@Moon you could probably do a bigger public service by writing down what the actual fuck is meant to be sent over the wire in plain markdown nostr nip style for this AP retardation we're stuck with
@crunklord420@lain@graf@n-2-l@mint@Moon Elixir is built specifically to use 100% on all cores, without the developer having to work so hard to achieve it. That's why it's a high level abstraction spook, as you said.
@alex@lain@graf@n-2-l@mint@Moon almost everything I run is never CPU bottlenecked, and that's a bad thing. They're always IO bottlenecked, pleroma, matrix, everything. The CPU is literally just sleeping while waiting for data to move from disk to ram to CPU. If the software was good that data would be in CPU cache, it would be in memory, it wouldn't be on disk. It wouldn't be constantly doing system calls to the kernel and sleeping on mutexes.
If the software was good, it'd be using 100% on all cores.
@alex@lain@graf@crunklord420@mint@Moon my C++ backend that uses an efficient coroutine pool design from Yandex's userver framework can serve 100k nontrivial requests (involving DB shit) per second while using 100 megabytes of RAM.
@alex@lain@graf@crunklord420@n-2-l@mint there is room in programming for all God's children, including the hardcore rust programmers and time-saving super-productive deno programmers. there is not room for react programmers however.
@Moon@lain@graf@crunklord420@n-2-l@mint The thing we really need is better databases anyway. I think you could write a Fediverse server in Chef and not suffer from performance problems, but we really need a database that's fast, easy to maintain, easy to delete things from, easy to index, easy to move, and supports full text search. I want LMDB but for someone else to have already made this stuff really easy.
@alex@lain@graf@n-2-l@mint@Moon Pleroma doesn't cache enough. Like I said, the bottleneck is IO. I struggle to get it to use more memory; I tune all the lame VM options and yet it never really caches more. Meanwhile, as a functional GC'd VM language, Elixir is constantly allocating and throwing away memory. Even Rust isn't perfect, since it tends to promote RAII, which involves allocating and throwing away memory.
One of the architectural decisions I made in sneedforo was to cache as much data as reasonable in memory, avoiding queries to the DB. Imagine the insane performance gains Pleroma could have if it actually cached stuff in an intelligent way. Oh, but that's a little complicated, right? What if the state becomes stale, that could be a problem. Bro, just don't be a soydev and think about it hard.
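The pattern being argued for here is basically cache-aside. A toy sketch (this is not sneedforo's or Pleroma's actual code; the loader, TTL, and names are all made up for illustration):

```python
import time

# Toy cache-aside sketch: keep hot rows in process memory, fall back to
# the DB loader on a miss or when the entry is stale, and let writers
# invalidate explicitly instead of round-tripping to the DB every read.
class Cache:
    def __init__(self, loader, ttl=60.0):
        self.loader = loader          # function key -> value (the "DB query")
        self.ttl = ttl                # seconds before an entry counts as stale
        self.store = {}               # key -> (value, fetched_at)

    def get(self, key):
        hit = self.store.get(key)
        if hit and time.monotonic() - hit[1] < self.ttl:
            return hit[0]             # served from memory, no DB round trip
        value = self.loader(key)      # miss or stale: hit the DB once
        self.store[key] = (value, time.monotonic())
        return value

    def invalidate(self, key):
        self.store.pop(key, None)     # mark dirty: next get() reloads
```

The staleness question reduces to picking a TTL and invalidating on write; everything between those two events is served from memory.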
@crunklord420@lain@graf@n-2-l@mint@Moon On the contrary, Pleroma is forced to cache too much, because it needs a local representation of every user and post it cares about on the Fediverse. On Nostr you can actually just "lol, delete everything" with essentially no consequence, because there are about 100 mirrors of the same content. So I would say the cache situation is harder and worse on the protocol level, because devs are forced to confront this problem instead of better problems.
@kroner@alex@lain@graf@crunklord420@n-2-l@mint rust is great, so is elixir; pattern matching in elixir is next-level. but at the same time I find too many cases where you are supposed to do things functionally and it just doesn't work if you have to do a lot of specific data validation
@crunklord420@alex@lain@graf@n-2-l@mint as long as we're on the topic, might as well start this fight up again: pleroma isn't slow/bloated because it uses json native in the database
@alex@lain@graf@n-2-l@mint@Moon Pleroma needs to talk to the DB less. While faster DBs are a nice idea, they will always be significantly slower than retrieving from local process memory. It's not just the overhead of (de)serialization: DBs must be hyper-generic and account for all possible scenarios. The reality is you can make reasonable assumptions about your data, and those reasonable assumptions unlock worlds of optimizations.
@crunklord420@alex@lain@graf@n-2-l@mint pleroma using postgresql jsonb to directly store AP objects is controversial but i can't tell that it's bad in any way that I can measure. I guess space but if you want to throw away AP data use Mastodon. I'm serious.
Another thing someone should really fix asap is removing location data from uploaded pics, I’ve seen it a few times on fedi. I think the PPN does that btw so maybe there’s already an existing solution.
@bot@alex@lain@parker@graf@crunklord420@mint@Moon there are potential "issues" with removing metadata that might affect 0.1% of people. E.g. a photography instance may wish to have the "remove metadata" option turned off.
But yes, it should be turned on by default. Removing metadata is pretty simple with exiftool (it can even be used to check the MIME type, which catches the simple trick of changing the extension).
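The "check the real type, not the extension" idea can be sketched in a few lines: look at the file's leading bytes (magic numbers) instead of trusting the filename. This is a toy with a handful of signatures; real tools like exiftool or libmagic know far more.

```python
# Toy content-based type check: compare the file's leading bytes against
# known magic numbers instead of trusting its extension. Illustrative
# only; a renamed .js file matches nothing and stays inert.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff":      "image/jpeg",
    b"GIF87a":            "image/gif",
    b"GIF89a":            "image/gif",
}

def sniff(data: bytes) -> str:
    for magic, mime in MAGIC.items():
        if data.startswith(magic):
            return mime
    return "application/octet-stream"  # unknown: treat as inert data
```

So a script uploaded as `cat.png` fails the signature check regardless of what its extension claims.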
@p@lain@graf@mint@Moon people need to be reminded that the state of webdev is really bad and what's worse is people applaud each other for how bad it is.
> people need to be reminded that the state of webdev is really bad
The web is terrible, but try killing it. You're preaching to the choir, here. I hate the web. I won't touch JavaScript. I hate browsers. I use bloat as a frontend almost exclusively.
@p@lain@graf@mint@Moon I didn't read all this but I'm just going to say you can't throw your hands up and complain about the IO bottleneck when pleroma is leaving tons of memory on the table.
Databases are great, but they're not a replacement for local process memory. Pleroma isn't doing any schizo sharding hyper-galaxy-scale shit. It's just a single frontend and a PostgreSQL database. Cache shit. CACHE CACHE CACHE. PUT THINGS IN THE MEMORY AND THEN READ THEM FROM MEMORY, MARK THEM AS DIRTY WHEN NEEDED AND REFRESH. WHO CARES IF THERE'S A NANOSECOND ATOMIC RACE CONDITION OVER THE EXISTENCE OF A PEPE THE FROG IMAGE, JUST PRESS F5, IT'S FAST
@p@Moon@graf@lain@mint also you will _NEVER_ get to performance nirvana if you think you can just throw your bullshit into some job framework bullshit bloat system and it's going to be great. Only _YOU_, the actual programmer of the actual software, who knows what your data actually looks like, can create an ideal solution. It's not even that hard; people think writing their own stuff is hard because they look at these megabloat frameworks and they're huge. It doesn't have to be huge when it's built around (correct) assumptions.
> also you will _NEVER_ get to performance nirvana if you think you can just throw your bullshit into some job framework bullshit bloat system and it's going to be great.
Good, I don't. I hate "frameworks". Like I said, you're addressing someone that isn't present. "I didn't read what you wrote but here are a bunch of replies criticizing some other person that made me mad on Reddit ten years ago." Take your goddamn meds, Kiwi.
> Pleroma is I/O bound partially because it was written in a language specifically designed to be incredibly slow and wasteful when it comes to reuse of memory.
This is self-contradictory.
It's I/O-bound because it's supposed to be I/O-bound. It is network software: if you have to have a faster CPU to saturate the pipe, you have fucked up. Data enters the pipe, data goes down another pipe, and if you have to do enough work that the flow is uneven, you are doing too much work. It's I/O-bound because it's architected correctly and written well. This has literally nothing to do with the language runtime. If anything, for the amount of string-mangling it has to do, it's impressively efficient for a program written in a functional language.
> Functional programming languages are totally orthogonal to how computers actually work and you can never take advantage of the properties of a computer if you view how a computer actually operates as a flaw that requires a rube goldberg machine to pretend doesn't exist.
I don't know who you're addressing. SQL doesn't match how a computer works, either, but because Postgres spends most of its time in iowait, SQL is fine. An anime girl doesn't match how a computer works, but somehow, JPEG decoding is never the bottleneck. awk is not how computers work, either, but a one-liner takes 30 seconds to write and will usually finish executing in less time than your Rust compiler takes to build a program that runs slower. You're gonna have tradeoffs anywhere, but only a complete HN-style idiot is capable of saying things like you have. People that are this wrong are usually not as loud. You'd think you'd have looked at K by now; APL is a functional branch and garbage collected and you'll have a hard time beating K in its domain.
And which computer, anyway? Forth is way too hard on the memory bus to perform well on an amd64 system, but it screams on an AVR, PIC, anything with a builtin stack.
Every single environment has tradeoffs in its runtime characteristics. (Make a goddamn compiler and look at how many decisions you have to make.) It's entirely possible to botch it so hard that there's nothing a given design does well, but an entire paradigm? A paradigm that useless doesn't survive past the first paper; these stupid meme positions mean that you're unable to think it through or evaluate anything. You end up the equivalent of the 50-year-old Java dude, but for Rust.
@p@lain@graf@mint@Moon Pleroma is I/O bound partially because it was written in a language specifically designed to be incredibly slow and wasteful when it comes to reuse of memory.
It cannot be repeated enough. Functional programming languages are totally orthogonal to how computers actually work, and you can never take advantage of the properties of a computer if you view how a computer actually operates as a flaw that requires a rube goldberg machine to pretend doesn't exist. But of course, that machine will never cease to exist, because ultimately you are trying to get a computer to do a thing, not just turn on and get warm (which is the intended purpose of functional programming languages).
> fundamentally speaking it will always be inferior to a non-VM, non-GC'd language.
Bullet-point meme objections. GC's fine, Pleroma's not memory-bound. VM's fine, it's not CPU-bound either. It's I/O-bound, leans hard on Postgres. Maybe we should add manual memory management to Postgres, start shipping binaries to the DB instead of letting it parse queries. Why bother with Rust anyway? C ticks off the same bullets and the compiler doesn't take a year to run.
Rust is the worst for "all this shit would be better if you people adopted my preferred silver bullet". You hand a browser text headers, you could do it in brainfuck. Rust is shit, won't fix a damn thing.
@p@lain@graf@mint@Moon security isn't the reason I want it rewritten in Rust. It's because I refuse to learn Elixir and fundamentally speaking it will always be inferior to a non-VM, non-GC'd language.
Mitra has the right idea just going ahead and using Actix. Actix/Tokio will absolutely annihilate whatever routing framework Pleroma uses (Phoenix?).