Conversation
Notices
-
@MercurialBlack@mercurial.blogplpl.p.projecthermes.plplreommermercmercurial.mer.pl.mercurial.blog bitch
-
@pernia don't lecture me on databases, nigger I AM databases
-
@pernia no you dummy there is very little need to read from disk at all when serving AP objects, you just put the json in a file, then when faggot a requests /objects/cocksucker you just print it out at him. The EXACT same thing (plus more work) has to happen to read the json out of the db
outgoing federation is written just once to disk and sent out from where it is
you can literally just be clever about how you save posts you made, and the posts you are interested in reading, and you're done. no seeking, nothing. webfinger can even be a static file. everything can be static files if you're clever; that's how you get AP servers written in such ridiculous ways
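(A minimal sketch of the "put the json in a file" idea, in C. The store_object name and the /var/snac/objects/<id>.json layout are invented for illustration, and a real implementation would validate the id before building the path.)

#include <stdio.h>

/* persist an object so its file path mirrors its URL:
 * /objects/<id> lives at /var/snac/objects/<id>.json */
int store_object(const char *id, const char *json)
{
    char path[512];

    /* assumes id is already sanitized (no '/', no "..") */
    snprintf(path, sizeof(path), "/var/snac/objects/%s.json", id);

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    fputs(json, f);   /* written once, served as a static file forever */
    fclose(f);
    return 0;
}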
-
@pwm well for writes and lookups i know relational databases use write-optimized and read-optimized data structures, specifically so you don't strain the disk so much and get more out of it. doing it straight off the disk would mean doing sweeps and shit to look stuff up, which is slower
-
@pernia index it with a real db for internal use, if you're concerned or implementing some fluoridated shit. OTHERWISE, the thing is laid out so nginx just serves everything straight off the disk. Doesn't get a lot faster than that.
-
@pwm i wonder at what point performance is gonna suffer
-
@pernia it does and I meant that was fine
-
@pernia should have put "database" in quotes
-
@pwm snac2 uses the disk right? or am i thinking of something else
-
@pernia nyet database is fine
-
@pwm and, yknow, a real database
-
@pernia no I was proposing hacking it into snac2 to replicate mrfs
-
@pwm sounds fun
snac2 honestly could use lua plugins. that would be so incredibly based it would kill pleroma
-
@pwm does snac2 support those or are you doing haxxor shit rn
-
@pernia okay compiled plugins written in c are a valid use case imo
-
@MercurialBlack @pernia dlopen man page would be about all you would need to read.
-
@pwm dlopen is a GLIBC FUNCTION. that ONLY WORKS with DYNAMICALLY LINKED PROGRAMS. this is a STATICALLY LINKED HOUSEHOLD and i will NOT ALLOW THIS ABOMINATION HERE
-
@MercurialBlack @pernia it's just a couple function calls and some function pointers.
I have explained it clumsily.
-
@MercurialBlack @pernia it would be simple: take where the message input and output points are, and add a registry of plugins.
on startup, have the executable load up libraries with dlopen, which can register themselves into the message processing pipeline (it can just be a linked list).
each plugin need merely implement a single function that accepts a message, and farts it back out the other end, for the next thing in the chain.
message flow can resume as normal once you hit the end of the plugin linked list
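(A rough sketch of that plugin chain, assuming each plugin .so exports a function named plugin_process; the names and the message struct are invented for illustration. On glibc you'd link the host with -ldl and build plugins with -shared -fPIC.)

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* stand-in for snac2's real message/object type */
typedef struct message message_t;

/* each plugin exports: message_t *plugin_process(message_t *msg); */
typedef message_t *(*plugin_fn)(message_t *);

/* the registry: a linked list of loaded plugins */
struct plugin_node {
    plugin_fn fn;
    struct plugin_node *next;
};
static struct plugin_node *plugins;

/* at startup: dlopen each library and append it to the chain */
static int load_plugin(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    plugin_fn fn = (plugin_fn)dlsym(handle, "plugin_process");
    if (fn == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        dlclose(handle);
        return -1;
    }

    struct plugin_node *node = malloc(sizeof(*node));
    if (node == NULL)
        return -1;
    node->fn = fn;
    node->next = NULL;

    /* append at the tail so plugins run in load order */
    struct plugin_node **p = &plugins;
    while (*p != NULL)
        p = &(*p)->next;
    *p = node;
    return 0;
}

/* in the message pipeline: pass the message down the chain;
 * normal flow resumes when the list runs out */
static message_t *run_plugins(message_t *msg)
{
    for (struct plugin_node *n = plugins; n != NULL && msg != NULL; n = n->next)
        msg = n->fn(msg);
    return msg;
}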
-
I can't tell if you're joking about that being simple; it sounds rather complicated
In theory I understand C, I guess
CC: @pernia@cum.salon
-
@MercurialBlack @pernia to cc him until federation is complete
-
I mean the CC is working on my end
And I have no idea how this thing works man
CC: @pernia@cum.salon @pwm@gh0st.live
-
Ah
CC: @pernia@cum.salon
-
@MercurialBlack @pernia does snac2 have plugins? you should patch it in
-
@MercurialBlack @pernia it's him, not you I think
-
@MercurialBlack @pernia I think you two have not federated yet, he couldn't tag you in a post
-
Idk I saw a few of his replies to vriska earlier
CC: @pernia@cum.salon
-
@pernia
1: @MercurialBlack
2: @MercurialBlack
-
I can see this but can't see what it's in reply to yet
CC: @pernia@cum.salon
-
cc @pwm maybe it didn't feddyrate
-
@pernia @nimt @pwm Tell him to fix his shit.
[image attachment: Husky_1721383757011_LGDV5N2FX4.…]
-
@pwm cc @nimt @mint do u know why it might not be federating? seems weird to me pleroma won't even let me force a tag
-
@pwm total cc rape
-
@mint @nimt @pernia @pwm having a very long domain name may not have helped in this case
-
@pernia @vic caddy is for zoomers who are scared of config files longer than 6 lines
-
@vic @pwm caddy is for NIGGERS.
nah but i've heard it's good. just wanna do shit the autistic way and use as much from base as possible
-
@pernia @pwm give caddy a shot
-
@pernia
> i assume by page manager u mean the mmu?
the page manager is the component of the database (it's part of the software, not the OS) responsible for reading and writing pages. It usually has an LRU cache of pages which it has recently fetched from disk so it can sometimes return them quicker. Pages can come in several types that indicate what information is stored in them (data tuples, table definitions, indexes, mappings of tables to which pages contain data for that table) but the big one here is the data page. Pages are addressed by their page number, which is literally just the order they are in (usually). A data page holds data tuples. Data tuples are the rows in a table, and they can be logically addressed by (page_number, row_id).
> wouldn't moving json from disk to memory have to happen anyway? why would it be slower in a db than from disk?
It does have to happen anyway but when you get it from a database instead of ripping it straight from a file, first you have to go and find the data you want, and then call fopen. If you just know which file you want to rip json out of, then you can skip all the work of locating it, and just call fopen.
> and wouldn't reading the data from disk be faster since its a B tree, rather than reading the file sequentially?
indexes are b trees, data tuples are just sort of chucked in there in the order they are created usually unless you are doing something fancy like maintaining a physical sort order within the pages, which would be really expensive for CRUD operations as you would have to shuffle potentially your entire table around for every insert.
> then in scenario b, that would mean reading the file sequentially to load it from disk to memory,
nginx does this and it does it in fancy optimized ways that stream the file, rather than load the entire file into memory in one big buffer and then flush it out.
Scenario b is faster if you engineer the files to be laid out in such a way that you don't have to look for them. Placing them strategically means that you just know where they are based on filename. If you did have to search them with like grep and shit then yes that would be much slower.
You have some misconceptions about where exactly the b-tree comes into play. The b-tree powers indexes. To fetch indexed data you first consult the index by traversing its b-tree (fast), and then you still have to fetch the data from its data page if the index tuple wasn't indexing the field you wanted in the first place (which it wasn't in our scenario). The index IS way faster than doing a sequential scan of every datapage that has data for a given table, and checking each tuple in it for the one or however many your query wants. With an index you know the address of the data you want but you still have to fetch it off the disk (unless it's cached by the page manager but let's pretend it isn't).
The database CAN'T be faster than simply reading off a static file. It is simply more work to be done, work that is a superset of the work done by just ripping the file off the disk and out onto the network.
The limitation is that not every scenario allows you to engineer the database out of the picture. This is not a universally applicable strategy. The database offers flexibility and makes difficult things possible, but the realization here is that all you're really doing is serving a static file, and that this isn't necessarily a difficult thing (if you're clever about it).
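(To make the "just call fopen" point concrete: a hypothetical serve path in C, reusing the invented /var/snac/objects/ layout from earlier. No index lookup, no page manager; a real server would sanitize the id and stream with sendfile(2) the way nginx does.)

#include <stdio.h>

/* serve /objects/<id> straight off disk: the filename is derived
 * from the request, so there is nothing to search for */
int serve_object(FILE *out, const char *id)
{
    char path[512], buf[8192];
    size_t n;

    snprintf(path, sizeof(path), "/var/snac/objects/%s.json", id);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* no such object: 404 */

    /* stream in chunks instead of slurping the file into one buffer */
    while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
        fwrite(buf, 1, n, out);

    fclose(f);
    return 0;
}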
-
@pwm damn, ok thats really cool. i forgor the Btree is for just index shit mb.
snac2 is fucking BASED then damn. and is nginx the only webserver that streams files? does openbsd's httpd have those funi optimizations u think?
-
@pernia
okay so scenario a:
You need json from a row in a database (one of your posts) because someone wanted you to serve it so that it federates or some shit. we also suppose the posts are indexed by like, id and that we have that from the request. The database has to check an index for the id of that post, which is pretty quick, BUT then it has to actually go get the json (it probably wouldn't be in the index, because that would effectively double up the size of the object data -- an index that carries all the information you need is called a covering index), so it looks at the page number and row id pointed to by the index (this is the logical location of the data in the database).
Then the database asks the page manager for the page it needs. In this scenario that page is not in memory, and the page manager must read it from disk. with this page loaded into memory, we then grab the tuple we want BUT WAIT, the json data is bigger than the remaining size available in the page and spills over into another page so we have to ask the page manager for that page, and any subsequent spillover pages until we are done reading all the json we want (each page fetch necessitates a new disk read if that page is not already in memory, and, since these are pages full of nothing but json from one row, it's highly likely that they are not already in memory).
THEN after all that we can stream the json data out to whatever is handling the http response
OR
scenario b:
We receive a request for an object. We have cleverly named all our objects so that the file path maps to the url path, and have told nginx about this mapping and where to look.
nginx just serves the file (it is very fast at this), and the request never touches our backend.
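(What "told nginx about this mapping" might look like, assuming the hypothetical /var/snac/objects/<id>.json layout; the empty types block is there so the .json extension doesn't override the ActivityPub content type.)

location /objects/ {
    root /var/snac;                          # /objects/<id> -> /var/snac/objects/<id>.json
    types { }                                # disable extension-based MIME guessing
    default_type application/activity+json;  # what AP clients expect
    try_files $uri.json =404;                # serve the file or 404; the backend never runs
}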
-
@pwm ok so if i understand correctly:
>backend searched object by id in db
>db check with page manager to get data off disk into memory
>once in memory, u can send the data thru internetz
thats scenario a. so i assume by page manager u mean the mmu? or the one provided by the OS (dont know terminology) and not something else. In that case, wouldn't moving json from disk to memory have to happen anyway? why would it be slower in a db than from disk?
and wouldn't reading the data from disk be faster since its a B tree, rather than reading the file sequentially?
then in scenario b, that would mean reading the file sequentially to load it from disk to memory, with the page manager/mmu doing its thing, but having nginx do it directly rather than it going thru snac2 first. so to be faster the db+backend overhead would have to be greater than the savings u get from the B tree.
i'm sure i'm missing a few things here. idk what u mean by "page manager" and which tuples ur talking abt.
-
@pwm hmm damn. explain to me the "+more work" i didn't think it would be like that
-
@pwm @pernia @vic "Yeah I Want My WEB SERVER To Also Be An ACME CLIENT" - statements dreamt up by the utterly deranged.
-
@pwm @pernia @vic Just put certbot/acme.sh/whatever into crontab, how fucking hard could that be.
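(For instance, a single crontab entry; the schedule here is arbitrary, and certbot's --deploy-hook only fires when a certificate was actually renewed.)

# attempt renewal twice a day; reload nginx only when a cert changed
17 3,15 * * *  certbot renew --quiet --deploy-hook "nginx -s reload"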
-
@kirby @nimt @pernia @pwm Indeed, the hardcoded email regex I modified for IPv6 mentions support ( @flint ) can't handle his 81 character long domain. Frankly, not my problem.
-
@pernia @flint @nimt @pwm @kirby I just fixed the regex, actually, but I'm not feeling like updating and restarting pleromers just for this single change.
-
@mint @flint @nimt @pwm @kirby rip. So there's nothing we can do?