Conversation
Notices
-
Embed this notice
I've got a question where to place the functions to return data for the reports and to set the status. There is the report "factory" and the report "repository". At least I guess that the functions wouldn't be placed in this report "entity".
Concerning constants: I guess that they are placed in the "entity" class, aren't they?
-
Embed this notice
@heluecht Entity-related constants are placed in the Entity. What kind of return data are we talking about? To update the status of a report, it would be in the Entity.
-
Embed this notice
@heluecht The problem of setting and storing arbitrary values is that we don't have any control nor visibility on it. By using meaningfully named methods affecting a fixed list of private class variables, we have complete control over what any part of the code interact with the Entity. No code can set dismissed to 2 as it would be possible with a setDismissed(int $dismiss) method for example.
It's even worse with arbitrary array. update(array $fields) doesn't tell you what value type are expected, what values are required, etc... Worse, by tightly coupling the object and the persistence layer (the database), it makes it harder to write unit tests for our Entity.
So we split it. On one hand, the Entity that describes the exact data it needs and the only approved operations on it, with absolutely no care how it's going to be stored. Memory, key-value store, NoSQL database, it doesn't matter to the Entity.
On the other hand, we have the Repository, which knows how to persist and retrieve Entity object to the persistence layer of choice (here, MySQL only for now), but no more so that we keep the hard-to-test code surface as small as possible.
-
Embed this notice
@hypolite I don't understand why we should split the setting of the status in two steps? Why not directly set and store the status? I don't see any sense in storing (and overwriting) all the existing fields when we only want to change some little status field.
Concerning "createFromTableRow": I don't want to return just the array of the uri-ids there, but some array with enhanced data per record. So I guess that function needs some enhancement.
-
Embed this notice
@heluecht The status values are Entity constants, the dismiss() method is an Entity method (don't forget to call Repository->save() after!). I want to avoid get* and set* methods in the Entity classes. Just create methods for the use-cases we have. Do we dismiss reports? dismiss(). Do we reopen them if they were dismissed? reopen().
Querying a list of reports based on conditions is in the Repository since it's interacting with the database. You can create a joined query, then use Factory->createFromTableRow to obtain Report entities, and you can use the rest of the row fields for what you want. Watch out for conflicting fields, the id field should always be the Entity ID.
-
Embed this notice
@hypolite I thought that the "entity" only contains the class variables?
What I want to do is to create constants for the status. So they are "entity-related", I guess.
Also I want to create functions like "setDismissed" (and similar) to update the status. Also some "getOpenReports" that should return an array of reports. Concerning the returned data, I also want to return an enhanced set of data. Means: I don't only want to return the contact id of the reported contact and the uri-id of the posts, but also all needed contact related data (nick, name, url, avatar, ...) and the post related data (like title, body, creation date, ...). For this I want to create some view that then has to be called inside of that class.
-
Embed this notice
@heluecht We do this because we keep getting reports of missing keys in arrays and unexpected behavior where only the inspection of arbitrary arrays allows us to troubleshoot the issue. Introducing coding rigidity does have a cost in terms of raw performance, but then again we code in PHP, so we aren't gunning for pure performance, are we?
And yes, I understand the annoyance to have to add new concepts to thwarts the limitations of the new concepts, etc... but it has to be compared with the whack-a-mole game of finding out when and where an arbitrary array got its wrong/missing value. In a system that processes asynchronous requests like a federated one, I believe it would pay off if we could abstract remote systems to test our code instead of having to send ourselves some message from an actual remote server and then anxiously poring over the log to spot the var_dump() call output we had to add on a production node.
If it doesn't bother you, more power to you but I'd like to prepare the project for a time you won't be around anymore. In its current shape, I believe the project development would grind to a halt if you stopped contributing for any reason, because you (and only you) have readily done this kind of debugging in the wild to fix the protocol/federation aspects.
We're still light-years away from the complexity of the Symfony or Laravel PHP frameworks and my goal isn't to get anywhere near either of them. We also could use a third-party Object-Relational Mapping library like Doctrine to outsource the tedious task of finding out the inefficiencies of hydrating typed object from a database and fixing them one at a time.
-
Embed this notice
@hypolite This sounds like coding with one hand tied on the back and then having the need to invent some workaround to counter the self caused limitations.
So the easy task of setting the status will now consist of:
- fetching all report fields (although we don't need the content for our task)
- fetching all associated side table fields (the uri-ids) (although this is of no interest here)
- processing the fields
- returning the content
- changing one value (the status)
- storing all fields in the report table
- delete all table entries for that report in the side table with the uri-ids
- insert all table entries again, although we haven't change anything there
And this process will even getting worse, when we have the need to set the status for a whole bunch of reports (when for example 10 people report the same, then we should be able to easily close 9 reports). In that case - instead of issuing a single update command with the array of the report ids - we had to execute the above steps in some loop. This will slow down the whole process and will increase the database load as well.
I understand the reason why to work with class variables instead of associated arrays when handling with fetched data. But I don't see a reason why to limit ourselves upon updating as well.
-
Embed this notice
@heluecht Some ORM systems have resolved this particular issue by saving the values as they were when the Entity was instantiated from the database as “clean”, and setting the fields that have been updated as “dirty”. Then when persisted, they only use the “dirty” fields. I do think we’d easily be able to implement a version of this clean (coming straight from the DB)/dirty (needs to be persisted) concept.
-
Embed this notice
@hypolite From a database perspective it is really bad practice to store data that is already stored. In this case it is especially bad since there are depending tables (like the one with the reported posts) that with each update is deleted and inserted over and over again.
Also when this technique was applied to some other places that have got a higher risk of race conditions we run into nightmare situations where some fields had been reverted to older situations because there had been some other (parallel) task between setting the class variables and saving all the fields. Let's say we had a situation where the one process set the avatar of a contact and a parallel process is setting the date of the last item. With the right (or better: bad) timing, the avatar or the date of last item was reverted when we worked that way.
I do get your point of input validation. But that solution opens up a bunch of other problems that are even worse than the original problem.