I ended up with a PostgreSQL query that takes 1.5h to transform a data set of 6GB.🤨
Feels like an eternity. 🤔
https://gist.github.com/smarr/5a30c34cc38e274f634b151d9763f93c
I ended up with a PostgreSQL query that takes 1.5h to transform a data set of 6GB.🤨
Feels like an eternity. 🤔
https://gist.github.com/smarr/5a30c34cc38e274f634b151d9763f93c
Virtually all your time in that query is spent on the left join. Which leads me to two questions.
1) are you creating a a index on runId, trialId, criterion, invocation;
2) why are you using group by, its been a little bit since i had to write an sql statement but since you are creating a table im inclear why you have groupby in there.
@freemo@mastodon.acm.org
The concrete query is in the Gist here: https://gist.github.com/smarr/5a30c34cc38e274f634b151d9763f93c
On the partial table, an index on (runId, trialId, criterion, invocation) helps. Though, since this converts the full table, adding that index doesn't make any performance difference for the full conversion.
I added a comment on the Gist with the `EXPLAIN ANALYZE` output. It's the first time doing performance work on this type of stuff, so, many unknowns on my end....
@smarr fyi, i need the exact query you used to help out.
@freemo hm, I would assume there's an index on the primary key:
https://github.com/smarr/ReBenchDB/blob/2414426bb93a9c170004a97207593217f02f622b/src/backend/db/db.sql#L152
Would you have a suggestion what other indexes might be helpful here? Thanks.
@smarr indexes are your friend :)
@smarr can you share with me the table schema, assuming im being helpful at all. If this isnt a problem you actually need to fix let me know. But if its a problem you need a hand on id be happy to dig deeper and see if i can help.
@freemo 1) yes. there's an index. And the explain shows it's used with the index scan.
2) The group by is needed to get the iteration data together from which the array is created.
The long story is given here: https://stackoverflow.com/q/78130781/916546 with examples how the data looks like.
Sounds a lot less trivial than I first suspected considering the amount of work you did framing the question.. I am looking, the schema was very helpful thanks.
I was very good at this sort of problem back when i did it. Its been a few years, so I am a bit rusty. But I'll do my best to help.
It is late, so I dont expect a solution tonight, but if I have any insights ill let ya know. At first glance I fail to see why this is even difficult, which is likely a failing on my part.
@freemo the full schema is here:
GNU social JP is a social network, courtesy of GNU social JP管理人. It runs on GNU social, version 2.0.2-dev, available under the GNU Affero General Public License.
All GNU social JP content and data are available under the Creative Commons Attribution 3.0 license.