On Sat, 17 Aug 2024, Tom Lane wrote:
Well, yes: the two aggregates (array_agg and count) are computed concurrently in a single Aggregate plan node scanning the output of the JOIN. There's no way to apply the HAVING filter until after the aggregation is finished. I think this approach is basically forced by the SQL standard's semantics for grouping/aggregation.
FWIW I also tried:

    HAVING array_length(array_agg(run_n), 1) < 10;

but I saw the same amount of temp files, at least in the short duration of my test run.

Thank you, I will split this into two passes like you suggested. It's just that I'm doing another 3 passes over this table for different things I calculate (different GROUP BY, different WHERE clauses) and I was hoping to minimize the time spent. But avoiding the array_agg() over everything is my top priority ATM so I'll definitely try.

Regards,
Dimitris
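
For reference, a rough sketch of what the two-pass split could look like. The real query is not shown in this message, so everything except run_n is a placeholder: joined_input stands in for the output of the JOIN, grp_key for the GROUP BY column(s), and small_groups is a made-up temp table name. It also assumes the filter can be expressed with count(*) alone, which matches array_length(array_agg(run_n), 1) for non-empty groups since array_agg keeps NULLs.

```sql
-- Pass 1: decide which groups survive the filter using only count(*),
-- so no arrays are built for groups that would be discarded anyway.
-- (joined_input, grp_key and small_groups are hypothetical names.)
CREATE TEMP TABLE small_groups AS
SELECT grp_key
FROM joined_input
GROUP BY grp_key
HAVING count(*) < 10;

-- Pass 2: run array_agg() only over rows belonging to surviving groups.
SELECT grp_key, array_agg(run_n) AS run_ns
FROM joined_input
JOIN small_groups USING (grp_key)
GROUP BY grp_key;
```

The trade-off is a second scan of the input in exchange for never materializing arrays for the large groups; whether that is a net win would need to be checked with EXPLAIN (ANALYZE, BUFFERS) on the real data.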