Hi,

> On 20. Aug 2019, at 19:32, Andres Freund <andres@xxxxxxxxxxx> wrote:
>
> Hi,
>
> On 2019-08-20 17:11:58 +0200, Felix Geisendörfer wrote:
>>
>> HashAggregate (cost=80020.01..100020.01 rows=2000000 width=8) (actual time=19.349..23.123 rows=1 loops=1)
>
> FWIW, that's not a mis-estimate I'm getting on master ;). Obviously
> that doesn't actually address your concern...

I suppose this is thanks to the new optimizer support functions
mentioned by Michael and Pavel?

Of course I'm very excited about those improvements, but yeah, my real
query is being mis-estimated for totally different reasons that don't
involve any SRFs.

>> I'm certainly a novice when it comes to PostgreSQL internals, but I'm
>> wondering if this could be fixed by taking a more dynamic approach for
>> allocating HashAggregate hash tables?
>
> Under-sizing the hashtable just out of caution will add overhead to
> a lot more common cases. That requires copying data around during
> growth, which is far far from free. Or you can use hashtables that don't
> need to copy, but they're also considerably slower in the more common
> cases.

How does PostgreSQL currently handle the case where the initial hash
table is under-sized due to the planner having underestimated things?
Are the growth costs amortized by using an exponential growth function?

Anyway, I can accept that my situation is an edge case that doesn't
justify making things more complicated.

>> 3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of the query execution instead of 45ms [5].
>
> Well, there's plenty work that's not attributed to nodes. IIRC we don't
> track executor startup/shutdown overhead on a per-node basis.

I didn't know that, thanks for clarifying : ).
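
P.S. For what it's worth, here is a toy sketch of the kind of amortization
I'm asking about above. This is not PostgreSQL's actual hash table code,
just an illustration of an open-addressing table that doubles its capacity
and re-inserts every entry on growth (the "copying data around" cost), with
doubling keeping that cost amortized:

/*
 * Toy sketch (my mental model, not PostgreSQL's implementation): a hash
 * table of uint64 keys that doubles its capacity whenever the load factor
 * gets too high.  Growing re-inserts every existing entry into the new
 * array, but because capacity doubles each time, each key is moved only
 * O(1) times on average over the life of the table.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    uint64_t *keys;      /* 0 means "empty slot" in this toy example */
    size_t    capacity;  /* always a power of two */
    size_t    used;
} toytab;

static uint64_t
toy_hash(uint64_t k)
{
    /* cheap mixer; good enough for a demo */
    k ^= k >> 33;
    k *= 0xff51afd7ed558ccdULL;
    k ^= k >> 33;
    return k;
}

static void
toy_insert_nogrow(toytab *t, uint64_t key)
{
    size_t mask = t->capacity - 1;
    size_t i = toy_hash(key) & mask;

    /* linear probing; the table is never allowed to fill up completely */
    while (t->keys[i] != 0 && t->keys[i] != key)
        i = (i + 1) & mask;
    if (t->keys[i] == 0)
    {
        t->keys[i] = key;
        t->used++;
    }
}

static void
toy_grow(toytab *t)
{
    uint64_t *oldkeys = t->keys;
    size_t    oldcap = t->capacity;

    /* exponential growth: double the capacity ... */
    t->capacity *= 2;
    t->keys = calloc(t->capacity, sizeof(uint64_t));
    t->used = 0;

    /* ... and re-insert (copy) every existing entry */
    for (size_t i = 0; i < oldcap; i++)
        if (oldkeys[i] != 0)
            toy_insert_nogrow(t, oldkeys[i]);
    free(oldkeys);
}

static void
toy_insert(toytab *t, uint64_t key)
{
    /* grow before the table gets too dense (load factor ~0.75 here) */
    if (t->used + 1 > t->capacity - t->capacity / 4)
        toy_grow(t);
    toy_insert_nogrow(t, key);
}

int
main(void)
{
    toytab t = { calloc(16, sizeof(uint64_t)), 16, 0 };

    /*
     * Simulate a bad planner estimate: start with only 16 slots, then
     * insert a million keys.  The table grows many times, yet the total
     * number of re-inserted entries stays a small constant multiple of
     * the final entry count, so the per-insert overhead is O(1) amortized.
     */
    for (uint64_t k = 1; k <= 1000000; k++)
        toy_insert(&t, k);
    printf("entries: %zu, capacity: %zu\n", t.used, t.capacity);
    free(t.keys);
    return 0;
}

The point being that even when the table starts out badly under-sized,
doubling keeps the total re-insertion work proportional to the final number
of entries, which is why I was wondering whether the copying overhead is
really that painful in the underestimated case.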