The startup cost is pretty expensive. This seems to be a common issue with partition-wise joins.
I attached a simplified reproducer. Thanks for having a look!
Regards
Arne
From: Tom Lane <tgl@xxxxxxxxxxxxx>
Sent: Friday, February 26, 2021 4:00:18 AM
To: Arne Roland
Cc: pgsql-performance@xxxxxxxxxxxxxx
Subject: Re: Disabling options lowers the estimated cost of a query
Arne Roland <A.Roland@xxxxxxxx> writes:
> I want to examine the exhaustive search and not the geqo here. I'd expect the exhaustive search to give the plan with the lowest cost, but apparently it doesn't. I have found a few dozen different queries where that isn't the case. I attached one straightforward example. For the join of two partitions, a row-first approach would have been reasonable.
Hmm. While the search should be exhaustive, there are pretty aggressive
pruning heuristics (mostly in and around add_path()) that can cause us to
drop paths that don't seem to be enough better than other alternatives.
I suspect that the seqscan plan may have beaten out the other one at
some earlier stage that didn't think that the startup-cost advantage
was sufficient reason to keep it.
It's also possible that you've found a bug. I notice that both
plans are using incremental sort, which has been, um, rather buggy.
Hard to tell without a concrete test case to poke at.
regards, tom lane
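
The pruning described above can be illustrated with a toy model. This is a sketch under stated assumptions, not PostgreSQL's actual code: the `Path` class, the `dominates` helper, the 1% fuzz factor, and the `consider_startup` switch are simplifications introduced here for illustration, and the real add_path() also weighs sort order, row counts, parameterization, and parallel safety. It shows how a path with a much lower startup cost can be discarded when only total cost is compared.

```python
from dataclasses import dataclass

# Assumed fuzz factor: costs within 1% of each other are treated as equal.
FUZZ = 1.01


@dataclass(frozen=True)
class Path:
    """Hypothetical stand-in for a planner path; only costs are modeled."""
    name: str
    startup_cost: float
    total_cost: float


def dominates(a: Path, b: Path, consider_startup: bool) -> bool:
    """Return True if keeping path a makes path b not worth keeping."""
    # a must not be clearly worse than b on total cost.
    if a.total_cost > b.total_cost * FUZZ:
        return False
    # If startup cost matters, b survives when it is clearly cheaper to start.
    if consider_startup and b.startup_cost * FUZZ < a.startup_cost:
        return False
    return True


def add_path(paths: list, new: Path, consider_startup: bool) -> list:
    """Add new to paths, pruning whatever is dominated (possibly new itself)."""
    if any(dominates(old, new, consider_startup) for old in paths):
        return paths  # new path is rejected
    kept = [old for old in paths if not dominates(new, old, consider_startup)]
    kept.append(new)
    return kept


seqscan = Path("seqscan", startup_cost=100.0, total_cost=1000.0)
indexscan = Path("indexscan", startup_cost=1.0, total_cost=1200.0)

# Startup cost ignored: the fast-start path is pruned.
without_startup = add_path(add_path([], seqscan, False), indexscan, False)
# Startup cost considered: both paths survive to the next planning level.
with_startup = add_path(add_path([], seqscan, True), indexscan, True)

print([p.name for p in without_startup])
print([p.name for p in with_startup])
```

In this model, a path dropped at one join level is never seen by the levels above it, so a later step that would have benefited from the cheap startup (a LIMIT, for example) cannot recover it. That would be consistent with a plan getting cheaper when an option is disabled, though the toy says nothing about whether that is what happens in the reproducer.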
Attachment: optimizer_first_rows.sql