On 12/05/2019 20:08, Tom Lane wrote:
> Ancoron Luciferis <ancoron.luciferis@xxxxxxxxxxxxxx> writes:
>> One of the things consuming most of the time was an Index Only Scan
>> executed millions of times. And on top came the Nested Loop which
>> finally reduced the rows but also took a lot of time to do so.
>
>> Explain plan: https://explain.depesz.com/s/4GYT
>
> The core problem you've got there is the misestimation of the join size:
>
> Nested Loop (cost=0.71..0.30 rows=72,839,557 width=33) (actual time=19,504.096..315,933.158 rows=274 loops=1)
>
> Anytime the planner is off by a factor of 250000x, it's not going to end
> well. In this case, it's imagining that the LIMIT will kick in after just
> a very small part of the join is executed --- but in reality, the LIMIT
> is larger than the join output, so that we have to execute the whole join.
> With a more accurate idea of the join result size, it would have chosen
> a different plan.
>
> What you ought to look into is why is that estimate so badly off.
> Maybe out-of-date stats, or you need to raise the stats target for
> one or both tables?

I thought so as well, which is why I started investigating. But even after
creating my own data set and running a final ANALYZE on both tables, I
ended up with a similarly large gap between estimated and actual rows:

https://explain.depesz.com/s/R7jp

Nested Loop (cost=25.17..514,965,251.12 rows=27,021,979 width=56) (actual time=0.568..5.686 rows=274 loops=1)

...but this one was fast thanks to the Merge Append and the pushed-down
LIMIT.

>> My question now is why can't the optimizer generate a plan that in this
>> case does 114 loops of "events" scans instead of a million loops on the
>> "subscription_signal"?
>
> I don't see any "events" table in that query, so this question isn't
> making a lot of sense to me. But in any case, the answer probably boils
> down to "it's guessing that a plan like this will stop early without
> having to scan all of the large table".
>
> regards, tom lane

Yes, the Timescale extension mangles the partition names quite a lot.

I wonder if it would be possible to hold the result of the smaller
estimated reference set (the 114 subscription_signal.signal_id entries in
this case) in a VALUES list and then use that to filter the table with the
larger estimate, instead of looping over it.

Cheers,

Ancoron
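
P.S.: For reference, raising the per-column statistics target as suggested
above would look roughly like this. This is only a sketch: the column and
the target value of 1000 are assumptions, and the right column to adjust
depends on the actual join condition.

    -- Raise the per-column statistics target (default is 100) so ANALYZE
    -- samples more rows and builds a larger MCV list / histogram.
    ALTER TABLE subscription_signal
        ALTER COLUMN signal_id SET STATISTICS 1000;

    -- Re-collect statistics so the planner sees the new target.
    ANALYZE subscription_signal;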
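
To make the VALUES idea above a bit more concrete, a rough sketch of what I
mean (table and column names and the LIMIT are placeholders only, since the
real query isn't shown in this thread):

    -- Materialize the small set of ids once, then filter the large table
    -- with it instead of probing it once per outer row.
    WITH signal_ids (signal_id) AS (
        VALUES (1), (2), (3)          -- ...the ~114 signal_id values
    )
    SELECT e.*
    FROM   events e
    WHERE  e.signal_id IN (SELECT signal_id FROM signal_ids)
    LIMIT  500;                       -- placeholder limit

The hope would be that the planner hashes or sorts the small VALUES set
once and scans the large table (or its partitions) a single time, rather
than doing a million index probes in a nested loop.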