On 12/05/2019 20:08, Tom Lane wrote:
> Ancoron Luciferis <ancoron.luciferis@xxxxxxxxxxxxxx> writes:
>> One of the things consuming most of the time was an Index Only Scan
>> executed millions of times. And on top came the Nested Loop which
>> finally reduced the rows but also took a lot of time to do so.
>
>> Explain plan: https://explain.depesz.com/s/4GYT
>
> The core problem you've got there is the misestimation of the join size:
>
> Nested Loop (cost=0.71..0.30 rows=72,839,557 width=33) (actual time=19,504.096..315,933.158 rows=274 loops=1)
>
> Anytime the planner is off by a factor of 250000x, it's not going to end
> well. In this case, it's imagining that the LIMIT will kick in after just
> a very small part of the join is executed --- but in reality, the LIMIT
> is larger than the join output, so that we have to execute the whole join.
> With a more accurate idea of the join result size, it would have chosen
> a different plan.
>
> What you ought to look into is why is that estimate so badly off.
> Maybe out-of-date stats, or you need to raise the stats target for
> one or both tables?

I thought so as well, which is why I started investigating. But even after
creating my own data set and running a final ANALYZE on both tables, I
ended up with a similarly large gap between estimated and actual rows:

https://explain.depesz.com/s/R7jp

Nested Loop (cost=25.17..514,965,251.12 rows=27,021,979 width=56) (actual time=0.568..5.686 rows=274 loops=1)

...but this one was fast thanks to the Merge Append and the pushed-down
LIMIT.

>> My question now is why can't the optimizer generate a plan that in this
>> case does 114 loops of "events" scans instead of a million loops on the
>> "subscription_signal"?
>
> I don't see any "events" table in that query, so this question isn't
> making a lot of sense to me. But in any case, the answer probably boils
> down to "it's guessing that a plan like this will stop early without
> having to scan all of the large table".
>
> regards, tom lane

Yes, the Timescale extension mangles the partition names quite a lot.

I wonder if it would be possible to hold the result of the smaller
estimated reference set (the 114 subscription_signal.signal_id entries in
this case) in a VALUES list and then use that to filter the table with the
larger estimate, instead of looping over it.

Cheers,

Ancoron
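
P.S.: For reference, raising the per-column statistics target as suggested
above would look roughly like this. This is only a sketch: the column and
the target value of 1000 are assumptions, and the right column to adjust
depends on the actual join condition.

    -- Raise the per-column statistics target (default is 100) so ANALYZE
    -- samples more rows and builds a larger MCV list / histogram.
    ALTER TABLE subscription_signal
        ALTER COLUMN signal_id SET STATISTICS 1000;

    -- Re-collect statistics so the planner sees the new target.
    ANALYZE subscription_signal;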
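
To make the VALUES idea above a bit more concrete, a rough sketch of what I
mean (table and column names and the LIMIT are placeholders only, since the
real query isn't shown in this thread):

    -- Materialize the small set of ids once, then filter the large table
    -- with it instead of probing it once per outer row.
    WITH signal_ids (signal_id) AS (
        VALUES (1), (2), (3)          -- ...the ~114 signal_id values
    )
    SELECT e.*
    FROM   events e
    WHERE  e.signal_id IN (SELECT signal_id FROM signal_ids)
    LIMIT  500;                       -- placeholder limit

The hope would be that the planner hashes or sorts the small VALUES set
once and scans the large table (or its partitions) a single time, rather
than doing a million index probes in a nested loop.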