On Thu, Jun 2, 2022 at 12:32 AM Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Jeff Janes <jeff.janes@xxxxxxxxx> writes:
> On Tue, May 31, 2022 at 4:04 PM Tim Kelly <gtkelly@xxxxxxxxxxxxxxxxx> wrote:
>> I do not see evidence that the nested loop is trying to reduce overhead
>> by using the smaller set. It seems to want to scan on data first either
>> way.
> The planner probably doesn't know which one is smaller.
There is not a lot of daylight between the cost estimates for
"a nestloop-join b" and "b nestloop-join a", if we're considering
plain seqscans on both tables and all else is equal. It tends to
come down to factors like which one is more densely populated.
As best I can tell, the issue Tim's unhappy about is not so
much the use of a nestloop as the lack of use of any index.
But it is using an index on one of the tables, on "id". There is no reason it would not be able to reverse that, doing the seq scan on the smaller (in assumed bytes) table and using the id index on the larger (in bytes) table, to great effect.
Based on the timing of the simple counts on "data" with and without the WHERE, I'm pretty sure that what is going on here is that data.content is large and resides mostly in TOAST. When TOAST needs to be accessed it is much slower than when it doesn't. And that is also the cause of the estimation problem, the oversized values are just assumed to be distinct, and no histogram is generated. Without histogram boundaries to serve as a randomish sample, the selectivity estimate falls back to something even worse (pattern length), and gets it massively wrong.
But "string like '%foo%'" is not at all optimizable with a
btree index. You might be able to get somewhere with a
pg_trgm GIN or GIST index.
I agree with the recommendation, but not really the reasoning. Having the pg_trgm index on metadata.author is might be even better than just getting the planner to do the right thing without the index, but getting the planner to do the right thing even without the index would also be a big improvement over the current plan, if there were just a way to do it. If the planner explicitly accounted for TOAST cost, that would probably do it. Or if the selectivity estimates on data.content were better, that would too. But Tim can't reasonably do anything about those things, while he can build the index.
Another thing he could try would be to force the correct index use by using the inner join, but writing the join condition as "on data.id = metadata.id||''"
Cheers,
Jeff