Re: How to hint 2 coulms IS NOT DISTINCT FROM each other

Kim Rose Carlsen <krc@xxxxxxxx> · Thu, 10 Nov 2016 18:40:30 +0000

> Hang on -- upthread the context was inner join, and the gripe was join

> fast with '=', slow with INDF.  When he said the nulls were

> 'generated', I didn't follow that they were part of the original

> query.  If the nulls are generated along with the query, sure, an

> index won't help.

>

> I maintain my earlier point; with respect to the original query, to

> get from performance of INDF to =, you have three options:

> a) expr index the nulls  (assuming they are physically stored)

> b) convert to ((a = b) or a is null and b is null) which can help with

> a bitmap or plan

> c) covert to union all equivalent of "b"

> 

> merlin

a) and b) would be workaround that would run an order of magnitude slower. The query 

starts with a full table scan of a large table. If the planner had started elsewhere it could

have reduced the result to 1-2 rows from the start. It won't choose this plan without the help

from =.

c) could be a acceptable workaround, but it would clutter up if you would want more 

than one column to be IS NOT DISTINCT FROM. You end up with 2^n unions to simulate 

IS NOT DISTINCT FROM.

Without knowing the work required, I will still argue that
having IS NOT DISTINCT FROM 
use the same transitive rules as equality,
  would be a
better approach. 

With fear of talking about things I know little(nothing) of, I think the description of EquivalenceClasses 
in postgres/src/backend/optimizer/README, should
 be extended to also include EquivalenceClasses
of IS NOT DISTINCT FROM.