Re: is (not) distinct from

Adrian Klaver <adrian.klaver@xxxxxxxxxxx> · Tue, 28 Feb 2017 07:06:31 -0800

On 02/28/2017 12:08 AM, Johann Spies wrote:
When I query table a I get 18 rows.
The same query on table b results in 28 rows.

Both tables have the same structure.

When I export the results to csv-files and do a diff it confirms that
all 18 rows from a are also in b. Table b has 10 new rows.

When I combine these queries and use "is (not) distinct from"
I get strange results:

with a as (select citing_article, cited_article, pubyear, year_cited,
cited_author, cited_title, cited_work, doi
from wos_2017_1.citation
where citing_article='abcdefg'
order by 3,4,5,6,8),
b as (
select citing_article, cited_article, pubyear, year_cited, cited_author,
cited_title, cited_work, doi
from wos_2017_1.citationbackup
where citing_article='abcdefg'
order by 3,4,5,6,8)
select distinct b.* from b , a
where
( B.citing_article,
             B.cited_article,
             B.pubyear,
             B.year_cited,
             B.cited_author,
             B.cited_title,
             B.cited_work,
B.doi)
 is distinct from
        (A.citing_article,
             A.cited_article,
             A.pubyear,
             A.year_cited,
             A.cited_author,
             A.cited_title,
             A.cited_work, A.doi)

The result of this query is 28 rows - thus that of b.
I expected this to be 10.

If I change the query to "is not distinct from" it results in 18 rows
which is what I would have expected.

I have not worked through all this but at first glance I suspect:

select distinct b.* from b ...

is distinct from ...

constitutes a double negative.

What happens if you eliminate the first distinct?

Regards
Johann.
--
Because experiencing your loyal love is better than life itself,
my lips will praise you.  (Psalm 63:3)

--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general