On 02/28/2017 12:08 AM, Johann Spies wrote:
When I query table a I get 18 rows. The same query on table b results in 28 rows. Both tables have the same structure. When I export the results to csv-files and do a diff it confirms that all 18 rows from a are also in b. Table b has 10 new rows. When I combine these queries and use "is (not) distinct from" I get strange results: with a as (select citing_article, cited_article, pubyear, year_cited, cited_author, cited_title, cited_work, doi from wos_2017_1.citation where citing_article='abcdefg' order by 3,4,5,6,8), b as ( select citing_article, cited_article, pubyear, year_cited, cited_author, cited_title, cited_work, doi from wos_2017_1.citationbackup where citing_article='abcdefg' order by 3,4,5,6,8) select distinct b.* from b , a where ( B.citing_article, B.cited_article, B.pubyear, B.year_cited, B.cited_author, B.cited_title, B.cited_work, B.doi) is distinct from (A.citing_article, A.cited_article, A.pubyear, A.year_cited, A.cited_author, A.cited_title, A.cited_work, A.doi) The result of this query is 28 rows - thus that of b. I expected this to be 10. If I change the query to "is not distinct from" it results in 18 rows which is what I would have expected.
I have not worked through all this but at first glance I suspect: select distinct b.* from b ... is distinct from ... constitutes a double negative. What happens if you eliminate the first distinct?
Regards Johann. -- Because experiencing your loyal love is better than life itself, my lips will praise you. (Psalm 63:3)
-- Adrian Klaver adrian.klaver@xxxxxxxxxxx -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general