On Wed, Jun 10, 2015 at 9:39 AM, Johann Spies <johann.spies@xxxxxxxxx> wrote:
> COPY
>   (SELECT A.ut,
>           B.go AS funding_org,
>           B.gn AS grant_no,
>           C.gt AS thanks,
>           D.au
>    FROM isi.funding_text C,
>         isi.rauthor D,
>         isi.africa_uts A
>    LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
>    WHERE (C.ut IS NOT NULL
>           OR B.ut IS NOT NULL)
>      AND D.rart_id = C.ut
>      AND C.ut = B.ut
>    GROUP BY A.ut,
>             GO,
>             gn,
>             gt,
>             au
>    ORDER BY funding_org) TO '/tmp/africafunding2.csv' WITH csv quote '"'
> DELIMITER ',';
>
>
> A modified version of this query finished in 1min 27 sek:
>
> COPY
>   (SELECT 'UT'||A.ut,
>           B.go AS funding_org,
>           B.gn AS grant_no,
>           C.gt AS thanks
>    FROM isi.africa_uts A
>    LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
>    LEFT JOIN isi.funding_text C ON (A.ut = C.ut)
>    WHERE (C.ut IS NOT NULL
>           OR B.ut IS NOT NULL)
>    GROUP BY A.ut,
>             GO,
>             gn,
>             gt) TO '/tmp/africafunding.csv' WITH csv quote '"' DELIMITER
> ',';
>
>
> As I said, the process of 'explain analyze' of the problematic query
> contributed to the 173GB
> temporary files and did not finish in about 16 hours.

The joins differ between the two versions, and the most likely culprit
is the join against D (isi.rauthor), which appears only in the first
query. That join condition is probably wrong, so the first query is
building a Cartesian product. Without more information about the schema
it is difficult to be sure, though.
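
One way to check this without waiting 16 hours might be the sketch below. It assumes only the table and column names from the quoted queries; the alias "matches" and the LIMIT are illustrative choices. Plain EXPLAIN (without ANALYZE) only plans the query and does not execute it, so it should return quickly and show the planner's row estimates for each join. The second statement counts how many isi.rauthor rows match each isi.funding_text row under the suspect condition; it does run a join, so it may take a while on large tables.

```sql
-- Plan only: EXPLAIN without ANALYZE does not execute the query,
-- so no temporary files are written. Look for join nodes with
-- very large estimated row counts.
EXPLAIN
SELECT A.ut,
       B.go AS funding_org,
       B.gn AS grant_no,
       C.gt AS thanks,
       D.au
FROM isi.funding_text C,
     isi.rauthor D,
     isi.africa_uts A
LEFT JOIN isi.funding_org B ON (B.ut = A.ut)
WHERE (C.ut IS NOT NULL
       OR B.ut IS NOT NULL)
  AND D.rart_id = C.ut
  AND C.ut = B.ut
GROUP BY A.ut, go, gn, gt, au
ORDER BY funding_org;

-- Fan-out check for the suspect join: how many isi.rauthor rows
-- match a single isi.funding_text row on D.rart_id = C.ut?
SELECT C.ut,
       count(*) AS matches
FROM isi.funding_text C
JOIN isi.rauthor D ON D.rart_id = C.ut
GROUP BY C.ut
ORDER BY matches DESC
LIMIT 10;
```

If the estimates or the per-row match counts are far larger than the number of authors a record could plausibly have, that would support the Cartesian-product theory, and the join column for D would be the first thing to revisit.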