1) OS/Configuration 64-bit Windows 7 Enterprise with 8G RAM and a Dual Core 2.67 Ghz Intel CPU postgresql-x64-9.0 (PostgreSQL 9.0.1, compiled by Visual C++ build 1500, 64-bit) work_mem 2GB shared_buffers = 2 2) Dataset name,pages,tuples,pg_size_pretty "pivotbad";1870;93496;"15 MB" "pivotgood";5025;251212;"39 MB" 3) EXPLAIN (ANALYZE ON, BUFFERS ON) "Hash Join (cost=16212.30..52586.43 rows=92869 width=17) (actual time=25814.222..32296.765 rows=3163 loops=1)" " Hash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text = (pg.question)::text))" " Join Filter: ((COALESCE(pb.response, 'MISSING'::character varying))::text <> (COALESCE(pg.response, 'MISSING'::character varying))::text)" " Buffers: shared hit=384 read=6511, temp read=6444 written=6318" " -> Seq Scan on pivotbad pb (cost=0.00..2804.96 rows=93496 width=134) (actual time=0.069..37.143 rows=93496 loops=1)" " Buffers: shared hit=192 read=1678" " -> Hash (cost=7537.12..7537.12 rows=251212 width=134) (actual time=24621.752..24621.752 rows=251212 loops=1)" " Buckets: 1024 Batches: 64 Memory Usage: 650kB" " Buffers: shared hit=192 read=4833, temp written=4524" " -> Seq Scan on pivotgood pg (cost=0.00..7537.12 rows=251212 width=134) (actual time=0.038..117.780 rows=251212 loops=1)" " Buffers: shared hit=192 read=4833" "Total runtime: 32297.305 ms" 4) INDEXES I can certainly add an index but given the table sizes I am not sure if that is a factor. This by no means is a large dataset less than 350,000 rows in total and 3 columns. Also this was just a quick dump of data for comparison purpose. When I saw the poor performance on the COALESCE, I pointed the data load to SQL Server and ran the same query except with the TSQL specific ISNULL function. |