2010/11/17 Humair Mohammed <humairm@xxxxxxxxxxx>: > > There are no indexes on the tables either in SQL Server or Postgresql - I am > comparing apples to apples here.ÂI ran ANALYZE on the postgresql tables, > after that query performance times are still high 42 seconds with COALESCE > and 35 seconds with IS DISTINCT FROM. > Here is the execution plan from Postgresql for qurey -Âselect pb.id from > pivotbad pb inner join pivotgood pg on pb.id = pg.id and pb.question = > pg.question and coalesce(pb.response,'MISSING') <> > coalesce(pg.response,'MISSING') > Execution Time: 42 seconds > "Hash Join Â(cost=16212.30..48854.24 rows=93477 width=17)" > " ÂHash Cond: (((pb.id)::text = (pg.id)::text) AND ((pb.question)::text = > (pg.question)::text))" > " ÂJoin Filter: ((COALESCE(pb.response, 'MISSING'::character varying))::text > <> (COALESCE(pg.response, 'MISSING'::character varying))::text)" > " Â-> ÂSeq Scan on pivotbad pb Â(cost=0.00..2804.96 rows=93496 width=134)" > " Â-> ÂHash Â(cost=7537.12..7537.12 rows=251212 width=134)" > " Â Â Â Â-> ÂSeq Scan on pivotgood pg Â(cost=0.00..7537.12 rows=251212 > width=134)" this is little bit strange - did you ANALYZE and VACUUM? please send result of EXPLAIN ANALYZE Pavel > > And here is the execution plan from SQL Server for query -Âselect pb.id from > pivotbad pb inner join pivotgood pg on pb.id = pg.id and pb.question = > pg.question and isnull(pb.response,'ISNULL')<> Âisnull(pg.response,'ISNULL') > Execution Time: < 1 second > Cost: 1% Â|--Parallelism(Gather Streams) > Cost: 31% Â Â Â |--Hash Match(Inner Join, HASH:([pb].[ID], > [pb].[Question])=([pg].[ID], [pg].[Question]), > RESIDUAL:([master].[dbo].[pivotbad].[ID] as > [pb].[ID]=[master].[dbo].[pivotgood].[ID] as [pg].[ID] AND > [master].[dbo].[pivotbad].[Question] as > [pb].[Question]=[master].[dbo].[pivotgood].[Question] as [pg].[Question] AND > [Expr1006]<>[Expr1007])) > ÂÂ ÂCost: 0% Â|--Bitmap(HASH:([pb].[ID], [pb].[Question]), > DEFINE:([Bitmap1008])) > ÂÂ Â Â Â Â ÂCost: 0% Â Â|--Compute > Scalar(DEFINE:([Expr1006]=isnull([master].[dbo].[pivotbad].[Response] as > [pb].[Response],'ISNULL'))) > ÂÂ Â Â Â Â ÂCost: Â6% Â |--Parallelism(Repartition Streams, Hash > Partitioning, PARTITION COLUMNS:([pb].[ID], [pb].[Question])) > ÂÂ Â Â Â Â ÂCost: 12% Â|--Table Scan(OBJECT:([master].[dbo].[pivotbad] AS > [pb])) > ÂÂ Â Â Â Â ÂCost: 0% |--Compute > Scalar(DEFINE:([Expr1007]=isnull([master].[dbo].[pivotgood].[Response] as > [pg].[Response],'ISNULL'))) > ÂÂ Â Â Â Â Â Â ÂCost: 17% |--Parallelism(Repartition Streams, Hash > Partitioning, PARTITION COLUMNS:([pg].[ID], [pg].[Question])) > ÂÂ Â Â Â Â Â Â Â Â ÂCost: 33% |--Table > Scan(OBJECT:([master].[dbo].[pivotgood] AS [pg]), > WHERE:(PROBE([Bitmap1008],[master].[dbo].[pivotgood].[ID] as > [pg].[ID],[master].[dbo].[pivotgood].[Question] as [pg].[Question]))) > > > >> From: pavel.stehule@xxxxxxxxx >> Date: Tue, 16 Nov 2010 08:12:03 +0100 >> Subject: Re: >> To: humairm@xxxxxxxxxxx >> CC: pgsql-performance@xxxxxxxxxxxxxx >> >> 2010/11/15 Humair Mohammed <humairm@xxxxxxxxxxx>: >> > I have 2 tables with a 200,000 rows of data 3 character/string columns >> > ID, >> > Question and Response. The query below compares the data between the 2 >> > tables based on ID and Question and if the Response does not match >> > between >> > the left table and the right table it identifies the ID's where there is >> > a >> > mismatch. Running the query in SQL Server 2008 using the ISNULL function >> > take a few milliseconds. Running the same query in Postgresql takes over >> > 70 >> > seconds. The 2 queries are below: >> > SQL Server 2008 R2 Query >> > select t1.id from table1 t1 inner join table2 t2 on t1.id = t2.id and >> > t1.question = t2.question and isnull(t1.response,'ISNULL') <> >> > isnull(t2.response,'ISNULL') >> >> > Postgres 9.1 Query >> > select t1.id from table1 t1 inner join table2 t2 on t1.id = t2.id and >> > t1.question = t2.question and coalesce(t1.response,'ISNULL') <> >> > coalesce(t2.response,'ISNULL') >> > What gives? >> >> I think, so must problem can be in ugly predicate >> coalesce(t1.response,'ISNULL') <> >> > coalesce(t2.response,'ISNULL') >> >> try use a IS DISTINCT OF operator >> >> ... AND t1.response IS DISTINCT t2.response >> >> Regards >> >> Pavel Stehule >> >> p.s. don't use a coalesce in WHERE clause if it is possible. > -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance