Re: improving performance for a delete

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 20 May 2008 22:03:30 +0200, kevin kempter <kevin@xxxxxxxxxxxxxxxxxxx> wrote:

Version 8.3.1


On May 20, 2008, at 1:51 PM, kevin kempter wrote:

Hi all;

I have 2 tables where I basically want to delete from the first table (seg_id_tmp7) any rows where the entire row already exists in the second table (sl_cd_segment_dim)

I have a query that looks like this (and it's slow):


delete from seg_id_tmp7
where
	customer_srcid::text ||

Besides being slow as hell and not able to use any indexes, the string concatenation can also yield incorrect results, for instance :

season_name::text || episode_srcid::text

	Will have the same contents for

season_name='season 1' episode_srcid=12
season_name='season 11' episode_srcid=2

	I suggest doing it the right way, one possibility being :

test=> EXPLAIN DELETE from test where (id,value) in (select id,value from test2);
                               QUERY PLAN
-------------------------------------------------------------------------
 Hash IN Join  (cost=2943.00..6385.99 rows=2 width=6)
   Hash Cond: ((test.id = test2.id) AND (test.value = test2.value))
   ->  Seq Scan on test  (cost=0.00..1442.99 rows=99999 width=14)
   ->  Hash  (cost=1443.00..1443.00 rows=100000 width=8)
         ->  Seq Scan on test2  (cost=0.00..1443.00 rows=100000 width=8)

Thanks to the hash it is very fast, one seq scan on both tables, instead of one seq scan PER ROW in your query.

	Another solution would be :

test=> EXPLAIN DELETE FROM test USING test2 WHERE test.id=test2.id AND test.value=test2.value;
                               QUERY PLAN
-------------------------------------------------------------------------
 Hash Join  (cost=2943.00..6385.99 rows=2 width=6)
   Hash Cond: ((test.id = test2.id) AND (test.value = test2.value))
   ->  Seq Scan on test  (cost=0.00..1442.99 rows=99999 width=14)
   ->  Hash  (cost=1443.00..1443.00 rows=100000 width=8)
         ->  Seq Scan on test2  (cost=0.00..1443.00 rows=100000 width=8)
	
Which chooses the same plan here, quite logically, as it is the best one in this particular case.


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux