Re: Finding rows in table T1 that DO NOT MATCH any row in table T2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Tom,

1. Actually I just tested you suggestion

SELECT COUNT (*) FROM T1 where NOT EXISTS
(SELECT 1 FROM T2 where T1.PK = T2.FK)

and in worked in PG 8.3.8. On a DB with 6M T1 records and 5M T2 records it took 1m8s,

My suggestion, i.e.

SELECT COUNT(*) FROM T1 LEFT JOIN T2 ON T1.PK = T2.FK
WHERE T2.FK IS NULL

was about twice as fast, 37s. (both returned same number of rows, about 2/3 of T1)

However I can use DELETE with your version (instead of "SELECT COUNT (*)" above) but not with mine (can't have LEFT JOIN in DELETE), so YOU WIN. Thanks!

2. BTW. I presented my question earlier in an overly simplified fashion. Sorry. In actuality the two tables are joined on two columns,
say Ka and Kb (a composite key column), e.g. T1.PKa = T2.FKa and T1.PKb = T2.FKb. So the IN versions suggested will not work
since AFAIK IN only works for a single value.

-- Shaul

On Tue, Oct 20, 2009 at 3:59 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Shaul Dar <shauldar@xxxxxxxxx> writes:
> I assume this will work but will take a long time:

> DELETE * FROM T1 where T1.PK NOT IN
> (SELECT T1.PK FROM T1, T2 where T1.PK = T2.FK)

Well, yeah, but it's unnecessarily inefficient --- why not just

DELETE FROM T1 where T1.PK NOT IN
(SELECT T2.FK FROM T2)

However, that still won't be tremendously fast unless the subselect fits
in work_mem.  As of 8.4 this variant should be reasonable:

DELETE FROM T1 where NOT EXISTS
(SELECT 1 FROM T2 where T1.PK = T2.FK)

Pre-8.4 you should resort to the "left join where is null" trick,
but there's no need to be so obscure as of 8.4.

                       regards, tom lane


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux