Re: "SELECT .. WHERE NOT IN" query running for hours

Mladen Gogala <mladen.gogala@xxxxxxxxxxx> · Thu, 06 Jan 2011 23:25:45 -0500

On 1/6/2011 9:36 PM, Γιωργος Βαλκανας wrote:

1) Why is it taking *so* long for the first query (with the "NOT IN" ) 
to do even the simple select?
Because NOT IN has to execute the correlated subquery for every row and 
then check whether the requested value is in the result set, usually by 
doing sequential comparison. The NOT EXIST plan is also bad because 
there is no index but at least it can use very fast and efficient hash 
algorithm. Indexing the "hwdocid" column on the "Document" table or, 
ideally, making it a primary key, should provide an additional boost to 
your query. If you already do have an index, you may consider using 
enable_seqscan=false for this session, so that the "hwdocid" index will 
be used. It's a common wisdom that in the most cases NOT EXISTS will 
beat NOT IN. That is so all over the database world. I've seen that in 
Oracle applications, MS SQL applications and, of course MySQL 
applications. Optimizing queries is far from trivial.

Μλαδεν Γογαλα

--
Mladen Gogala
Sr. Oracle DBA
1500 Broadway
New York, NY 10036
(212) 329-5251
www.vmsinfo.com

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance