Have you done a vacuum on the table recently? I would be curious to see how: select stuff from table where index_key = <key1> AND non_index_row in ('xyz','abc','def') UNION ALL select stuff from table where index_key = <key2> AND non_index_row in ('xyz','abc','def') ... UNION ALL select stuff from table where index_key = <key600> AND non_index_row in ('xyz','abc','def') performs by comparison. If, after a vacuum, it performs better than the IN list, then the IN list might benefit from a bit of analysis for better tuning chances. > -----Original Message----- > From: pgsql-general-owner@xxxxxxxxxxxxxx [mailto:pgsql-general- > owner@xxxxxxxxxxxxxx] On Behalf Of Stephen Harris > Sent: Wednesday, May 02, 2007 11:32 AM > To: Postgres General > Subject: [GENERAL] Have I b0rked something? Slow comparisons on "where x > in (...)" > > Postgres version 8.0.9 on Solaris 2.8. I know it's old but... > > I have a table with a million rows. > > I need to select data from this table based on an indexed column; I need > to select 600 possible values from the column, returning around 24,000 > rows of data. > > In perl I have a hash which has 600 key values in it. > > So I did: > > "select stuff from table where index_key in (" . > join(",",keys %hash) . ") AND non_index_row in ('xyz','abc','def') > > And in the perl > while(fetch()) > { > do_stuff > } > > This resulted in a query string which executed in 12 minutes. If I > did an "explain" on the query string then I can see it was being expanded > to 600 OR statements > where (index_key = 1) OR (index_key = 2) OR ..... > > > Now as an alternative option I did > select stuff from table where non_index_row in ('xyz','abc','def') > and in the perl > while(fetch()) > { > next unless $hash{$_->{index_key}}; > do_stuff > } > > To me this should be slower since we're selecting more rows, throwing > the data back to the perl and then discarding values I didn't want. > > Imagine my surprise when the result took 3 minutes to execute. > > Have I broken something, somewhere? Or are IN comparisons really that > slow? > > For what it's worth, a simple > > explain select count(*) from table where index_key in (1,2,3,4,....) > > uses the index up until 156 values but then switches to sequential scan > when there are 157 or more values in query. > > Any thoughts? I fear my poor tuning attempts may have caused other > slow downs! > > -- > > rgds > Stephen > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your > message can get through to the mailing list cleanly