On Tue, 2008-11-18 at 17:38 +0100, tv@xxxxxxxx wrote:
> I bet there is no 'critical' length - this is just another case of
> index scan vs. seqscan. The efficiency depends on the size of the
> table / row, the amount of data in the table, the variability of the
> column used in the IN clause, etc.
>
> Splitting the query with 1000 items into 10 separate queries, the
> smaller queries may be faster but the total time consumed may
> actually be higher. Something like
>
>     10 * (time of small query) + (time to combine them) > (time of large query)
>
> If the performance of the 'split' solution is actually better than
> the original query, it just means that the planner does not use an
> index scan when it actually should. That means that either
>
> (a) the planner is not smart enough
> (b) it does not have current statistics for the table (run ANALYZE
>     on the table)
> (c) the statistics are not detailed enough (ALTER TABLE ... SET
>     STATISTICS)
> (d) the cost variables are not set properly (they do not match the
>     hardware - decrease the index scan cost / increase the seq scan
>     cost)
>
> regards
> Tomas

I know that it's much faster (for us) to run many smaller queries than
one large query, and I think that it's primarily because of your
reason (a). Most of our problems come from Pg misestimating the result
of a join and making a bad plan decision. Batching dramatically
reduces the likelihood of this.

-Mark
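
To make suggestions (b) through (d) concrete, here is a minimal sketch
in PostgreSQL syntax. The table "orders" and column "customer_id" are
hypothetical stand-ins, not names from the thread, and the statistics
target and cost value are illustrative starting points rather than
recommendations:

    -- (b) refresh the planner's statistics for the table
    ANALYZE orders;

    -- (c) collect a finer-grained histogram for the column used in the
    --     IN clause, then re-run ANALYZE so the new target takes effect
    ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
    ANALYZE orders;

    -- (d) if random I/O on this hardware is cheaper than the default
    --     assumption, lower random_page_cost (default 4.0) so index
    --     scans are costed more favorably than sequential scans
    SET random_page_cost = 2.0;

    -- verify which plan the planner now picks for the large IN list
    EXPLAIN ANALYZE
    SELECT * FROM orders WHERE customer_id IN (1, 2, 3 /* ... */);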
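
One common alternative to the client-side batching Mark describes,
kept in a single statement: load the keys into a temporary table,
ANALYZE it so the planner knows how many rows it holds, and join
against it instead of passing a very long IN list. Again a sketch
using the same made-up names:

    -- load the key list into a temp table the planner can estimate
    CREATE TEMP TABLE wanted_ids (id integer PRIMARY KEY);
    INSERT INTO wanted_ids VALUES (1), (2), (3);  -- ... the full key list
    ANALYZE wanted_ids;

    -- the join lets the planner choose an index scan or a hash join
    -- based on actual row counts rather than a parsed literal list
    SELECT o.*
    FROM orders o
    JOIN wanted_ids w ON w.id = o.customer_id;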