Richard Neill <rn214@xxxxxxxxx> writes:
> In particular, what happens in the following case:
> 1. I start with a table with 100 million rows, and column wid has
>    linearly distributed values from 45 to 90. (wid is indexed.)
> 2. I run VACUUM ANALYZE.
> 3. I insert about 2 million rows, all of which have the new wid of 91.
> 4. I then do a SELECT * WHERE wid = 91.
> How smart is ANALYZE? Will it actually say "well, I've never seen 91 in
> this table, because all the values only go up to 90, so you'd better do
> a sequential scan"?

ANALYZE is not magic. The system won't know that the 91s are there
until you re-ANALYZE (either manually or automatically). In a case like
this I expect the planner would assume there are very few matching rows
and go for an index scan. That might still be the right choice in this
specific scenario (fetching 2% of the table), but it certainly wouldn't
be if, say, half of the table matched the query.

Moral: re-ANALYZE after any bulk load.

> On another note, I notice that if I ever manually run VACUUM or ANALYZE,
> the performance of the database drops to the point where many of the
> operators get kicked out. Is there any way to run them "nice"?

See vacuum_cost_delay.

> We need to maintain a response time of under 1 second all day for simple
> queries (which usually run in about 22 ms). But VACUUM and ANALYZE seem
> to lock up the system for a few minutes, during which other queries block
> on them, although there is still plenty of CPU spare.

It sounds to me like you don't really have enough disk I/O bandwidth to
meet your performance requirements. All the CPU in the world won't help
you if you didn't spend any money on the disks :-(. You might be able to
alleviate this with vacuum_cost_delay, but it's a band-aid.

			regards, tom lane
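
[A minimal sketch of the stale-statistics scenario and the fix, assuming
a hypothetical table t with an indexed integer column wid, named after
the scenario above; the generate_series load stands in for the real bulk
insert:]

    -- Bulk-load ~2 million rows with the previously unseen value 91:
    INSERT INTO t (wid) SELECT 91 FROM generate_series(1, 2000000);

    -- Statistics are still stale, so the planner estimates that
    -- wid = 91 matches almost nothing and favors an index scan:
    EXPLAIN SELECT * FROM t WHERE wid = 91;

    -- Refresh the statistics so the planner sees the new distribution:
    ANALYZE t;

    -- The row estimate should now reflect the ~2 million matches:
    EXPLAIN SELECT * FROM t WHERE wid = 91;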
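
[And a sketch of throttling a manual VACUUM with vacuum_cost_delay; the
values are illustrative, not tuned recommendations:]

    -- For this session only: whenever the accumulated I/O cost reaches
    -- vacuum_cost_limit (default 200), sleep for 20 milliseconds.
    SET vacuum_cost_delay = 20;   -- default 0, i.e. no throttling
    VACUUM ANALYZE t;

This trades a longer-running vacuum for less I/O pressure on the
concurrent foreground queries.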