On Mon, 2005-10-24 at 16:57, felix@xxxxxxxxxxx wrote:
> Having been surprised a few times myself by EXPLAIN showing a
> sequential scan instead of using an index, and having seen so many
> others surprised by it, I hope I am not asking a similar question.
>
> We recently upgraded our db servers, both old and new running 8.0, and
> one casualty was forgetting to add the nightly VACUUM ANALYZE.
> Inserts were down to 7-8 seconds apiece, but are now back to normal
> under a second since the tables were vacuumed.
>
> However, in the process of investigating this, my boss found something
> which we do not understand. A table with a primary key 'id' takes 200
> seconds to SELECT MAX(id), but is as close to instantaneous as you'd
> want for SELECT ID ORDER BY ID DESC LIMIT 1. I understand why
> count(*) has to traverse all records, but why does MAX have to? This
> table has about 750,000 rows, rather puny.
>
> I suspect there is either a FAQ which I missed, or no one can answer
> without EXPLAIN printouts. I'm hoping there is some generic answer to
> something simple I have overlooked.

It may be, but it's a complex enough question that I'll toss out the answer as I understand it.

The problems with aggregates stem from two design decisions made in PostgreSQL.

1: Generic aggregation

PostgreSQL uses a generic aggregation system, i.e. there are no shortcuts in the query planner for one aggregate or another. If you write an aggregate function called std_dev() and implement it, it gets treated by the query planner pretty much the same as the ones that are built in. So, to the query planner, max(fieldname) is about the same as sum(fieldname). Now, you and I can tell by looking at them that one of those could be answered pretty quickly with an index, but how is the planner supposed to know?

2: MVCC

PostgreSQL's implementation of an "in store" Multi-Version Concurrency Control system means that an index entry only tells you where a tuple is in the table, not whether or not that tuple is visible to this particular transaction. Depending on when the tuple was last updated, when our transaction started, and our transaction isolation level, a given version of a given tuple may or may not be visible to our transaction. So, while PostgreSQL can use an index to find things, it always has to go back to the actual table to find out whether the value is visible to the current transaction.

Put simply, reads cost more, but writes don't make the performance of the db plummet. Put more simply, everyone gets medium-slow read performance for certain ops, but writes keep streaming right along. There Ain't No Such Thing As A Free Lunch (TANSTAAFL).

Notice that you CAN use an index for most aggregates, as long as the WHERE clause you're using is limiting enough.

select count(*) from sometable where somefield > 22

can use an index if "somefield > 22" is selective enough. But again, if the db is going to read more than some base percentage of the pages in the main table, it's cheaper to just switch to a sequential scan than to use the index, since it HAS TO READ the matching rows in the table anyway to see if they're visible.

The good news is that, generally speaking, everything else in PostgreSQL is quite fast, and concurrency is very good.

Hope that's not too involved, and Tom, hope I didn't make too many mistakes there...
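P.S. To make the first point concrete: the rewrite Felix already stumbled on is the usual workaround on 8.0. A rough sketch, using a hypothetical table name "mytable" to stand in for his 750,000-row table:

-- max() is treated as a generic aggregate, so this ends up as a
-- sequential scan over the whole table
SELECT max(id) FROM mytable;

-- same answer, but the planner can walk the primary key index
-- backwards and stop at the first visible row
SELECT id FROM mytable ORDER BY id DESC LIMIT 1;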
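The "generic aggregation" point shows up in how you define your own aggregate. This is only a sketch (my_sum is a made-up name, and it reuses the built-in int4pl addition function as its state transition function); the point is that the planner treats it the same way it treats the built-in sum() or max():

CREATE AGGREGATE my_sum (
  BASETYPE = int4,   -- input type
  SFUNC    = int4pl, -- state transition function (plain int4 addition)
  STYPE    = int4,   -- state type
  INITCOND = '0'
);

-- planned just like sum(id) or max(id): no index shortcut
SELECT my_sum(id) FROM mytable;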
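And a minimal sketch of the MVCC visibility point, assuming two concurrent psql sessions against the same hypothetical table:

-- session A
BEGIN;
INSERT INTO mytable (id) VALUES (1000001);
-- no COMMIT yet

-- session B, at the same time
SELECT max(id) FROM mytable;
-- session A's new row is already in the table (and in the index), but
-- it is not visible to session B's snapshot, so the index entry alone
-- cannot answer the query; the heap tuple has to be checked.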
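Finally, the selective-WHERE case from above, spelled out (the index name is made up, and the cutoff is just the example value):

CREATE INDEX sometable_somefield_idx ON sometable (somefield);
ANALYZE sometable;

-- if "somefield > 22" matches only a small fraction of the rows, the
-- planner will usually pick the index; if it matches a large fraction,
-- it switches to a sequential scan, since every matching row has to be
-- fetched from the table to check visibility anyway
SELECT count(*) FROM sometable WHERE somefield > 22;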