Thanks to everybody's input -- as a first-time poster to this listserv, I wasn't sure how long it would take to get a response. ;) I was frankly astonished to see that the composite index on (a,b) was used when I searched for (a), but Chris' response makes total sense. In this case, I don't want to go with a MAP due to the fact that I'm actually using Java Hibernate to generate this schema and access it. My sample query of using RANDOM() to select a random subset of the overall outlets was actually to try and defeat any prior caching of results, and give a more reasonable measurement -- but I didn't realize the implications. I had thought that coupled with a MAX clause at the end it would simply randomize and then bail out early instead of a full table scan - so thanks to Merlin for pointing that out. I'll go with a 2nd index on MEDIA_ID and do some measurements of speed increase, but it makes a lot more sense now. thank you Postgres gurus! :D regards Bill On 5/21/12 5:11 PM, Merlin Moncure wrote: > On Mon, May 21, 2012 at 3:36 PM, Dmitriy Igrishin <dmitigr@xxxxxxxxx> wrote: >>> So you can get fully index lookups on all of a, b, ab, and ba. the >>> primary key can't optimize ba because indexes only fully match if >>> candidate fields are supplied from left to right order. They can >>> still help somewhat, but to a lesser degree. >> BTW, I would like to know is it worth it to create 3rd index on map(a) >> to reduce the size of the index which will be used by the planer >> to save some server's RAM (obviously, at the cost of extra disk space) ? > What Dmitriy is talking about here is that even though an index on > (a,b) can efficiently (in terms of searching through the tree) match > terms on just 'a', you still pay a price because the entries on the > index have to store the data for b as well, So even though it's > algorithmically efficient you have to browse more data to do it which > pressures RAM. In other words, an index on just 'a' is ideal for > searches on just 'a', although a,b is much better than (b,a) or no > index at all. > > I personally think that generally it's better not to do that in most > cases especially if you're indexing integer keys since you're not > making *that* much difference on the overall index size. Also, > primary key indexes are much more likely to have to stay 'hot' in the > cache anyways since they will be serving fkey reference lookups and > stuff like that so in the end you might be consuming *more* ram, not > less. > > An exception might be if your key on a,b has a very small 'a' and a > very large 'b'. But that's pretty rare in practice and it's usually a > good idea to avoid indexing large fields if you can help it. It > really depends on the workload. > > merlin -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general