On Mon, Jan 09, 2006 at 01:54:48PM -0500, Chris Hoover wrote:
> Question: if I have a 4GB+ index for a table on a server with 4GB of
> RAM, and I submit a query that does an index scan, does Postgres read
> the entire index, or just read the index until it finds the matching
> value (our extra-large indexes are primary keys)?

Well, the idea behind an index is that if you need a specific value
from it, you can get there very quickly, reading a minimum of data
along the way. So no: PostgreSQL normally reads only the index pages it
needs to find the value, not the entire index.

> I am looking for real numbers to give to my boss that say either
> having a primary key larger than our memory is bad (and how to clearly
> justify it), or it is ok.
>
> If it is ok, what are the trade-offs in performance?
>
> Obviously, I want more memory, but I have to prove the need to my boss
> since it raises the cost of the servers a fair amount.

Well, if you add a sleep to the following code, you can tie up some
amount of memory, which would let you simulate having less memory
available. Over time, though, the kernel may decide to page that memory
out, so it's not perfect.

#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char *argv[])
{
    /* Allocate (zero-filled) the number of megabytes given in argv[1]. */
    if (!calloc(atoi(argv[1]), 1024 * 1024))
    {
        printf("Error allocating memory.\n");
        return 1;
    }
    return 0;
}

In a nutshell, PostgreSQL and the OS will generally work together to
cache only data that is being used fairly often. In the case of a large
PK index, if you're not actually reading a wide distribution of the
values in the index, you probably aren't caching the entire index even
now. There may be some Linux tool that will show you what portion of a
file is currently cached, which would help answer that question (though
remember that, ideally, whatever parts of the index PostgreSQL caches
in its own buffers won't also be cached by the OS).

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@xxxxxxxxxxxxx
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
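
On Linux, one way to check what portion of a file is cached is the
mincore(2) system call: mmap() the file, and mincore() reports, page by
page, whether it is resident in the page cache. Below is a minimal
sketch of that idea, not a polished tool; the command-line interface,
error handling, and output format are all just illustrative.

#define _DEFAULT_SOURCE   /* for mincore() on glibc */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>

int
main(int argc, char *argv[])
{
    if (argc != 2)
    {
        fprintf(stderr, "usage: %s file\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
    if (st.st_size == 0) { printf("empty file\n"); return 0; }

    /* Map the file; mincore() requires a page-aligned mapping. */
    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    long pagesize = sysconf(_SC_PAGESIZE);
    size_t pages = (st.st_size + pagesize - 1) / pagesize;

    /* One status byte per page; the low bit means "resident in cache". */
    unsigned char *vec = malloc(pages);
    if (vec == NULL || mincore(map, st.st_size, vec) < 0)
    {
        perror("mincore");
        return 1;
    }

    size_t resident = 0;
    for (size_t i = 0; i < pages; i++)
        if (vec[i] & 1)
            resident++;

    printf("%zu of %zu pages resident (%.1f%%)\n",
           resident, pages, 100.0 * resident / pages);
    return 0;
}

Note that PostgreSQL stores large relations as 1GB segment files on
disk, so for a 4GB index you'd run a check like this against each
segment file. It only shows the OS page cache, of course, not what's in
PostgreSQL's own shared buffers.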