Huan Ruan wrote: > Interesting to see how you derived 100% cache hits. I assume by 'cache' you > mean the pg shared buffer plus the OS cache? Because the table is 23GB but > the shared buffer is only 6GB. Even then, I'm not completely convinced > because the total RAM is just 24GB, part of which will have to be used for > other data and indexes. Well, you can't have more than a few disk hits, which typically take something like 10 ms each, and still get an average less than 200 nanoseconds. > I read somewhere that a pg shared buffer that's too big can hurt the > performance and it's better just leave it to the OS cache. I'm not sure why > but for now, I just configured the shared buffer to be 1/4 of the total RAM. PostgreSQL goes through the OS and its filesystems, unlike some other products. The right balance of memory in the PostgreSQL shared buffers versus what is left for a combination of OS caching and other memory allocations can be hard to determine. 25% is a good starting point, but your best performance might be higher or lower. It's a good idea to try incremental adjustments using your actual workload. Just remember you need to allow enough for several maintenance_work_mem allocations, about one work_mem allocation per max_connections setting, plus a reasonable OS cache size. > I was wondering on our production server where the effetive_cache_size will > be much bigger, will pg then guess that probably most data is cached anyway > therefore leaning towards nested loop join rather than a scan for hash join? Once effective_cache_size is larger than your largest index, its exact value doesn't matter all that much. > Even on a test server where the cache hit rate is much smaller, for a big > table like this, under what circumstances, will a hash join perform better > than nested loop join though? With a low cache hit rate, that would generally be when the number of lookups into the table exceeds about 10% of the table's rows. > Yes, I had bumped up work_mem yesterday to speed up another big group by > query. I used 80MB. I assumed this memory will only be used if the query > needs it and will be released as soon as it's finished, so it won't be too > much an issue as long as I don't have too many concurrently sorting queries > running (which is true in our production). Is this correct? Each connection running a query can allocate one work_mem allocation per plan node (depending on node type), which will be freed after the query completes. A common "rule of thumb" is to plan on peaks of max_conncetions allocations of work_mem. > I increased maintenance_work_mem initially to speed up the index creation > when I first pump in the data. In production environment, we don't do run > time index creation, so I think only the vacuum and analyze will consume > this memory? You'll probably be creating indexes from time to time. Figure an occasional one of those plus up to one allocation per autovacuum worker (and you probably shouldn't go below three of those). -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance