On Thu, Oct 18, 2012 at 1:50 PM, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:
> On Wed, Oct 10, 2012 at 10:36 PM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
>> On Wed, Oct 10, 2012 at 4:37 PM, Claudio Freire <klaussfreire@xxxxxxxxx> wrote:
>>>
>>> In my case, if I set it too high, I get impossibly suboptimal plans
>>> when an index scan over millions of rows hits the disk way too often
>>> and way too randomly. The difference is minutes for a seqscan vs
>>> hours for the index scan. In fact, I prefer setting it too low
>>> rather than too high.
>>
>> There's a corollary for very fast disk subsystems. If you've got say
>> 40 15krpm disks in a RAID-10 you can get sequential read speeds into
>> the gigabytes per second, so sequential page access costs MUCH less
>> than random page access, to the point that if seq page access is
>> rated 1, random page access should be much higher, sometimes on the
>> order of 100 or so.
>
> On the other hand, if you have 40 very busy connections all doing
> sequential scans on different tables, they will interfere with each
> other and have to divide up the RAID's throughput, whereas if they
> are doing random fetches they will get along nicely on that RAID. So
> you have to know how much concurrency of the relevant type you expect
> to see.

My experience is that both read-ahead and caching make it more
resilient than that, to the point that it can take several times as
many read clients as spindles before things get as slow as random
access. While it's a performance knee to be aware of, often by the
time you have enough clients for it to matter you've already maxed out
either memory bandwidth or all the CPU cores. But again, this is with
dozens to hundreds of disk drives, not four or eight, and it's very
access-pattern dependent.

On the last machine where we cranked up random versus sequential
costs, it was a reporting server with 5 or 6 TB of data that regularly
got trundled through for aggregation. Daily slices of data went into
partitions of 1 GB to 10 GB each. For this machine the access patterns
were almost never random. Just a handful of queries running in a
random access pattern could interfere with a 5-minute sequential
reporting query and suddenly make it take hours. It had 16x 7200 rpm
2 TB drives and would regularly handle three or four queries at a
time, all running plenty fast. But crank up one thread of pgbench and
it would slow to a crawl.
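
For anyone who wants to experiment, here's a minimal sketch of the
knobs under discussion. The tablespace name is made up and the numbers
are purely illustrative, not recommendations; benchmark your own
hardware before settling on values:

    -- postgresql.conf ships with seq_page_cost = 1.0 and
    -- random_page_cost = 4.0; both can also be changed at runtime.

    -- Per-session override, handy for checking how one query plans:
    SET random_page_cost = 100;  -- the big sequential RAID case above
    EXPLAIN SELECT count(*) FROM some_big_table;  -- hypothetical table

    -- Per-tablespace override (9.0+), for boxes that mix fast and
    -- slow storage:
    ALTER TABLESPACE fast_raid   -- hypothetical tablespace name
      SET (seq_page_cost = 0.5, random_page_cost = 2.0);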
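
And if you want to reproduce the interference effect I described,
something along these lines should do it; the scale factor and
duration are arbitrary:

    pgbench -i -s 1000 bench        # init a roughly 15 GB test db
    pgbench -S -c 1 -j 1 -T 300 bench  # one client of select-only
                                       # random reads for 5 minutes

Run that alongside one of your big sequential reporting queries and
watch the seqscan's throughput fall off.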