On Mon, Jan 11, 2010 at 9:42 AM, Bob Dusek <redusek@xxxxxxxxx> wrote:
>> What method of striped RAID?  RAID-5?  RAID-10?  RAID-4?  RAID-0?
>
> RAID-0

Just wondering, how many drives?

> To be more specific about the degradation, we've set
> "log_min_duration_statement=200", and when we run with 40 concurrent
> requests, we don't see queries showing up in there.  When we run with 60
> concurrent requests, we start seeing queries show up, and when we run 200+
> requests, we see multi-second queries.
>
> This is to be expected, to some extent, as we would expect some performance
> degradation with higher utilization.  But the hardware doesn't appear to be
> very busy, and that's where we're hoping for some help.

It's likely in io wait.

>> What do the following commands tell you?
>>
>> iostat -x 10 (first iteration doesn't count)
>
> Here's some iostat output (from the 3rd data point from iostat -x 10)...
> this was taken while we were processing 256 simultaneous requests.
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           34.29    0.00    7.09    0.03    0.00   58.58
>
> Device:  rrqm/s  wrqm/s    r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz  await  svctm  %util
> sda        0.00  112.20   0.00  133.40     0.00  1964.80    14.73     0.42   3.17   0.04   0.48
> sda1       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> sda2       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> sda3       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> sda4       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> sda5       0.00  112.20   0.00  133.40     0.00  1964.80    14.73     0.42   3.17   0.04   0.48
> sdb        0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> dm-0       0.00    0.00   0.00    0.40     0.00     3.20     8.00     0.00   0.00   0.00   0.00
> dm-1       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
> dm-2       0.00    0.00   0.00   99.90     0.00   799.20     8.00     0.15   1.50   0.01   0.12
> dm-3       0.00    0.00   0.00    0.60     0.00     4.80     8.00     0.00   0.33   0.17   0.01
> dm-4       0.00    0.00   0.00  144.70     0.00  1157.60     8.00     0.46   3.17   0.02   0.35
> dm-5       0.00    0.00   0.00    0.00     0.00     0.00     0.00     0.00   0.00   0.00   0.00
>
> The iowait seems pretty low, doesn't it?

Depends, is that the first iteration of output?  If so, ignore it and
show me the second and further on.  Same for vmstat...  In fact, let them
run for a minute or two and attach the results.

OTOH, if that is the second or later set of output, then you're
definitely not IO bound, and I don't see why the CPUs are not being
better utilized.

How many concurrent queries are you running when you take these
measurements?  Can you take them with lower and higher numbers of
concurrent users and compare the two?  Normally I'd be looking for
context switching taking more and more time in a heavily loaded system,
but I'm not seeing it in your vmstat numbers either.

What are your settings for

effective_cache_size
random_page_cost
work_mem

With your machine and the extra memory, you can probably safely raise
work_mem to 8 MB if it's at the default of 1 MB.  With a database that
fits in RAM, you can often turn down random_page_cost to near 1.0 (1.2
to 1.4 is common for such scenarios).  And a larger effective_cache_size
(in the 20 GB range) will hint to the planner that it's likely to find
everything it needs somewhere in RAM and not on disk.

There are several common bottlenecks you can try to tune away from.  IO
doesn't look like a problem for you.  Neither does CPU load.  So we're
left with context-switching time and memory-to-CPU bandwidth.  If your
CPUs are basically becoming data pumps, then the speed of your FSB
becomes VERY critical, and some older Intel mobos didn't have a lot of
CPU-to-memory bandwidth; adding CPUs made it worse, not better.  More
modern Intel chipsets have much faster CPU-to-memory bandwidth, since
they're using the same kind of fabric switching that AMD uses on highly
parallel machines.

If your limit is your hardware, then the only solution is a faster
machine.  It may well be that a machine with dual fast Nehalem (2.4GHz+)
quad-core CPUs will be faster.  Or 4 or 8 AMD CPUs with their faster
fabric.
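As a rough sketch, those three suggestions would look like this in
postgresql.conf (the values here are hypothetical, taken from the
discussion above; the 20GB figure assumes a box with more RAM than the
database, so adjust to your actual memory):

```
# postgresql.conf -- hypothetical values based on the suggestions above
effective_cache_size = 20GB   # hint only: how much of the DB the OS cache likely holds
random_page_cost = 1.2        # DB fits in RAM, so random reads are nearly as cheap as sequential
work_mem = 8MB                # up from the 1MB default; applies per sort/hash, per backend
```

Note that effective_cache_size allocates nothing; it only influences the
planner's choice between index and sequential scans, so it's safe to set
generously.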
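One caveat with raising work_mem at this concurrency: it is allocated
per sort or hash operation, per backend, so the worst case scales with
the number of connections.  A quick back-of-the-envelope check (shell,
using the 256-request figure from upthread and the suggested 8 MB; the
one-sort-per-query assumption is mine):

```shell
# Hypothetical worst case: every backend running one 8 MB sort at once.
connections=256
work_mem_mb=8
echo "$(( connections * work_mem_mb )) MB"   # prints "2048 MB"
```

2 GB of potential work_mem use is fine on a machine with tens of GB of
RAM, but it's worth doing this arithmetic before raising work_mem
further.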
-- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance