"Albe Laurenz" <laurenz.albe@xxxxxxxxxx> writes: > On a database (PostgreSQL 8.2.4 on 64-bit Linux 2.6.18 on 8 AMD Opterons) > that is under high load, I observe the following: > ... > - "vmstat" shows that CPU time is divided between "idle" and "iowait", > with user and sys time practically zero. > - "sar" says that the disk with the database is on 100% of its capacity. It sounds like you've simply saturated the disk's I/O bandwidth. (I've noticed that Linux isn't all that good about distinguishing "idle" from "iowait" --- more than likely you're really looking at 100% iowait.) > Storage is on a SAN box. What kind of SAN box? You're going to need something pretty beefy to keep all those CPUs busy. > What puzzles me is the "strace -tt" output from that backend: Some low level of contention and consequent semops/context switches is to be expected. I don't think you need to worry if it's only 100/sec. The sort of "context swap storm" behavior we've seen in the past is in the tens of thousands of swaps/sec on hardware much weaker than what you have here --- if you were seeing one of those I bet you'd be well above 100000 swaps/sec. > Are the lseek and read operations really that fast although the disk is on 100%? lseek is (should be) cheap ... it doesn't do any actual I/O. The read()s you're showing here were probably satisfied from kernel disk cache. If you look at a larger sample you'll find slower ones, I think. Another thing to look for is slow writes. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance