On a database (PostgreSQL 8.2.4 on 64-bit Linux 2.6.18 on 8 AMD Opterons) that is under high load, I observe the following: - About 200 database sessions concurrently issue queries, most of them small, but I have found one that touches 38000 table and index blocks. - "vmstat" shows that CPU time is divided between "idle" and "iowait", with user and sys time practically zero. - the run queue is short, but the blocked queue (uninterruptible sleep) is around 10. - Many context switches are reported (over hundred per second). - "sar" says that the disk with the database is on 100% of its capacity. Storage is on a SAN box. Queries that normally take seconds at most require up to an hour to finish. I ran "lsof -p" on a backend running the big query mentioned above, and it does not use any temp files (work_mem = 20MB). The query accesses only one table and its index. What puzzles me is the "strace -tt" output from that backend: 13:44:58.263598 semop(393227, 0x7fff482f6050, 1) = 0 13:44:58.313448 semop(229382, 0x7fff482f6070, 1) = 0 13:44:58.313567 semop(393227, 0x7fff482f6050, 1) = 0 13:44:58.442917 semop(229382, 0x7fff482f6070, 1) = 0 13:44:58.443074 semop(393227, 0x7fff482f6050, 1) = 0 13:44:58.565313 semop(393227, 0x7fff482f6050, 1) = 0 13:44:58.682178 semop(229382, 0x7fff482f6070, 1) = 0 13:44:58.682333 semop(393227, 0x7fff482f6480, 1) = 0 13:44:58.807452 semop(393227, 0x7fff482f6050, 1) = 0 13:44:58.924425 semop(393227, 0x7fff482f6480, 1) = 0 13:44:58.924727 semop(393227, 0x7fff482f6050, 1) = 0 13:44:59.045456 semop(393227, 0x7fff482f6050, 1) = 0 13:44:59.169011 semop(393227, 0x7fff482f6480, 1) = 0 13:44:59.169226 semop(327689, 0x7fff482f64a0, 1) = 0 [many more semops] 13:44:59.602532 semop(327689, 0x7fff482f6070, 1) = 0 13:44:59.602648 lseek(32, 120176640, SEEK_SET) = 120176640 13:44:59.602742 read(32, "{\0\0\0xwv\227\1\0\0\0\320\0\350\0\0 \3 \237\300\1@\236\300\1`\235\300\1"..., 8192) = 8192 13:44:59.602825 semop(327689, 0x7fff482f64d0, 1) = 0 13:44:59.602872 semop(393227, 0x7fff482f6080, 1) = 0 13:44:59.602929 semop(393227, 0x7fff482f6050, 1) = 0 13:44:59.614559 semop(360458, 0x7fff482f6070, 1) = 0 [many more semops] 13:44:59.742103 semop(229382, 0x7fff482f64a0, 1) = 0 13:44:59.742172 semop(393227, 0x7fff482f6050, 1) = 0 13:44:59.756526 select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout) 13:44:59.758096 semop(393227, 0x7fff482f6480, 1) = 0 13:44:59.771655 semop(393227, 0x7fff482f6050, 1) = 0 [hundreds of semops] 13:45:14.339905 semop(393227, 0x7fff482f6050, 1) = 0 13:45:14.466992 semop(360458, 0x7fff482f6070, 1) = 0 13:45:14.467102 lseek(33, 332693504, SEEK_SET) = 332693504 13:45:14.467138 read(33, "{\0\0\0\210\235\351\331\1\0\0\0\204\0010\32\360\37\3 \340\237 \0\320\237 \0\300\237 \0"..., 8192) = 8192 13:45:14.599815 semop(163844, 0x7fff482f60a0, 1) = 0 13:45:14.600006 lseek(32, 125034496, SEEK_SET) = 125034496 13:45:14.600305 read(32, "{\0\0\0\230\257\270\227\1\0\0\0\330\0\340\0\0 \3 \237\300\1@\236\300\1`\235\300\1"..., 8192) = 8192 13:45:14.600391 semop(163844, 0x7fff482f64d0, 1) = 0 13:45:14.600519 semop(393227, 0x7fff482f6480, 1) = 0 and so on. File 32 is the table, file 33 is the index. Many of the table and index blocks are probably already in shared memory (shared_buffers = 6GB) and don't have to be read from disk. My questions: Is the long duration of the query caused by something else than I/O overload? What are the semops? Lightweight locks waiting for shared buffer? Are the lseek and read operations really that fast although the disk is on 100%? Is this normal behavior under overload or is something ill tuned? Yours, Laurenz Albe -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-performance