Howard Cole wrote:
> Thanks for the input Scott. You are correct - I am IO bound, but only
> for the query described. 99% of the time, my IO runs at 3% or less, even
> during peak times; only this one query, which happens approximately 10
> times a day, grinds the system to a halt.

If your I/O is normally that idle, surely one big query shouldn't stop everything? Slow it down, maybe, but stop it?

Perhaps your RAID drivers, hardware (if any) or OS are queuing requests very deeply and/or doing lots of readahead, probably to make sequential-I/O and random-throughput benchmarks (which don't consider request latency) look better. Consider turning down I/O queue depths if possible, and/or tuning readahead to something suitable for your I/O loads. The latter will require some work to find the right balance between random-request latency and sequential I/O throughput.

I had HUGE problems with a 3Ware 8500-8 RAID controller queuing requests very deeply inside the Linux driver itself, rather than in the OS's I/O scheduler. High-priority small I/O got stuck behind long series of low-priority bulk reads and writes, because the driver wasn't aware of the OS's I/O priority mechanisms. I ended up modifying the driver slightly to reduce the queue depth, since there was no runtime parameter for it, and the result was VASTLY improved I/O latency at only a very small cost in throughput.

It'd be nice if the controller had the brains to be told "fetch this block, and read this one too if it happens to be on the way" ... but if there's anything like that out there, with hardware- or driver-level I/O priority awareness, I haven't run into it yet. Alas.

By the way, it'd be REALLY nice if the postmaster had the facility to set CPU and I/O nice levels for backends individually (on Linux and other supporting platforms). I'm currently using a user C function linked into the backend to set the nice level, but it'd be a great thing to have built-in.
Would patches for this be considered, with the functions being no-ops (with warnings?) on non-supporting platforms?

I also think it's a wee bit of a pity that there's no way to tell Pg that a job isn't important, so its data shouldn't be permitted to push much else out of shared_buffers or the OS's cache. The latter can be ensured to an extent, at least on Linux, with posix_fadvise(..., POSIX_FADV_NOREUSE) or with madvise(...). The former is presumably possible with proper work_mem (etc.) settings, but I find it's the OS's habit of filling the cache with gigabytes of data I won't need again that's the real problem. I don't know how this would interact with other backends doing other work on the same tables, though.

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general