On Thu, Nov 17, 2011 at 12:23 AM, Tory M Blue <tmblue@xxxxxxxxx> wrote:
>> What do you mean by "nothing"? There are 3060 reads/s, servicing each one
>> takes 0.33 ms - that means the drive is 100% utilized.
>>
>> The problem with the iostat results you've posted earlier is that they
>> either use "-xd" or none of those switches. That means you can't see CPU
>> stats and extended I/O stats at the same time - use just "-x" next time.
>>
>> Anyway the results show that "%iowait" is about 6% - as Scott Marlowe
>> pointed out, this means 1 core is waiting for I/O. That's the core running
>> your query. Try to execute the query 16x and you'll see the iowait is
>> 100%.
>
> Yes, this I understand and is correct. But I'm wrestling with the idea
> that the disk is completely saturated. I've seen cases where I actually
> run into high I/O wait and see load climb as processes stack up.
>
> I'm not arguing (please know this); I appreciate the help and will try
> almost anything that is offered here. But I think if I just threw money
> at the situation (hardware), I wouldn't get any closer to resolving my
> issue. I am very interested in other solutions and more DB structure
> changes etc.

But remember, you're doing all of that in a single query. Your disk
subsystem might well be able to deliver even more *throughput* if it
were given many more concurrent requests. A big RAID 10 is really good
at handling multiple concurrent requests, but it's pretty much
impossible to saturate a big RAID array with only a single read stream.

With a single query, the query can only run as fast as that single
stream of requests can be satisfied. And since the next read is issued
as soon as the previous one is done (the kernel readahead/buffering of
the seq scan helps here), your iostat is going to show 100% util,
because there is always the next read "in progress", even if the
average queue size is low (1).

If you had a 24-spindle array, you could add another 20 queries and you
would see the queue size go up, but the util would still be only 100%
and latency would stay about the same, even though your throughput
could be 20 times greater.

So, as long as you have a single query scanning that entire 83GB table,
and that table has to come from disk (i.e. not cached kernel buffers in
RAM), you're going to be limited by the amount of time it takes to read
that table in 8K chunks. Options for improving it are:

1) Making sure your array/controller/kernel are doing the maximum
   read-ahead/buffering possible to make reading that 83GB as quick as
   possible

2) Changing the query to not need to scan all 83GB.

#2 is where you're going to see orders-of-magnitude differences in
performance, and there are lots of options there. But because there are
so many options, and so many variables in what types of other queries,
inserts, updates, and deletes are done on the data, no one of them is
necessarily "the best" for everyone.

But if you have the ability to alter queries/schema slightly, you've
got lots of avenues to explore ;-) And folks here are more than willing
to offer advice and options that may be *very* fruitful:

1) Multicolumn index (depending on insert/update/delete patterns)
2) Partition by date (depending on query types; a rough sketch is at
   the end of this message)
3) Rollup views of history (depending on query types)
4) Trigger-based mat-view style rollups (depending on
   insert/update/delete patterns coupled with query types)

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@xxxxxxxxxxx                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
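P.S. To make option 2 a bit more concrete, here is a rough sketch of
partitioning by date using the inheritance-based partitioning
PostgreSQL supports, since your queries sound date-bounded. All the
names here (big_table, created_on, the monthly children) are made up
for illustration only; you would adapt them to your own schema, and
you would still need an INSERT trigger or rule to route new rows and
some housekeeping to create future partitions.

  -- Parent table holds no data of its own; children hold one month each.
  CREATE TABLE big_table (
      id         bigint,
      created_on date NOT NULL,
      payload    text
  );

  -- The CHECK constraints are what let the planner skip children that
  -- cannot match the query's date range (constraint exclusion).
  CREATE TABLE big_table_2011_10 (
      CHECK (created_on >= DATE '2011-10-01'
         AND created_on <  DATE '2011-11-01')
  ) INHERITS (big_table);

  CREATE TABLE big_table_2011_11 (
      CHECK (created_on >= DATE '2011-11-01'
         AND created_on <  DATE '2011-12-01')
  ) INHERITS (big_table);

  CREATE INDEX big_table_2011_11_created_on
      ON big_table_2011_11 (created_on);

  -- "partition" is the default; shown here for clarity.
  SET constraint_exclusion = partition;

  -- A date-bounded query now reads only the matching child (plus the
  -- empty parent) instead of seq-scanning the whole table:
  SELECT count(*)
    FROM big_table
   WHERE created_on >= DATE '2011-11-01'
     AND created_on <  DATE '2011-11-08';

With something like that in place, a query constrained to a week of
data touches roughly a week's worth of heap pages rather than 83GB.
The rollup/mat-view options in 3) and 4) push the same idea further by
precomputing the aggregates those queries need.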