On Sat, 18 Aug 2007, Dag Wieers wrote:

> On Fri, 17 Aug 2007, Mag Gam wrote:
>
> > On 8/17/07, John R Pierce <pierce@xxxxxxxxxxxx> wrote:
> > > Mag Gam wrote:
> > >
> > > > I have a server with 2 HBAs, and the users keeps complaining about
> > > > performance problems. My question is, how can I relate the process
> > > > with high I/O wait? Also, is it possible to see how much data is being
> > > > pushed thru by my 2 HBAs?
> > >
> > > iostat (part of the sysstat package) will answer your 2nd question.
> > >
> > > I dunno how to measure io wait time per process. maybe IBM's NMON can
> > > do that, not sure, I haven't used it for a while.
> > > http://www-941.haw.ibm.com/collaboration/wiki/display/WikiPtype/nmon
> >
> > Thanks John.
> >
> > Yes, this is a tricky question, but I face this a lot....Unfortunately, I am
> > not sure how to check the adapter throughput, and what process is causing
> > the i/o wait.
>
> I believe that recent kernels have a patch applied that show io counters
> per process. I haven't looked into it yet though.
>
> This is one of the most important items on my wishlist for dstat, a topio
> plugin next to the existing topcpu and topmem plugins.

I found the following interesting information while googling. Now I need
to find a kernel that provides the counters ;-)

Based on this information I will most likely have topio, topio_real and
topio_ops

2.14 /proc/<pid>/io - Display the IO accounting fields
-------------------------------------------------------

This file contains IO statistics for each running process

Example
-------

test:/tmp # dd if=/dev/zero of=/tmp/test.dat &
[1] 3828

test:/tmp # cat /proc/3828/io
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0

Description
-----------

rchar
-----

I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This
is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual
physical disk IO was required (the read might have been satisfied from
pagecache)

wchar
-----

I/O counter: chars written
The number of bytes which this task has caused, or shall cause to be written
to disk. Similar caveats apply here as with rchar.

syscr
-----

I/O counter: read syscalls
Attempt to count the number of read I/O operations, i.e. syscalls like read()
and pread().

syscw
-----

I/O counter: write syscalls
Attempt to count the number of write I/O operations, i.e. syscalls like
write() and pwrite().

read_bytes
----------

I/O counter: bytes read
Attempt to count the number of bytes which this process really did cause to
be fetched from the storage layer. Done at the submit_bio() level, so it is
accurate for block-backed filesystems. <please add status regarding NFS and
CIFS at a later time>

write_bytes
-----------

I/O counter: bytes written
Attempt to count the number of bytes which this process caused to be sent to
the storage layer. This is done at page-dirtying time.

cancelled_write_bytes
---------------------

The big inaccuracy here is truncate. If a process writes 1MB to a file and
then deletes the file, it will in fact perform no writeout. But it will have
been accounted as having caused 1MB of write.
In other words: The number of bytes which this process caused to not happen,
by truncating pagecache. A task can cause "negative" IO too. If this task
truncates some dirty pagecache, some IO which another task has been accounted
for (in its write_bytes) will not be happening. We _could_ just subtract that
from the truncating task's write_bytes, but there is information loss in doing
that.

Note
----

At its current implementation state, this is a bit racy on 32-bit machines: if
process A reads process B's /proc/pid/io while process B is updating one of
those 64-bit counters, process A could see an intermediate result.
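
The counters described above are plain "name: value" lines, so reading them
from a script is straightforward. What follows is only a minimal sketch,
assuming Python 3 and a kernel that actually provides /proc/<pid>/io. The
function names (parse_proc_io, read_proc_io, io_delta) are made up for
illustration and are not part of dstat or of any existing tool. The sample
text is the dd example quoted above.

```python
from pathlib import Path

# Output of "cat /proc/3828/io" from the dd example in the quoted documentation.
SAMPLE = """\
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
"""


def parse_proc_io(text):
    """Turn the 'name: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip anything that is not a 'name: <digits>' line.
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def read_proc_io(pid):
    """Read the counters of one process (hypothetical helper).

    Raises OSError if the file is missing (kernel without IO accounting,
    process already gone) or not readable by the current user.
    """
    return parse_proc_io(Path(f"/proc/{pid}/io").read_text())


def io_delta(before, after):
    """Per-field difference between two samples of the same process."""
    return {name: after[name] - before.get(name, 0) for name in after}


if __name__ == "__main__":
    counters = parse_proc_io(SAMPLE)
    for name, value in counters.items():
        print(f"{name:>22}: {value}")
```

Taking two samples a fixed interval apart and sorting processes by the
io_delta() of one chosen field is one way a "top I/O" view could be built.
Which counters the planned plugins would use is not decided here. Because of
the race mentioned in the note, a single odd-looking sample on a 32-bit
machine should probably not be trusted on its own.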

More information about this can be found within the taskstats documentation in
Documentation/accounting.

--   dag wieers,  dag@xxxxxxxxxx,  http://dag.wieers.com/   --
[Any errors in spelling, tact or fact are transmission errors]
_______________________________________________
CentOS mailing list
CentOS@xxxxxxxxxx
http://lists.centos.org/mailman/listinfo/centos