http://sourceware.org/systemtap/examples/
look at traceio.stp and disktop.stp
http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/SystemTap_Beginners_Guide/index.html
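For example (hedging a bit from memory here; the paths assume the layout of the examples tree, and you need root plus matching kernel debuginfo):

    stap -v io/traceio.stp    # cumulative read/write bytes per process
    stap -v io/disktop.stp    # top processes hitting the disks, refreshed every few seconds

traceio.stp should show at a glance which process the reads are actually attributed to.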
On 03/02/2011 05:19 PM, Bart Kus wrote:
On 3/2/2011 12:13 PM, Jonathan Tripathy wrote:
I once used a tool called dstat. dstat has plugins that can tell you
which processes are using disk IO. I haven't used dstat in a while, so
maybe someone else can chime in.
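If memory serves, the relevant plugins were --top-io and --top-bio; something like:

    dstat --top-io --top-bio 5    # busiest process by requested I/O and by actual block I/O, 5-second samples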
I know the IO is only being caused by a "cp -a" command, but the question
is: why all the reads? It should be 99% writes. Another thing I
noticed is that the average request size is pretty small:
14:06:20  DEV              tps  rd_sec/s   wr_sec/s  avgrq-sz  avgqu-sz    await  svctm  %util
[...snip!...]
14:06:21  sde           219.00  11304.00   30640.00    191.53      1.15     5.16   2.10  46.00
14:06:21  sdf           209.00  11016.00   29904.00    195.79      1.06     5.02   2.01  42.00
14:06:21  sdg           178.00  11512.00   28568.00    225.17      0.74     3.99   2.08  37.00
14:06:21  sdh           175.00  10736.00   26832.00    214.67      0.89     4.91   2.00  35.00
14:06:21  sdi           206.00  11512.00   29112.00    197.20      0.83     3.98   1.80  37.00
14:06:21  sdj           209.00  11264.00   30264.00    198.70      0.79     3.78   1.96  41.00
14:06:21  sds           214.00  10984.00   28552.00    184.75      0.78     3.60   1.78  38.00
14:06:21  sdt           194.00  13352.00   27808.00    212.16      0.83     4.23   1.91  37.00
14:06:21  sdu           183.00  12856.00   28872.00    228.02      0.60     3.22   2.13  39.00
14:06:21  sdv           189.00  11984.00   31696.00    231.11      0.57     2.96   1.69  32.00
14:06:21  md5           754.00      0.00  153848.00    204.04      0.00     0.00   0.00   0.00
14:06:21  DayTar-DayTar 753.00      0.00  153600.00    203.98     15.73    20.58   1.33 100.00
14:06:21  data          760.00      0.00  155800.00    205.00   4670.84  6070.91   1.32 100.00
That works out to about 205 sectors/request, which is 104,960 bytes. This
might be causing read-modify-write cycles if, for whatever reason, md is
not taking advantage of the stripe cache. stripe_cache_active shows
about 128 entries (4kB pages, so ~512kB) of RAM in use per hard drive.
Given that the chunk size is 512kB and the writes being requested are
linear, it should not be doing read-modify-write. And yet there are tons
of reads being logged, as shown above.
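Those stripe cache numbers come from sysfs; assuming md5 is the array from the sar output above, they can be watched while the cp runs:

    cat /sys/block/md5/md/stripe_cache_size      # entries; each holds a 4kB page per member drive
    cat /sys/block/md5/md/stripe_cache_active    # entries currently in use

Raising stripe_cache_size (e.g. echo 8192 > /sys/block/md5/md/stripe_cache_size) is a common experiment when read-modify-write is suspected, at the cost of extra RAM.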
A couple more confusing things:
jo ~ # blockdev --getss /dev/mapper/data
512
jo ~ # blockdev --getpbsz /dev/mapper/data
512
jo ~ # blockdev --getioopt /dev/mapper/data
4194304
jo ~ # blockdev --getiomin /dev/mapper/data
524288
jo ~ # blockdev --getmaxsect /dev/mapper/data
255
jo ~ # blockdev --getbsz /dev/mapper/data
512
jo ~ #
If the optimum IO size is 4MB (as it SHOULD be: 512kB chunk * 8 data
drives = 4MB stripe), but the maxsect count is 255 (255*512 = ~128kB),
how can optimal IO ever be done? I re-mounted XFS with
sunit=1024,swidth=8192, but that hasn't increased the average
transaction size as expected. Perhaps it's respecting this maxsect
limit?
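That cap should be visible (and sometimes adjustable) in sysfs. A sketch, where dm-N stands for whatever 'dmsetup ls' says the "data" LV maps to, and sde stands in for any member drive:

    cat /sys/block/dm-N/queue/max_sectors_kb       # 255 sectors is roughly 127kB
    cat /sys/block/dm-N/queue/max_hw_sectors_kb    # hardware/stacking ceiling
    echo 512 > /sys/block/sde/queue/max_sectors_kb # raise a member's cap, if max_hw_sectors_kb allows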
--Bart
PS: The RAID6 full stripe has +2 parity drives for a total of 10, but
they're not included in the "data zone" definitions of stripe size,
which are the only important ones for figuring out how large your
writes should be.
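The geometry itself can be double-checked with mdadm (md5 as in the sar output above):

    mdadm --detail /dev/md5 | egrep 'Level|Devices|Chunk'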
_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/