Re: Tracing IO requests?

http://sourceware.org/systemtap/examples/

Look at traceio.stp and disktop.stp.

http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/5/html/SystemTap_Beginners_Guide/index.html
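A rough sketch of how those examples get run, assuming the systemtap package and the matching kernel debuginfo are installed (run as root; the script names are the ones from the examples page above):

stap traceio.stp   # cumulative read/write traffic per process
stap disktop.stp   # heaviest disk readers/writers, refreshed periodically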

On 03/02/2011 05:19 PM, Bart Kus wrote:
On 3/2/2011 12:13 PM, Jonathan Tripathy wrote:
I once used a tool called dstat. dstat has modules which can tell you which processes are using disk IO. I haven't used dstat in a while, so maybe someone else can chime in.
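For what it's worth, the per-process view in dstat comes from its --top-io and --top-bio plugins; a minimal sketch, assuming a dstat new enough to ship them:

dstat --top-io --top-bio    # most expensive process at the VFS and block layers
dstat --top-bio --disk 1    # busiest block-IO process alongside per-disk throughput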

I know the IO is only being caused by a "cp -a" command, but the question is why there are all these reads; it should be 99% writes. Another thing I noticed is that the average request size is pretty small:

14:06:20 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
[...snip!...]
14:06:21 sde 219.00 11304.00 30640.00 191.53 1.15 5.16 2.10 46.00
14:06:21 sdf 209.00 11016.00 29904.00 195.79 1.06 5.02 2.01 42.00
14:06:21 sdg 178.00 11512.00 28568.00 225.17 0.74 3.99 2.08 37.00
14:06:21 sdh 175.00 10736.00 26832.00 214.67 0.89 4.91 2.00 35.00
14:06:21 sdi 206.00 11512.00 29112.00 197.20 0.83 3.98 1.80 37.00
14:06:21 sdj 209.00 11264.00 30264.00 198.70 0.79 3.78 1.96 41.00
14:06:21 sds 214.00 10984.00 28552.00 184.75 0.78 3.60 1.78 38.00
14:06:21 sdt 194.00 13352.00 27808.00 212.16 0.83 4.23 1.91 37.00
14:06:21 sdu 183.00 12856.00 28872.00 228.02 0.60 3.22 2.13 39.00
14:06:21 sdv 189.00 11984.00 31696.00 231.11 0.57 2.96 1.69 32.00
14:06:21 md5 754.00 0.00 153848.00 204.04 0.00 0.00 0.00 0.00
14:06:21 DayTar-DayTar 753.00 0.00 153600.00 203.98 15.73 20.58 1.33 100.00
14:06:21 data 760.00 0.00 155800.00 205.00 4670.84 6070.91 1.32 100.00

Looks to be about 205 sectors/request, which is 104,960 bytes. This might be causing read-modify-write cycles if for whatever reason md is not taking advantage of the stripe cache. stripe_cache_active shows about 128 blocks (512kB) of RAM in use, per hard drive. Given the chunk size is 512kB, and the writes being requested are linear, it should not be doing read-modify-write. And yet, there are tons of reads being logged, as shown above.
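One way to sanity-check the stripe cache side of that, as a rough sketch (md5 is the array name from the sar output above; the sysfs paths are the standard md ones):

cat /sys/block/md5/md/stripe_cache_size      # entries configured (one page per member device per entry; default 256)
cat /sys/block/md5/md/stripe_cache_active    # entries currently in use
echo 8192 > /sys/block/md5/md/stripe_cache_size   # try a much larger cache and watch whether rd_sec/s drops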

A couple more confusing things:

jo ~ # blockdev --getss /dev/mapper/data
512
jo ~ # blockdev --getpbsz /dev/mapper/data
512
jo ~ # blockdev --getioopt /dev/mapper/data
4194304
jo ~ # blockdev --getiomin /dev/mapper/data
524288
jo ~ # blockdev --getmaxsect /dev/mapper/data
255
jo ~ # blockdev --getbsz /dev/mapper/data
512
jo ~ #

If the optimum IO size is 4MB (as it SHOULD be: 512k chunk * 8 data drives = 4MB stripe), but the maxsect count is 255 (255*512 = ~128k), how can optimal IO ever be issued? I re-mounted XFS with sunit=1024,swidth=8192, but that hasn't increased the average transaction size as expected. Perhaps it's respecting this maxsect limit?
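If that 255-sector cap is what is splitting requests, the per-device limits live in sysfs; a rough sketch (member device name taken from the sar output above, values in KiB):

cat /sys/block/sde/queue/max_sectors_kb       # current soft cap on request size
cat /sys/block/sde/queue/max_hw_sectors_kb    # hardware ceiling
echo 512 > /sys/block/sde/queue/max_sectors_kb    # raise the soft cap toward the hardware limit

xfs_info /dev/mapper/data | grep -E 'sunit|swidth'   # xfs_info reports these in filesystem blocks, not 512-byte sectors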

--Bart

PS: The RAID6 array has 2 parity drives on top of the 8 data drives, for a total of 10, but the parity drives aren't included in the stripe-size figures above; only the data portion of the stripe matters when figuring out how large your writes should be.
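A quick cross-check of that arithmetic against what md itself reports (array name from the sar output above):

mdadm --detail /dev/md5 | grep -E 'Raid Devices|Chunk Size'
# 10 raid devices - 2 parity = 8 data drives; 8 * 512 KiB chunk = 4 MiB full data stripe,
# which matches the blockdev --getioopt value of 4194304 above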

_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
