Hi everyone,

I'm trying to understand the difference in bandwidth and IOPS I see when running a random-write, full-stripe-width-aligned fio test (libaio with direct I/O) on a hardware RAID 6 raw device versus the same device with an XFS file system on top of it.

On the raw device I get:

  write: io=24828MB, bw=423132KB/s, iops=137, runt= 60085msec

With XFS on top of it:

  write: io=14658MB, bw=249407KB/s, iops=81, runt= 60182msec

The hardware RAID 6 volume consists of 5 HDDs (3 data disks); the stripe unit size is 1 MiB and the full stripe width is 3 MiB.

XFS was initialized and mounted with the following commands:

mkfs.xfs -d su=1024k,sw=3 -L LV-TEST-02 /dev/sdd
mount -o inode64,noatime -L LV-TEST-02 /mnt/lv-test-02

mkfs.xfs version 3.2.2

xfs_info /mnt/lv-test-02
meta-data=/dev/sdd               isize=256    agcount=16, agsize=819200 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=13106688, imaxpct=25
         =                       sunit=256    swidth=768 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=6399, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

The RAID controller does not export optimal_io_size.

Controller details:

Product Name = AVAGO 3108 MegaRAID
FW Package Build = 24.9.0-0022
BIOS Version = 6.25.03.0_4.17.08.00_0x060E0300
FW Version = 4.290.00-4536
Driver Name = megaraid_sas
Driver Version = 06.808.16.00-rc1

Virtual drive:

--------------------------------------------------------------------
DG/VD TYPE  State Access Consist Cache sCC      Size Name
--------------------------------------------------------------------
1/2   RAID6 Optl  RW     Yes     RWBD  -    49.998 GB vd-hdd-test-01
--------------------------------------------------------------------
R=Read Ahead  WB=WriteBack (with a battery backup)  D=Direct IO

The physical disk cache is disabled.
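In case it helps, here is the quick arithmetic I used to convince myself that the xfs_info geometry above matches the RAID layout (just shell arithmetic on the values already shown; sunit and swidth are reported in bsize=4096 filesystem blocks):

# stripe unit: 256 blocks * 4096 bytes = 1 MiB, i.e. su=1024k
echo "$((256 * 4096 / 1024)) KiB"    # prints "1024 KiB"
# stripe width: 768 blocks * 4096 bytes = 3 MiB, i.e. su * sw = 1024k * 3
echo "$((768 * 4096 / 1024)) KiB"    # prints "3072 KiB"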
Disks: 5x HGST HUH728080AL5200 (Firmware Revision = A515)
Kernel: 4.4.0-2.el7.elrepo.x86_64

fio command and output for the RAID raw device:

fio --filename=/dev/sdd \
    --direct=1 \
    --rw=randwrite \
    --ioengine=libaio \
    --iodepth=16 \
    --numjobs=1 \
    --runtime=60 \
    --exec_prerun="/opt/MegaRAID/storcli/storcli64 /c0 flushcache" \
    --name=direct-raid-hdd-random-write-full-stripe-aligned-3072k \
    --bs=3072k

direct-raid-hdd-random-write-full-stripe-aligned-3072k: (g=0): rw=randwrite, bs=3M-3M/3M-3M/3M-3M, ioengine=libaio, iodepth=16
fio-2.2.8
Starting 1 process
direct-raid-hdd-random-write-full-stripe-aligned-3072k : Saving output of prerun in direct-raid-hdd-random-write-full-stripe-aligned-3072k.prerun.txt
Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/375.0MB/0KB /s] [0/125/0 iops] [eta 00m:00s]
direct-raid-hdd-random-write-full-stripe-aligned-3072k: (groupid=0, jobs=1): err= 0: pid=1847: Fri Jan 29 11:47:17 2016
  write: io=24828MB, bw=423132KB/s, iops=137, runt= 60085msec
    slat (usec): min=250, max=91308, avg=7250.27, stdev=11767.73
    clat (msec): min=8, max=223, avg=108.89, stdev=32.22
     lat (msec): min=8, max=224, avg=116.14, stdev=32.19
    clat percentiles (msec):
     |  1.00th=[    9],  5.00th=[   43], 10.00th=[   78], 20.00th=[   91],
     | 30.00th=[   99], 40.00th=[  106], 50.00th=[  113], 60.00th=[  119],
     | 70.00th=[  126], 80.00th=[  133], 90.00th=[  145], 95.00th=[  153],
     | 99.00th=[  169], 99.50th=[  176], 99.90th=[  198], 99.95th=[  202],
     | 99.99th=[  225]
    bw (KB /s): min=348681, max=2599384, per=100.00%, avg=423757.22, stdev=204979.51
    lat (msec) : 10=4.22%, 20=0.76%, 50=0.19%, 100=25.86%, 250=68.97%
  cpu          : usr=2.49%, sys=3.57%, ctx=2959, majf=0, minf=1642
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=99.8%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=8276/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: io=24828MB, aggrb=423131KB/s, minb=423131KB/s, maxb=423131KB/s, mint=60085msec, maxt=60085msec

Disk stats (read/write):
  sdd: ios=59/98996, merge=0/0, ticks=150/8434406, in_queue=8443626, util=100.00%
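As a quick sanity check on my own numbers (again just shell arithmetic on the values reported above, nothing measured), the bandwidth is simply iops x block size, and the total io matches the number of issued writes; the XFS run below works out the same way:

# 137 iops at bs=3072k vs the reported bw=423132KB/s
echo "$((137 * 3072)) KB/s"    # prints "420864 KB/s"
# 8276 issued writes of 3 MiB each vs the reported io=24828MB
echo "$((8276 * 3)) MB"        # prints "24828 MB"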
fio command and output for XFS:

fio --directory=/mnt/lv-test-02 \
    --filename=test.fio \
    --size=30g \
    --direct=1 \
    --rw=randwrite \
    --ioengine=libaio \
    --iodepth=16 \
    --numjobs=1 \
    --runtime=60 \
    --exec_prerun="/opt/MegaRAID/storcli/storcli64 /c0 flushcache" \
    --name=xfs-hdd-random-write-full-stripe-aligned-3072k \
    --bs=3072k

xfs-hdd-random-write-full-stripe-aligned-3072k: (g=0): rw=randwrite, bs=3M-3M/3M-3M/3M-3M, ioengine=libaio, iodepth=16
fio-2.2.8
Starting 1 process
xfs-hdd-random-write-full-stripe-aligned-3072k: Laying out IO file(s) (1 file(s) / 30720MB)
xfs-hdd-random-write-full-stripe-aligned-3072k : Saving output of prerun in xfs-hdd-random-write-full-stripe-aligned-3072k.prerun.txt
Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/186.0MB/0KB /s] [0/62/0 iops] [eta 00m:00s]
xfs-hdd-random-write-full-stripe-aligned-3072k: (groupid=0, jobs=1): err= 0: pid=1899: Fri Jan 29 11:50:21 2016
  write: io=14658MB, bw=249407KB/s, iops=81, runt= 60182msec
    slat (usec): min=231, max=133647, avg=12279.35, stdev=20234.23
    clat (msec): min=4, max=1987, avg=184.75, stdev=81.74
     lat (msec): min=5, max=1987, avg=197.03, stdev=83.84
    clat percentiles (msec):
     |  1.00th=[    7],  5.00th=[    8], 10.00th=[  110], 20.00th=[  143],
     | 30.00th=[  161], 40.00th=[  174], 50.00th=[  188], 60.00th=[  202],
     | 70.00th=[  217], 80.00th=[  237], 90.00th=[  269], 95.00th=[  293],
     | 99.00th=[  363], 99.50th=[  416], 99.90th=[  742], 99.95th=[  922],
     | 99.99th=[ 1991]
    bw (KB /s): min=130620, max=2460047, per=100.00%, avg=250120.43, stdev=212307.87
    lat (msec) : 10=8.04%, 100=0.88%, 250=76.44%, 500=14.31%, 750=0.25%
    lat (msec) : 1000=0.04%, 2000=0.04%
  cpu          : usr=1.10%, sys=2.30%, ctx=1891, majf=0, minf=1096
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.2%, 16=99.7%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=4886/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: io=14658MB, aggrb=249406KB/s, minb=249406KB/s, maxb=249406KB/s, mint=60182msec, maxt=60182msec

Disk stats (read/write):
  sdd: ios=0/58627, merge=0/12, ticks=0/8552722, in_queue=8559550, util=99.84%

Many thanks in advance,
Chris