Huge reduction in write bandwidth with filesystem vs direct block device

Hi all,

We're working with NVMe storage systems and are seeing a significant reduction in write speed with an XFS filesystem compared with direct access to the block device.

Using a 5-disk software RAID5, we're able to get ~16GB/s write speed direct to the device. If we put an XFS filesystem on the software RAID and run the same fio command (except with --directory=/xfs instead of --filename=/dev/md11), we only get ~2.5GB/s write speed.

Are there any tunables that could improve this? Is a performance degradation this large considered a bug?

The fio runs showing this are below:

******* Direct to /dev/md11 block device
[root@flashstore ~]# fio --filename=/dev/md11 --rw=write --numjobs=32 --size=12G --bs=1M --name=1m --group_reporting
1m: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
...
fio-3.7
Starting 32 processes
Jobs: 26 (f=26): [f(1),_(1),f(16),_(1),f(3),_(2),f(3),_(1),f(2),_(1),f(1)][100.0%][r=0KiB/s,w=0KiB/s][r=0,w=0 IOPS][eta 00m:00s]
1m: (groupid=0, jobs=32): err= 0: pid=74592: Mon Jan 25 12:13:28 2021
  write: IOPS=15.4k, BW=15.0GiB/s (16.1GB/s)(384GiB/25551msec)
    clat (usec): min=230, max=31691, avg=2044.43, stdev=778.36
     lat (usec): min=245, max=31710, avg=2067.97, stdev=783.21
    clat percentiles (usec):
     |  1.00th=[  420],  5.00th=[ 1745], 10.00th=[ 1811], 20.00th=[ 1860],
     | 30.00th=[ 1893], 40.00th=[ 1926], 50.00th=[ 1942], 60.00th=[ 1975],
     | 70.00th=[ 2024], 80.00th=[ 2089], 90.00th=[ 2180], 95.00th=[ 2900],
     | 99.00th=[ 4490], 99.50th=[ 4883], 99.90th=[13829], 99.95th=[14746],
     | 99.99th=[20841]
   bw (  KiB/s): min=400606, max=679936, per=3.13%, avg=492489.85, stdev=53436.36, samples=1632
   iops        : min=  391, max=  664, avg=480.90, stdev=52.18, samples=1632
  lat (usec)   : 250=0.01%, 500=1.72%, 750=0.80%, 1000=0.09%
  lat (msec)   : 2=62.47%, 4=32.77%, 10=1.94%, 20=0.20%, 50=0.01%
  cpu          : usr=1.37%, sys=62.91%, ctx=38028757, majf=0, minf=60496
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,393216,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=15.0GiB/s (16.1GB/s), 15.0GiB/s-15.0GiB/s (16.1GB/s-16.1GB/s), io=384GiB (412GB), run=25551-25551msec

Disk stats (read/write):
    md11: ios=98/2237881, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=10252/73455, aggrmerge=20863/582117, aggrticks=4425/138224, aggrin_queue=130116, aggrutil=17.71%
  nvme2n1: ios=12427/88141, merge=25370/698549, ticks=5030/163534, in_queue=153382, util=16.71%
  nvme3n1: ios=12210/88148, merge=24728/698544, ticks=4979/162745, in_queue=152592, util=16.84%
  nvme4n1: ios=12246/88150, merge=24861/698524, ticks=4875/165703, in_queue=156034, util=16.81%
  nvme5n1: ios=12289/88146, merge=25200/698533, ticks=5013/164900, in_queue=154398, util=16.96%
  nvme6n1: ios=12343/88149, merge=25021/698553, ticks=6655/172464, in_queue=164291, util=17.71%
  nvme22n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

******* mkfs.xfs on /dev/md11 (w/ no flags) and fio run on that mount
[root@flashstore ~]# fio --directory=/xfs --rw=write --numjobs=32 --size=12G --bs=1M --name=1m --group_reporting
1m: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
...
fio-3.7
Starting 32 processes
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
1m: Laying out IO file (1 file / 12288MiB)
Jobs: 11 (f=11): [_(6),W(1),_(4),W(1),_(8),W(1),_(3),W(8)][99.4%][r=0KiB/s,w=1213MiB/s][r=0,w=1213 IOPS][eta 00m:01s]
1m: (groupid=0, jobs=32): err= 0: pid=74782: Mon Jan 25 12:20:32 2021
  write: IOPS=2431, BW=2432MiB/s (2550MB/s)(384GiB/161704msec)
    clat (usec): min=251, max=117777, avg=13006.54, stdev=23856.18
     lat (usec): min=270, max=117787, avg=13027.39, stdev=23851.96
    clat percentiles (usec):
     |  1.00th=[  359],  5.00th=[  371], 10.00th=[  383], 20.00th=[ 408],
     | 30.00th=[  424], 40.00th=[  453], 50.00th=[  537], 60.00th=[ 578],
     | 70.00th=[  619], 80.00th=[55313], 90.00th=[58459], 95.00th=[60556],
     | 99.00th=[63177], 99.50th=[64226], 99.90th=[66323], 99.95th=[68682],
     | 99.99th=[80217]
   bw (  KiB/s): min=55296, max=1054720, per=3.15%, avg=78557.24, stdev=41896.99, samples=10233
   iops        : min=   54, max= 1030, avg=76.67, stdev=40.92, samples=10233
  lat (usec)   : 500=46.14%, 750=31.31%, 1000=0.78%
  lat (msec)   : 2=0.03%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.59%
  lat (msec)   : 100=21.13%, 250=0.01%
  cpu          : usr=0.12%, sys=3.83%, ctx=86515, majf=0, minf=22227
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,393216,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=2432MiB/s (2550MB/s), 2432MiB/s-2432MiB/s (2550MB/s-2550MB/s), io=384GiB (412GB), run=161704-161704msec

Disk stats (read/write):
    md11: ios=1/6097731, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=23774/2849499, aggrmerge=34232/17493878, aggrticks=28040/18125298, aggrin_queue=18363574, aggrutil=80.03%
  nvme2n1: ios=28860/3419298, merge=41127/20992122, ticks=39496/23053174, in_queue=23421586, util=75.76%
  nvme3n1: ios=28440/3419396, merge=41081/20992524, ticks=34881/23067448, in_queue=23411872, util=80.03%
  nvme4n1: ios=28457/3419413, merge=41361/20992713, ticks=30990/21139316, in_queue=21420720, util=78.03%
  nvme5n1: ios=28131/3419446, merge=40331/20992920, ticks=29288/20184431, in_queue=20418749, util=77.00%
  nvme6n1: ios=28759/3419446, merge=41496/20992991, ticks=33587/21307424, in_queue=21508518, util=77.04%
  nvme22n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
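
As a rough cross-check of the two runs above, the reported bandwidth and the overall slowdown can be recomputed from the io= and run= totals in each "Run status" line. This is only a minimal sketch: the helper name bandwidth_mib_s is made up for illustration, and the inputs are copied by hand from the fio output above rather than parsed from it.

```python
# Recompute fio's aggregate write bandwidth from total I/O and runtime.
# Inputs are copied from the "Run status group 0" lines of the two runs.

def bandwidth_mib_s(io_gib: float, runtime_ms: float) -> float:
    """Return bandwidth in MiB/s for io_gib GiB written in runtime_ms ms."""
    return (io_gib * 1024.0) / (runtime_ms / 1000.0)

# Both runs wrote 384 GiB in total (32 jobs x 12 GiB each).
raw_mib_s = bandwidth_mib_s(384, 25551)    # direct to /dev/md11
xfs_mib_s = bandwidth_mib_s(384, 161704)   # XFS on top of /dev/md11

# Same amount of I/O in both runs, so this equals the runtime ratio.
slowdown = raw_mib_s / xfs_mib_s

print(f"block device: {raw_mib_s / 1024:.2f} GiB/s")
print(f"XFS:          {xfs_mib_s:.0f} MiB/s")
print(f"slowdown:     {slowdown:.2f}x")
```

The results should agree with the BW= values fio prints (about 15.0GiB/s and 2432MiB/s), which puts the filesystem run at a bit over six times slower than the raw device run.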


Thanks,
Rick


