Sorry for the delay here. Please see below inline.

> On Mar 17, 2018, at 1:42 AM, Sitsofe Wheeler <sitsofe@xxxxxxxxx> wrote:
>
> Could you repeat the problem on a recent version of fio (see
> https://github.com/axboe/fio/releases for what we're up to)?

Sure. Here are the results with the latest fio.

With LFSR

[root@sm28 fio-master]# fio --name=global --thread=1 --direct=1 --group_reporting=1 --iomem_align=4k --name=PT7 --rw=randrw --rwmixread=100 --iodepth=40 --numjobs=8 --bs=4096 --size=450GiB --runtime=120 --filename='/dev/e8b0:/dev/e8b1:/dev/e8b2:/dev/e8b3:/dev/e8b4:/dev/e8b5:/dev/e8b6:/dev/e8b7:/dev/e8b8:/dev/e8b9:/dev/e8b10:/dev/e8b11:/dev/e8b12:/dev/e8b13:/dev/e8b14:/dev/e8b15' --ioengine=libaio --numa_cpu_nodes=0 --random_generator=lfsr

PT7: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=40
...
fio-3.5-80-gb348
Starting 8 threads
Jobs: 8 (f=128): [r(8)][100.0%][r=3532MiB/s,w=0KiB/s][r=904k,w=0 IOPS][eta 00m:00s]
PT7: (groupid=0, jobs=8): err= 0: pid=28376: Thu Mar 29 07:35:44 2018
  read: IOPS=895k, BW=3496MiB/s (3666MB/s)(410GiB/120002msec)
    slat (nsec): min=1619, max=964358, avg=3682.85, stdev=4806.21
    clat (usec): min=25, max=13002, avg=353.13, stdev=192.55
     lat (usec): min=35, max=13007, avg=356.96, stdev=192.42
    clat percentiles (usec):
     |  1.00th=[   91],  5.00th=[  141], 10.00th=[  169], 20.00th=[  206],
     | 30.00th=[  239], 40.00th=[  269], 50.00th=[  306], 60.00th=[  347],
     | 70.00th=[  400], 80.00th=[  474], 90.00th=[  603], 95.00th=[  742],
     | 99.00th=[ 1020], 99.50th=[ 1123], 99.90th=[ 1352], 99.95th=[ 1434],
     | 99.99th=[ 1647]
   bw (  KiB/s): min=373440, max=481952, per=12.50%, avg=447480.00, stdev=12566.36, samples=1918
   iops        : min=93360, max=120488, avg=111869.99, stdev=3141.58, samples=1918
  lat (usec)   : 50=0.02%, 100=1.45%, 250=32.29%, 500=49.13%, 750=12.35%
  lat (usec)   : 1000=3.60%
  lat (msec)   : 2=1.15%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=17.66%, sys=49.66%, ctx=9942380, majf=0, minf=4898
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=107404316,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=40

Run status group 0 (all jobs):
   READ: bw=3496MiB/s (3666MB/s), 3496MiB/s-3496MiB/s (3666MB/s-3666MB/s), io=410GiB (440GB), run=120002-120002msec

Disk stats (read/write):
  e8b0: ios=6710489/0, merge=0/0, ticks=1818366/0, in_queue=1822176, util=99.74%
  e8b1: ios=6710487/0, merge=0/0, ticks=1993186/0, in_queue=1995781, util=99.75%
  e8b2: ios=6710490/0, merge=0/0, ticks=2076054/0, in_queue=2080427, util=99.76%
  e8b3: ios=6710491/0, merge=0/0, ticks=2101963/0, in_queue=2107744, util=99.80%
  e8b4: ios=6710493/0, merge=0/0, ticks=2167111/0, in_queue=2169552, util=99.79%
  e8b5: ios=6710496/0, merge=0/0, ticks=2149837/0, in_queue=2153109, util=99.85%
  e8b6: ios=6710497/0, merge=0/0, ticks=1966688/0, in_queue=1970940, util=99.85%
  e8b7: ios=6710496/0, merge=0/0, ticks=1984307/0, in_queue=1989317, util=99.87%
  e8b8: ios=6710497/0, merge=0/0, ticks=1985081/0, in_queue=1989662, util=99.88%
  e8b9: ios=6710498/0, merge=0/0, ticks=1995815/0, in_queue=2000669, util=99.92%
  e8b10: ios=6710498/0, merge=0/0, ticks=2005176/0, in_queue=2009368, util=99.94%
  e8b11: ios=6710498/0, merge=0/0, ticks=2022758/0, in_queue=2027682, util=99.99%
  e8b12: ios=6710499/0, merge=0/0, ticks=1996747/0, in_queue=2001118, util=100.00%
  e8b13: ios=6710502/0, merge=0/0, ticks=2034211/0, in_queue=2039490, util=100.00%
  e8b14: ios=6710502/0, merge=0/0, ticks=2035394/0, in_queue=2040469, util=100.00%
  e8b15: ios=6710505/0, merge=0/0, ticks=2010598/0, in_queue=2017014, util=100.00%

Without LFSR

[root@sm28 fio-master]# fio --name=global --thread=1 --direct=1 --group_reporting=1 --iomem_align=4k --name=PT7 --rw=randrw --rwmixread=100 --iodepth=40 --numjobs=8 --bs=4096 --size=450GiB --runtime=120 --filename='/dev/e8b0:/dev/e8b1:/dev/e8b2:/dev/e8b3:/dev/e8b4:/dev/e8b5:/dev/e8b6:/dev/e8b7:/dev/e8b8:/dev/e8b9:/dev/e8b10:/dev/e8b11:/dev/e8b12:/dev/e8b13:/dev/e8b14:/dev/e8b15' --ioengine=libaio --numa_cpu_nodes=0

PT7: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=40
...
fio-3.5-80-gb348
Starting 8 threads
Jobs: 8 (f=128): [r(8)][100.0%][r=3729MiB/s,w=0KiB/s][r=955k,w=0 IOPS][eta 00m:00s]
PT7: (groupid=0, jobs=8): err= 0: pid=28564: Thu Mar 29 07:40:09 2018
  read: IOPS=943k, BW=3684MiB/s (3863MB/s)(432GiB/120007msec)
    slat (nsec): min=1656, max=1601.7k, avg=4342.55, stdev=6792.60
    clat (usec): min=31, max=12646, avg=333.84, stdev=102.52
     lat (usec): min=38, max=12648, avg=338.34, stdev=101.86
    clat percentiles (usec):
     |  1.00th=[  120],  5.00th=[  167], 10.00th=[  204], 20.00th=[  249],
     | 30.00th=[  281], 40.00th=[  310], 50.00th=[  334], 60.00th=[  363],
     | 70.00th=[  392], 80.00th=[  420], 90.00th=[  457], 95.00th=[  482],
     | 99.00th=[  545], 99.50th=[  594], 99.90th=[  865], 99.95th=[  955],
     | 99.99th=[ 1254]
   bw (  KiB/s): min=390520, max=518192, per=12.50%, avg=471560.73, stdev=13752.55, samples=1914
   iops        : min=97630, max=129548, avg=117890.14, stdev=3438.13, samples=1914
  lat (usec)   : 50=0.01%, 100=0.37%, 250=20.22%, 500=76.31%, 750=2.87%
  lat (usec)   : 1000=0.20%
  lat (msec)   : 2=0.03%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=23.47%, sys=59.78%, ctx=10930907, majf=0, minf=299365
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=100.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued rwts: total=113187098,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=40

Run status group 0 (all jobs):
   READ: bw=3684MiB/s (3863MB/s), 3684MiB/s-3684MiB/s (3863MB/s-3863MB/s), io=432GiB (464GB), run=120007-120007msec

Disk stats (read/write):
  e8b0: ios=7070562/0, merge=0/0, ticks=1862741/0, in_queue=1868144, util=99.78%
  e8b1: ios=7070570/0, merge=0/0, ticks=1996123/0, in_queue=1999170, util=99.79%
  e8b2: ios=7070580/0, merge=0/0, ticks=2019351/0, in_queue=2024920, util=99.78%
  e8b3: ios=7070581/0, merge=0/0, ticks=2018430/0, in_queue=2024167, util=99.80%
  e8b4: ios=7070585/0, merge=0/0, ticks=2069985/0, in_queue=2072530, util=99.78%
  e8b5: ios=7070586/0, merge=0/0, ticks=2032085/0, in_queue=2035097, util=99.81%
  e8b6: ios=7070586/0, merge=0/0, ticks=1838167/0, in_queue=1842676, util=99.79%
  e8b7: ios=7070589/0, merge=0/0, ticks=1838162/0, in_queue=1843175, util=99.83%
  e8b8: ios=7070587/0, merge=0/0, ticks=1837259/0, in_queue=1842283, util=99.86%
  e8b9: ios=7070595/0, merge=0/0, ticks=1836983/0, in_queue=1841398, util=99.89%
  e8b10: ios=7070601/0, merge=0/0, ticks=1835967/0, in_queue=1840679, util=99.91%
  e8b11: ios=7070605/0, merge=0/0, ticks=1835374/0, in_queue=1840121, util=99.94%
  e8b12: ios=7070609/0, merge=0/0, ticks=1834964/0, in_queue=1839546, util=99.98%
  e8b13: ios=7070609/0, merge=0/0, ticks=1835191/0, in_queue=1839604, util=100.00%
  e8b14: ios=7070608/0, merge=0/0, ticks=1835130/0, in_queue=1840084, util=100.00%
  e8b15: ios=7070611/0, merge=0/0, ticks=1835606/0, in_queue=1842956, util=100.00%

Initially, I thought there was a clear improvement here, as the latency gap is smaller now. But then I decided to follow your advice and strip the command down to the bare minimum.

> It would also
> help if you strip the line you are using down to the bare minimum that
> still shows the problem (e.g. if you can remove numa, lock it to CPUs
> make it happen on a pure randread workload etc).

When it came to removing flags, I didn't want to remove the --numa_cpu_nodes flag, because in the E8 architecture that may adversely affect latency standard deviation: fio may occasionally get scheduled onto the same core where the e8 driver is running. So the two flags I could get rid of were --iomem_align and --size.
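On the "lock it to CPUs" suggestion: instead of --numa_cpu_nodes, fio's --cpus_allowed (with --cpus_allowed_policy=split) could pin the jobs to an explicit core set and keep them off the driver's core. A sketch only, on the pure randread workload Sitsofe suggested: cores 0-7 and the single-device --filename are placeholders, and which cores the e8 driver actually occupies on sm28 would need checking first.

```shell
# Sketch: pin the 8 jobs to cores 0-7 (one core per job via the "split"
# policy). Core numbers and the single test device are placeholders.
fio --name=PT7 --thread=1 --direct=1 --rw=randread --bs=4096 \
    --iodepth=40 --numjobs=8 --runtime=120 --ioengine=libaio \
    --cpus_allowed=0-7 --cpus_allowed_policy=split \
    --filename=/dev/e8b0
```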
After experimenting a bit, I eventually set out to run a series of tests with every permutation of the two flags, including a test without them. Each permutation was run 3 times and the latencies recorded. Nothing else was running on either the storage controller or the host during the testing.

The base commands were

fio --name=global --thread=1 --direct=1 --group_reporting=1 --name=PT7 --rw=randrw --rwmixread=100 --iodepth=40 --numjobs=8 --bs=4096 --runtime=120 --filename='/dev/e8b0:/dev/e8b1:/dev/e8b2:/dev/e8b3:/dev/e8b4:/dev/e8b5:/dev/e8b6:/dev/e8b7:/dev/e8b8:/dev/e8b9:/dev/e8b10:/dev/e8b11:/dev/e8b12:/dev/e8b13:/dev/e8b14:/dev/e8b15' --ioengine=libaio --random_generator=lfsr

and

fio --name=global --thread=1 --direct=1 --group_reporting=1 --name=PT7 --rw=randrw --rwmixread=100 --iodepth=40 --numjobs=8 --bs=4096 --runtime=120 --filename='/dev/e8b0:/dev/e8b1:/dev/e8b2:/dev/e8b3:/dev/e8b4:/dev/e8b5:/dev/e8b6:/dev/e8b7:/dev/e8b8:/dev/e8b9:/dev/e8b10:/dev/e8b11:/dev/e8b12:/dev/e8b13:/dev/e8b14:/dev/e8b15' --ioengine=libaio

The only difference is whether --random_generator=lfsr is present. The iomem and size flags were --iomem_align=4k and --size=450GiB.

Here are the latency numbers I got:

LFSR
  + iomem + size = 348.07, 347.68, 347.59
  nothing        = 344.28, 344.84, 345.04
  + size         = 349.28, 348.37, 348.37
  + iomem        = 346.81, 346.05, 344.74

NO LFSR
  + iomem + size = 344.03, 344.29, 345.47
  nothing        = 347.05, 346.00, 346.03
  + size         = 345.43, 343.55, 343.98
  + iomem        = 347.87, 347.02, 347.84

It appears that with both flags, LFSR is still behind NO LFSR, even if only minimally. When both flags are omitted, the picture reverses. Looking at the overall picture, I cannot identify a clear winner; it's almost as if the differences are within measurement error.

> If it
> happens there we could do with the output from Linux's perf

I'm not sure what exactly "Linux's perf" means.
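For what it's worth, the permutation sweep described above can be generated from one loop. A sketch, not the exact runs: the device list is collapsed to a single placeholder device, and the eight command lines are only printed here, not executed.

```shell
# Print the 8 permutations of {lfsr} x {iomem_align} x {size} described above.
# Placeholder: --filename is shortened to one device for readability.
base="fio --name=global --thread=1 --direct=1 --group_reporting=1 --name=PT7 --rw=randrw --rwmixread=100 --iodepth=40 --numjobs=8 --bs=4096 --runtime=120 --ioengine=libaio --filename=/dev/e8b0"
n=0
for gen in "" "--random_generator=lfsr"; do
  for align in "" "--iomem_align=4k"; do
    for size in "" "--size=450GiB"; do
      echo "$base $gen $align $size"
      n=$((n+1))
    done
  done
done
```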
Michael