On 24/02/26 04:58PM, Luis Chamberlain wrote:
> On Mon, Feb 26, 2024 at 1:16 PM John Groves <John@xxxxxxxxxx> wrote:
> >
> > On 24/02/26 07:53AM, Luis Chamberlain wrote:
> > > On Mon, Feb 26, 2024 at 07:27:18AM -0600, John Groves wrote:
> > > > Run status group 0 (all jobs):
> > > >   WRITE: bw=29.6GiB/s (31.8GB/s), 29.6GiB/s-29.6GiB/s (31.8GB/s-31.8GB/s), io=44.7GiB (48.0GB), run=1511-1511msec
> > >
> > > > This is run on an xfs file system on a SATA ssd.
> > >
> > > To compare more closer apples to apples, wouldn't it make more sense
> > > to try this with XFS on pmem (with fio -direct=1)?
> > >
> > >   Luis
> >
> > Makes sense. Here is the same command line I used with xfs before, but
> > now it's on /dev/pmem0 (the same 128G, but converted from devdax to pmem
> > because xfs requires that.
> >
> > fio -name=ten-256m-per-thread --nrfiles=10 -bs=2M --group_reporting=1 --alloc-size=1048576 --filesize=256MiB --readwrite=write --fallocate=none --numjobs=48 --create_on_open=0 --ioengine=io_uring --direct=1 --directory=/mnt/xfs
>
> Could you try with mkfs.xfs -d agcount=1024
>
>   Luis

$ luis/fio-xfsdax.sh
+ sudo mkfs.xfs -d agcount=1024 -m reflink=0 -f /dev/pmem0
meta-data=/dev/pmem0             isize=512    agcount=1024, agsize=32768 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0    bigtime=1 inobtcount=1 nrext64=0
data     =                       bsize=4096   blocks=33554432, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=16384, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
+ sudo mount -o dax /dev/pmem0 /mnt/xfs
+ sudo chown jmg:jmg /mnt/xfs
+ ls -al /mnt/xfs
total 0
drwxr-xr-x  2 jmg  jmg   6 Feb 26 19:56 .
drwxr-xr-x. 4 root root 30 Feb 26 14:58 ..
++ nproc
+ fio -name=ten-256m-per-thread --nrfiles=10 -bs=2M --group_reporting=1 --alloc-size=1048576 --filesize=256MiB --readwrite=write --fallocate=none --numjobs=48 --create_on_open=0 --ioengine=io_uring --direct=1 --directory=/mnt/xfs
ten-256m-per-thread: (g=0): rw=write, bs=(R) 2048KiB-2048KiB, (W) 2048KiB-2048KiB, (T) 2048KiB-2048KiB, ioengine=io_uring, iodepth=1
...
fio-3.33
Starting 48 processes
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
ten-256m-per-thread: Laying out IO files (10 files / total 2441MiB)
Jobs: 17 (f=170): [_(2),W(1),_(8),W(2),_(7),W(3),_(2),W(2),_(3),W(2),_(2),W(1),_(2),W(1),_(1),W(3),_(4),W(2)][Jobs: 1 (f=10): [_(47),W(1)][100.0%][w=8022MiB/s][w=4011 IOPS][eta 00m:00s]
ten-256m-per-thread: (groupid=0, jobs=48): err= 0: pid=141563: Mon Feb 26 19:56:28 2024
  write: IOPS=6578, BW=12.8GiB/s (13.8GB/s)(114GiB/8902msec); 0 zone resets
    slat (usec): min=18, max=60593, avg=1230.85, stdev=1799.97
    clat (usec): min=2, max=98969, avg=5133.25, stdev=5141.07
     lat (usec): min=294, max=99725, avg=6364.09, stdev=5440.30
    clat percentiles (usec):
     | 1.00th=[ 11], 5.00th=[ 46], 10.00th=[ 217], 20.00th=[ 2376],
     | 30.00th=[ 2999], 40.00th=[ 3556], 50.00th=[ 3785], 60.00th=[ 3982],
     | 70.00th=[ 4228], 80.00th=[ 7504], 90.00th=[13173], 95.00th=[14091],
     | 99.00th=[21890], 99.50th=[27919], 99.90th=[45351], 99.95th=[57934],
     | 99.99th=[82314]
   bw ( MiB/s): min= 5085, max=27367, per=100.00%, avg=14361.95, stdev=165.61, samples=719
   iops : min= 2516, max=13670, avg=7160.17, stdev=82.88, samples=719
  lat (usec) : 4=0.05%, 10=0.72%, 20=2.23%, 50=2.48%, 100=3.02%
  lat (usec) : 250=1.54%, 500=2.37%, 750=1.34%, 1000=0.75%
  lat (msec) : 2=3.20%, 4=43.10%, 10=23.05%, 20=14.81%, 50=1.25%
  lat (msec) : 100=0.08%
  cpu : usr=10.18%, sys=0.79%, ctx=67227, majf=0, minf=38511
  IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,58560,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=12.8GiB/s (13.8GB/s), 12.8GiB/s-12.8GiB/s (13.8GB/s-13.8GB/s), io=114GiB (123GB), run=8902-8902msec

Disk stats (read/write):
  pmem0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

I ran it several times with similar results.

Regards,
John
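
For anyone cross-checking the numbers, here is a minimal sketch that recomputes the headline figures from the raw counters in the output above: 58560 issued writes of 2 MiB over 8902 msec, and the mkfs.xfs geometry of 1024 AGs x 32768 blocks x 4096 bytes. The helper names (fio_throughput, xfs_size_gib) are made up for illustration and are not part of fio or xfsprogs.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def fio_throughput(issued_writes, block_size_bytes, runtime_msec):
    """Derive total I/O, bandwidth and IOPS from fio's raw counters."""
    total_bytes = issued_writes * block_size_bytes
    seconds = runtime_msec / 1000
    return {
        "io_gib": total_bytes / GIB,
        "bw_gib_s": total_bytes / GIB / seconds,
        "iops": issued_writes / seconds,
    }


def xfs_size_gib(agcount, agsize_blocks, block_size_bytes):
    """Filesystem data size implied by the mkfs.xfs AG geometry."""
    return agcount * agsize_blocks * block_size_bytes / GIB


# Values copied from the fio "issued rwts" and "run=" fields above.
run = fio_throughput(issued_writes=58560, block_size_bytes=2 * MIB, runtime_msec=8902)
# Values copied from the mkfs.xfs output above.
fs_gib = xfs_size_gib(agcount=1024, agsize_blocks=32768, block_size_bytes=4096)

print(f"io={run['io_gib']:.1f} GiB  bw={run['bw_gib_s']:.1f} GiB/s  iops={run['iops']:.0f}")
print(f"filesystem size={fs_gib:.0f} GiB, per-AG size={fs_gib * 1024 / 1024:.3f} GiB")
```

The derived bandwidth and IOPS line up with what fio reports (12.8GiB/s, IOPS=6578), and the AG geometry works out to the 128G device mentioned earlier in the thread, so each of the 1024 AGs is 128 MiB.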