Hi Jeff,
I have done the test you suggested, and this is what came up:
read_throughput_rpool: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
read_throughput_rpool: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [R(1)][100.0%][r=904MiB/s][r=903 IOPS][eta 00m:00s]
read_throughput_rpool: (groupid=0, jobs=1): err= 0: pid=3284411: Fri Apr 5 07:31:18 2024
read: IOPS=1092, BW=1092MiB/s (1145MB/s)(64.0GiB/60001msec)
clat (usec): min=733, max=6031, avg=903.71, stdev=166.68
lat (usec): min=734, max=6032, avg=904.88, stdev=166.84
clat percentiles (usec):
|  1.00th=[  742],  5.00th=[  750], 10.00th=[  750], 20.00th=[  758],
| 30.00th=[  766], 40.00th=[  791], 50.00th=[  824], 60.00th=[  873],
| 70.00th=[ 1037], 80.00th=[ 1106], 90.00th=[ 1139], 95.00th=[ 1188],
| 99.00th=[ 1254], 99.50th=[ 1287], 99.90th=[ 1401], 99.95th=[ 1500],
| 99.99th=[ 2474]
bw ( MiB/s): min= 871, max= 1270, per=100.00%, avg=1092.97, stdev=166.88, samples=120
iops       : min= 871, max= 1270, avg=1092.78, stdev=166.95, samples=120
lat (usec) : 750=6.17%, 1000=60.30%
lat (msec) : 2=33.51%, 4=0.02%, 10=0.01%
cpu : usr=2.04%, sys=97.52%, ctx=2457, majf=0, minf=37
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=65543,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: bw=1092MiB/s (1145MB/s), 1092MiB/s-1092MiB/s (1145MB/s-1145MB/s), io=64.0GiB (68.7GB), run=60001-60001msec
Am I right to assume the file was being cached? 1145 MB/s from a single SSD is hard to believe. (A sketch of how I could rule the ARC out follows the write results below.) For the write test, these are the results:
write_throughput_rpool: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
fio-3.33
Starting 1 process
write_throughput_rpool: Laying out IO file (1 file / 1024MiB)
Jobs: 1 (f=1): [W(1)][100.0%][w=4100KiB/s][w=4 IOPS][eta 00m:00s]
write_throughput_rpool: (groupid=0, jobs=1): err= 0: pid=3292023: Fri Apr 5 07:36:11 2024
write: IOPS=5, BW=5394KiB/s (5524kB/s)(317MiB/60174msec); 0 zone resets
clat (msec): min=16, max=1063, avg=189.59, stdev=137.19
lat (msec): min=16, max=1063, avg=189.81, stdev=137.19
clat percentiles (msec):
|  1.00th=[   18],  5.00th=[   22], 10.00th=[   41], 20.00th=[   62],
| 30.00th=[  129], 40.00th=[  144], 50.00th=[  163], 60.00th=[  190],
| 70.00th=[  228], 80.00th=[  268], 90.00th=[  363], 95.00th=[  489],
| 99.00th=[  567], 99.50th=[  592], 99.90th=[ 1062], 99.95th=[ 1062],
| 99.99th=[ 1062]
bw ( KiB/s): min= 2048, max=41042, per=100.00%, avg=5534.14, stdev=5269.61, samples=117
iops       : min=    2, max=   40, avg= 5.40, stdev= 5.14, samples=117
lat (msec) : 20=3.47%, 50=11.99%, 100=8.83%, 250=51.42%, 500=20.19%
lat (msec) : 750=3.79%, 2000=0.32%
cpu : usr=0.11%, sys=0.52%, ctx=2673, majf=0, minf=38
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,317,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=5394KiB/s (5524kB/s), 5394KiB/s-5394KiB/s (5524kB/s-5524kB/s), io=317MiB (332MB), run=60174-60174msec
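Coming back to the read number: to rule the ARC out, I could limit caching on the dataset that holds the test file and rerun the read job. This is only a rough sketch of what I have in mind (rpool/fiotest is a placeholder for wherever the test file actually lives):

    # cache only metadata on this dataset, so data reads must come from the SSD
    zfs set primarycache=metadata rpool/fiotest

    fio --name=read_throughput_rpool --directory=/rpool/fiotest \
        --rw=read --bs=1M --size=1G --numjobs=1 --time_based --runtime=60s \
        --ioengine=psync --iodepth=1 --verify=0 --group_reporting=1

    # restore the inherited default afterwards
    zfs inherit primarycache rpool/fiotest

(Or, more simply, I could make --size a few times larger than the machine's RAM so the whole file cannot fit in the ARC.)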
The writes, meanwhile, are still at about 6 MB/s. Should I move this discussion to the OpenZFS team, or is there anything else still worth testing?
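In case it is worth one more round here first, these are the checks I had in mind on the write side; again just a sketch (the directory path is a placeholder, and the sync=always / background-scrub ideas are only untested suspicions on my part):

    # is the pool healthy and idle, i.e. no scrub or resilver competing for I/O?
    zpool status rpool
    zpool iostat -v rpool 5

    # dataset settings that commonly affect streaming write throughput
    zfs get sync,compression,recordsize,logbias,primarycache rpool

    # repeat the write job, flushing once at the end so the number reflects
    # data actually written out rather than just buffered in memory
    fio --name=write_throughput_rpool --directory=/rpool/fiotest \
        --rw=write --bs=1M --size=1G --numjobs=1 --time_based --runtime=60s \
        --ioengine=psync --iodepth=1 --end_fsync=1 --group_reporting=1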
Thank you very much!
---
Felix Rubio
"Don't believe what you're told. Double check."
On 2024-04-04 21:07, Jeff Johnson wrote:
> Felix,
>
> Based on your previous emails about the drive it would appear that the
> hardware (ssd, cables, port) are fine and the drive performs.
>
> Go back and run your original ZFS test on your mounted ZFS volume
> directory and remove the "--direct=1" from your command as ZFS does
> not yet support direct_io and disabling buffered io to the zfs
> directory will have very negative impacts. This is a ZFS thing, not
> your kernel or hardware.
>
> --Jeff
>
> On Thu, Apr 4, 2024 at 12:00 PM Felix Rubio <felix@xxxxxxxxx> wrote:
>>
>> hey Jeff,
>>
>> Good catch! I have run the following command:
>>
>> fio --name=seqread --numjobs=1 --time_based --runtime=60s
>> --ramp_time=2s
>> --iodepth=8 --ioengine=libaio --direct=1 --verify=0
>> --group_reporting=1
>> --bs=1M --rw=read --size=1G --filename=/dev/sda
>>
>> (/sda and /sdd are the drives I have on one of the pools), and this is
>> what I get:
>>
>> seqread: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=8
>> fio-3.33
>> Starting 1 process
>> Jobs: 1 (f=1): [R(1)][100.0%][r=388MiB/s][r=388 IOPS][eta 00m:00s]
>> seqread: (groupid=0, jobs=1): err= 0: pid=2368687: Thu Apr 4 20:56:06 2024
>> read: IOPS=382, BW=383MiB/s (401MB/s)(22.4GiB/60020msec)
>> slat (usec): min=17, max=3098, avg=68.94, stdev=46.04
>> clat (msec): min=14, max=367, avg=20.84, stdev= 6.61
>> lat (msec): min=15, max=367, avg=20.91, stdev= 6.61
>> clat percentiles (msec):
>> |  1.00th=[   21],  5.00th=[   21], 10.00th=[   21], 20.00th=[   21],
>> | 30.00th=[   21], 40.00th=[   21], 50.00th=[   21], 60.00th=[   21],
>> | 70.00th=[   21], 80.00th=[   21], 90.00th=[   21], 95.00th=[   21],
>> | 99.00th=[   25], 99.50th=[   31], 99.90th=[   48], 99.95th=[   50],
>> | 99.99th=[  368]
>> bw ( KiB/s): min=215040, max=399360, per=100.00%, avg=392047.06, stdev=19902.89, samples=120
>> iops       : min=  210, max=  390, avg=382.55, stdev=19.43, samples=120
>> lat (msec) : 20=0.19%, 50=99.80%, 100=0.01%, 500=0.03%
>> cpu : usr=0.39%, sys=1.93%, ctx=45947, majf=0, minf=37
>> IO depths : 1=0.0%, 2=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
>> submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>> complete  : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>> issued rwts: total=22954,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>> latency : target=0, window=0, percentile=100.00%, depth=8
>>
>> Run status group 0 (all jobs):
>> READ: bw=383MiB/s (401MB/s), 383MiB/s-383MiB/s (401MB/s-401MB/s), io=22.4GiB (24.1GB), run=60020-60020msec
>>
>> Disk stats (read/write):
>> sda: ios=23817/315, merge=0/0, ticks=549704/132687, in_queue=683613, util=99.93%
>>
>> 400 MBps!!! This is a number I have never experienced. I understand
>> this
>> means I need to go back to the openzfs chat/forum?
>>
>> ---
>> Felix Rubio
>> "Don't believe what you're told. Double check."
>>
>
>
> --
> ------------------------------
> Jeff Johnson
> Co-Founder
> Aeon Computing
>
> jeff.johnson@xxxxxxxxxxxxxxxxx
> www.aeoncomputing.com
> t: 858-412-3810 x1001 f: 858-412-3845
> m: 619-204-9061
>
> 4170 Morena Boulevard, Suite C - San Diego, CA 92117
>
> High-Performance Computing / Lustre Filesystems / Scale-out Storage