data verification error with ioengine=libaio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I'm consistently hitting a data verification error for random reads when using the libaio ioengine. Am I hitting a bug in fio or in the linux storage stack (aio/block/scsi)? Maybe an underlying hardware issue (SAS backplane)? Any recommendations on digging further into this issue? It seems to only occur with random reads of smaller block sizes (~4k). I have not seen the problem with sequential reads, and running the same test with 512b and 64kb blocks did not produce the verification error. Similarly, using the sg ioengine instead passed all tests, however I could not achieve as high iops as aio (segfaults and crashes when iodepth > 2). High iops could be a contributor to the problem.

There are no problems with the drive; SMART data is clean and as I mentioned sequential reads are ok and random reads with sg passed. I'm going to try lowering the iodepth value (to something like 4), but that will certainly hurt iops for random reads.

Any input is appreciated.

Here's my jobfile (similar to the suface-scan.fio example):

[global]
filename=/dev/sdf
direct=1
size=10g
ba=512
verify=meta
verify_pattern=0xdeadbeef
verify_interval=512
verify_fatal=1
ioengine=libaio
iodepth=64
iodepth_batch_complete=0

[write-phase]
bs=1M
rw=write
do_verify=0

[seq-verify-phase]
stonewall
create_serialize=0
bs=4k
rw=read
do_verify=1

[rand-read]
stonewall
create_serialize=0
bs=4k
rw=randread
do_verify=1

And here's the program output:

write-phase: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=64
seq-verify-phase: (g=1): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio: Any use of blockalign= turns off randommap
rand-read: (g=2): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.1.2-21-gf7133
Starting 3 processes
verify: bad magic header 61b3, wanted acca at file /dev/sdf offset 14562280616, length 1773542936
fio: pid=22326, err=84/file:io_u.c:1613, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character

write-phase: (groupid=0, jobs=1): err= 0: pid=22288: Wed Sep 11 11:44:30 2013
  write: io=10240MB, bw=158194KB/s, iops=154, runt= 66284msec
    slat (usec): min=31, max=115, avg=82.11, stdev= 8.04
    clat (msec): min=47, max=845, avg=414.02, stdev=27.16
     lat (msec): min=47, max=845, avg=414.10, stdev=27.16
    clat percentiles (msec):
     |  1.00th=[  396],  5.00th=[  396], 10.00th=[  396], 20.00th=[  404],
     | 30.00th=[  408], 40.00th=[  412], 50.00th=[  416], 60.00th=[  420],
     | 70.00th=[  420], 80.00th=[  420], 90.00th=[  424], 95.00th=[  429],
     | 99.00th=[  429], 99.50th=[  529], 99.90th=[  783], 99.95th=[  816],
     | 99.99th=[  840]
    bw (KB  /s): min=72868, max=165556, per=99.73%, avg=157766.60, stdev=8348.98
    lat (msec) : 50=0.01%, 100=0.05%, 250=0.23%, 500=99.16%, 750=0.39%
    lat (msec) : 1000=0.16%
  cpu          : usr=24.48%, sys=75.37%, ctx=6700, majf=0, minf=28
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.2%, 32=0.3%, >=64=99.4%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.6%, 4=99.4%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=0/w=10240/d=0, short=r=0/w=0/d=0
seq-verify-phase: (groupid=1, jobs=1): err= 0: pid=22297: Wed Sep 11 11:44:30 2013
  read : io=10240MB, bw=52693KB/s, iops=13173, runt=198997msec
    slat (usec): min=2, max=34, avg= 3.45, stdev= 0.63
    clat (usec): min=4004, max=10817, avg=4850.99, stdev=213.17
     lat (usec): min=4008, max=10828, avg=4854.55, stdev=213.16
    clat percentiles (usec):
     |  1.00th=[ 4448],  5.00th=[ 4512], 10.00th=[ 4576], 20.00th=[ 4704],
     | 30.00th=[ 4768], 40.00th=[ 4768], 50.00th=[ 4832], 60.00th=[ 4896],
     | 70.00th=[ 4960], 80.00th=[ 5024], 90.00th=[ 5152], 95.00th=[ 5216],
     | 99.00th=[ 5344], 99.50th=[ 5408], 99.90th=[ 5472], 99.95th=[ 5472],
     | 99.99th=[ 5536]
    bw (KB  /s): min=52064, max=53600, per=100.00%, avg=52697.87, stdev=230.19
    lat (msec) : 10=100.00%, 20=0.01%
  cpu          : usr=27.17%, sys=72.69%, ctx=20115, majf=0, minf=83
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=2621440/w=0/d=0, short=r=0/w=0/d=0
rand-read: (groupid=2, jobs=1): err=84 (file:io_u.c:1613, func=io_u_queued_complete, error=Invalid or incomplete multibyte or wide character): pid=22326: Wed Sep 11 11:44:30 2013
  read : io=8633.5MB, bw=1595.1KB/s, iops=398, runt=5539409msec
    slat (usec): min=2, max=41, avg= 4.20, stdev= 0.97
    clat (msec): min=5, max=2220, avg=160.40, stdev=90.77
     lat (msec): min=5, max=2220, avg=160.40, stdev=90.77
    clat percentiles (msec):
     |  1.00th=[   81],  5.00th=[   86], 10.00th=[   90], 20.00th=[   97],
     | 30.00th=[  105], 40.00th=[  116], 50.00th=[  129], 60.00th=[  147],
     | 70.00th=[  172], 80.00th=[  208], 90.00th=[  273], 95.00th=[  338],
     | 99.00th=[  510], 99.50th=[  586], 99.90th=[  775], 99.95th=[  857],
     | 99.99th=[ 1057]
    bw (KB  /s): min=  922, max= 1764, per=100.00%, avg=1597.01, stdev=55.23
    lat (msec) : 10=0.01%, 20=0.01%, 50=0.01%, 100=24.34%, 250=63.06%
    lat (msec) : 500=11.51%, 750=0.97%, 1000=0.11%, 2000=0.02%, >=2000=0.01%
  cpu          : usr=24.33%, sys=75.53%, ctx=559965, majf=0, minf=114
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=2210157/w=0/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=10240MB, aggrb=158194KB/s, minb=158194KB/s, maxb=158194KB/s, mint=66284msec, maxt=66284msec

Run status group 1 (all jobs):
   READ: io=10240MB, aggrb=52693KB/s, minb=52693KB/s, maxb=52693KB/s, mint=198997msec, maxt=198997msec

Run status group 2 (all jobs):
   READ: io=8633.5MB, aggrb=1595KB/s, minb=1595KB/s, maxb=1595KB/s, mint=5539409msec, maxt=5539409msec

Disk stats (read/write):
  sdf: ios=4831582/10240, merge=0/10240, ticks=367243127/4226709, in_queue=371473715, util=100.00%


Here's the same test using ioengine=sg.

Job file:

[global]
filename=/dev/sg6        
size=10g 
verify=meta
verify_pattern=0xdeadbeef
verify_interval=512
verify_fatal=1
ioengine=sg
iodepth=2
iodepth_batch_complete=0
                         
[write-phase] 
bs=1M 
rw=write 
do_verify=0 
                         
[seq-verify-phase] 
stonewall 
create_serialize=0 
bs=4k 
rw=read 
do_verify=1 
                         
[rand-read] 
stonewall 
create_serialize=0 
bs=4k 
rw=randread 
do_verify=1 

Fio output:

write-phase: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=sg, iodepth=2
seq-verify-phase: (g=1): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sg, iodepth=2
rand-read: (g=2): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sg, iodepth=2
fio-2.1.2-21-gf7133
Starting 3 processes
Jobs: 1 (f=1): [__V] [100.0% done] [5824KB/0KB/0KB /s] [1456/0/0 iops] [eta 00m:01s]
write-phase: (groupid=0, jobs=1): err= 0: pid=14522: Wed Sep 11 01:56:39 2013
  write: io=10240MB, bw=158374KB/s, iops=154, runt= 66209msec
    slat (usec): min=0, max=34, avg= 0.75, stdev= 0.82
    clat (msec): min=11, max=69, avg=12.72, stdev= 1.28
     lat (msec): min=11, max=69, avg=12.72, stdev= 1.28
    clat percentiles (usec):
     |  1.00th=[11840],  5.00th=[11840], 10.00th=[11840], 20.00th=[12096],
     | 30.00th=[12096], 40.00th=[12480], 50.00th=[12608], 60.00th=[12608],
     | 70.00th=[12864], 80.00th=[13120], 90.00th=[13376], 95.00th=[13376],
     | 99.00th=[20608], 99.50th=[21376], 99.90th=[21632], 99.95th=[21632],
     | 99.99th=[46848]
    bw (KB  /s): min=136126, max=165556, per=100.00%, avg=158540.06, stdev=3881.00
    lat (msec) : 20=98.70%, 50=1.29%, 100=0.01%
  cpu          : usr=3.34%, sys=4.29%, ctx=14438, majf=0, minf=28
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=10240/d=0, short=r=0/w=0/d=0
seq-verify-phase: (groupid=1, jobs=1): err= 0: pid=14530: Wed Sep 11 01:56:39 2013
  read : io=10240MB, bw=46730KB/s, iops=11682, runt=224389msec
    slat (usec): min=0, max=22, avg= 0.21, stdev= 0.43
    clat (usec): min=111, max=9616, avg=160.47, stdev=28.63
     lat (usec): min=111, max=9616, avg=160.68, stdev=28.63
    clat percentiles (usec):
     |  1.00th=[  123],  5.00th=[  131], 10.00th=[  131], 20.00th=[  139],
     | 30.00th=[  157], 40.00th=[  159], 50.00th=[  159], 60.00th=[  161],
     | 70.00th=[  165], 80.00th=[  175], 90.00th=[  181], 95.00th=[  195],
     | 99.00th=[  278], 99.50th=[  334], 99.90th=[  378], 99.95th=[  386],
     | 99.99th=[  426]
    bw (KB  /s): min=45560, max=47360, per=100.00%, avg=46734.54, stdev=271.85
    lat (usec) : 250=98.40%, 500=1.60%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 10=0.01%
  cpu          : usr=16.10%, sys=17.18%, ctx=2639195, majf=0, minf=28
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=2621440/w=0/d=0, short=r=0/w=0/d=0
rand-read: (groupid=2, jobs=1): err= 0: pid=14617: Wed Sep 11 01:56:39 2013
  read : io=10240MB, bw=782641B/s, iops=191, runt=13719455msec
    slat (usec): min=0, max=456, avg= 0.36, stdev= 0.67
    clat (usec): min=151, max=155751, avg=10453.37, stdev=8661.75
     lat (usec): min=151, max=155752, avg=10453.73, stdev=8661.75
    clat percentiles (usec):
     |  1.00th=[  195],  5.00th=[ 2672], 10.00th=[ 3216], 20.00th=[ 4256],
     | 30.00th=[ 5472], 40.00th=[ 6688], 50.00th=[ 7840], 60.00th=[ 9024],
     | 70.00th=[10560], 80.00th=[15808], 90.00th=[21632], 95.00th=[27264],
     | 99.00th=[43264], 99.50th=[49920], 99.90th=[64256], 99.95th=[69120],
     | 99.99th=[84480]
    bw (KB  /s): min=  608, max= 6816, per=100.00%, avg=764.59, stdev=108.66
    lat (usec) : 250=1.75%, 500=0.11%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.41%, 4=15.29%, 10=49.53%, 20=21.91%, 50=10.50%
    lat (msec) : 100=0.49%, 250=0.01%
  cpu          : usr=0.39%, sys=0.41%, ctx=2681382, majf=0, minf=27
  IO depths    : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.1%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=2621440/w=0/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=10240MB, aggrb=158373KB/s, minb=158373KB/s, maxb=158373KB/s, mint=66209msec, maxt=66209msec

Run status group 1 (all jobs):
   READ: io=10240MB, aggrb=46730KB/s, minb=46730KB/s, maxb=46730KB/s, mint=224389msec, maxt=224389msec

Run status group 2 (all jobs):
   READ: io=10240MB, aggrb=764KB/s, minb=764KB/s, maxb=764KB/s, mint=13719455msec, maxt=13719455msec
--
To unsubscribe from this list: send the line "unsubscribe fio" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux