Meta verification regression starting with fio 2.1.5

Stoo Davies <sdavies@xxxxxxxxxx> · Tue, 6 May 2014 11:25:02 -0700

I'm doing some powerfail recovery testing on a storage array over iSCSI.
The host is RHEL 6.4, kernel 2.6.32-358.el6.x86_64.

With fio 2.1.2 through 2.1.4, the job file below rides through the disks
going away and continues I/O after they come back, without reporting any
errors. With fio 2.1.5 through 2.1.8, fio reports a meta verification
error as soon as the disks come back.

I captured a trace with a Finisar analyzer and can see that, after the 
disks come back and the host logs back in, a read is issued for an LBA 
that was never written to.
Since I don't see verification errors outside of the powerfail testing, 
I suspect fio isn't correctly handling failed writes during the time the 
disks are unavailable.
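
The suspected failure mode can be sketched as follows. This is a
hypothetical model in Python, not fio's actual code: the VerifyLog class
and its method names are invented for illustration, and the assumption is
that a write which fails with an ignored error (as allowed by ignore_error
in the job file) should still be excluded from later verification.

```python
import errno


class VerifyLog:
    """Hypothetical model: tracks which offsets may be verified later."""

    def __init__(self, ignored_errors):
        # Errors the job tolerates (compare ignore_error in the job file).
        self.ignored_errors = set(ignored_errors)
        self.verifiable = set()

    def on_write_complete(self, offset, error=0):
        """Record a write completion; error is 0 on success, else an errno."""
        if error == 0:
            self.verifiable.add(offset)
            return True
        # A failed write leaves unknown data on disk, so the block must not
        # be verified, even when the error itself is tolerated by the job.
        self.verifiable.discard(offset)
        if error in self.ignored_errors:
            return True
        raise OSError(error, "unignored write error")

    def next_verify_offsets(self):
        """Offsets that are safe to read back and verify."""
        return sorted(self.verifiable)


log = VerifyLog(ignored_errors=[errno.ENODEV, errno.EIO])
log.on_write_complete(0)                # succeeded before the power failure
log.on_write_complete(8192, errno.EIO)  # failed while the disks were away
log.on_write_complete(16384)            # succeeded after the disks came back
print(log.next_verify_offsets())        # [0, 16384]
```

If the failed offset (8192 here) were left in the log, a later verify read
would hit a block that never received its header, which would match the
read seen in the trace.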

The trace file is rather large, but I can make it available if you need 
to see it.

[whee]
bs=8k
thread=4
time_based=1
runtime=864000
readwrite=randrw
direct=1
iodepth=128
ioengine=libaio
size=100%
verify=meta
do_verify=1
verify_fatal=1
verify_dump=1
verify_backlog=8192
buffer_compress_percentage=95
ignore_error=ENODEV:EIO,ENODEV:EIO,ENODEV:EIO
filename=/dev/mapper/lun0
.
.
filename=/dev/mapper/lun9
