I'm doing some powerfail recovery testing on a storage array over iSCSI.
Host is RHEL 6.4 kernel 2.6.32-358.el6.x86_64.
With fio 2.1.2 -> 2.1.4 the job file below rides through the disks going
away, and continues I/O after they come back, without reporting any errors.
With fio 2.1.5 -> 2.1.8 when the disks come back fio immediately reports
a meta verification error.
I captured a trace with an finisar analyzer, and can see that after the
disks come back and the host logs back in, a read is issued for an lba
which was never written to.
Since I don't see verification errors outside of the powerfail testing,
I suspect fio isn't correctly handling failed writes during the time the
disks are unavailable.
The trace file is rather large, but I can make it available if you need
to see it.
[whee]
bs=8k
thread=4
time_based=1
runtime=864000
readwrite=randrw
direct=1
iodepth=128
ioengine=libaio
size=100%
verify=meta
do_verify=1
verify_fatal=1
verify_dump=1
verify_backlog=8192
buffer_compress_percentage=95
ignore_error=ENODEV:EIO,ENODEV:EIO,ENODEV:EIO
filename=/dev/mapper/lun0
.
.
filename=/dev/mapper/lun9
--
To unsubscribe from this list: send the line "unsubscribe fio" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html