Hi Chris,

On Tue, 2013-05-14 at 21:22 +0100, Chris Boot wrote:
> Hi folks,
>
> I recently booted my home LIO FC target box (err, yes, home) with a
> 3.9.1 kernel, and found some very strange and erroneous behaviour from
> the target code.
>
> The target appears to accept writes but the data never makes it to disk.
> When the same blocks are read again, the data is the same as it was
> before the write.
>
> This strange behaviour leads me to be able to boot a (FC initiator)
> machine correctly off the target, install a few updates, reboot - just
> to find the machine snapped back to the state it was in before I booted
> it up. I did this three times, and began to question my sanity!
>

Mmmm...

> Creating an LVM logical volume on the initiator machine fails:
>
> parker bootc # lvcreate -L 50g -n test vg_parker
>   Failed to activate new LV.
>   Unable to deactivate failed new LV. Manual intervention required.
>
> This appears to be because LVM writes metadata to the physical volume,
> then reads it back to check everything is sane - and finds that it is not.
>
> If I make lots of filesystem changes (such as dd if=/dev/zero
> of=/var/tmp/zero.bin bs=1M) then drop caches with 'echo 3 >
> /proc/sys/vm/drop_caches', I almost immediately get a journal abort on
> the filesystem.
>
> Note that if I reboot the machine it's snapped back to its previous
> state so I don't actually get any filesystem corruption at all - just
> writes that get written to /dev/null.
>
> I'm currently remote from the target machine so I can't get on with
> major work like a git bisect, but I wanted to post this to the ML as
> soon as possible to get people looking at it.
>

Ok, I'm able to reproduce this on v3.9-rc3 (target-pending/queue), and
the offending change that went into the v3.9-rc1 code is:

commit d0c8b259f8970d39354c1966853363345d401330
Author: Nicholas Bellinger <nab@xxxxxxxxxxxxxxx>
Date:   Tue Jan 29 22:10:06 2013 -0800

    target/iblock: Use backend REQ_FLUSH hint for WriteCacheEnabled status

The problem is that when the underlying struct block_device for an IBLOCK
backend is configured with WCE=1 + DPOFUA=1, the rw = WRITE assignment
no longer occurs in iblock_execute_rw(), and rw = 0 is passed to
iblock_submit_bios().

Note that the WCE=1 + DPOFUA=0, WCE=0 + DPOFUA=1, and WCE=0 + DPOFUA=0
cases are not affected by this regression.

Here's a patch that addresses the WCE=1 + DPOFUA=1 case that you're
hitting. It has been tested against a scsi_debug backend.

Please verify on your setup, and I'll get this fix queued for mainline
ASAP.

Thanks for reporting, BootC!

--nab

diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c
index 07f5f94..aa1620a 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -615,6 +615,8 @@ iblock_execute_rw(struct se_cmd *cmd)
 				rw = WRITE_FUA;
 			else if (!(q->flush_flags & REQ_FLUSH))
 				rw = WRITE_FUA;
+			else
+				rw = WRITE;
 		} else {
 			rw = WRITE;
 		}
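
For readers who want to see why only the WCE=1 + DPOFUA=1 combination is hit, here is a minimal userspace sketch of the write-path decision touched by the patch. It is not the kernel code: the enum values, the function name sketch_select_write_rw(), and the with_fix switch are hypothetical names made up for illustration. It also assumes that WCE=1 corresponds to the queue advertising REQ_FLUSH and DPOFUA=1 to the queue advertising REQ_FUA, and that the first condition in the hunk tests a per-command FUA flag.

```c
#include <stdbool.h>

/* Stand-in values for this sketch only; not the kernel's definitions. */
enum sketch_rw {
	SKETCH_RW_UNSET = 0,	/* models "rw = 0": no write flag was assigned */
	SKETCH_WRITE,
	SKETCH_WRITE_FUA
};

/*
 * Hypothetical model of the write branch in iblock_execute_rw().
 *   queue_fua   - backend queue advertises REQ_FUA (assumed: DPOFUA=1)
 *   queue_flush - backend queue advertises REQ_FLUSH (assumed: WCE=1)
 *   cmd_fua     - assumed per-command Force Unit Access flag
 *   with_fix    - include the "else rw = WRITE" branch added by the patch
 */
static enum sketch_rw sketch_select_write_rw(bool queue_fua, bool queue_flush,
					     bool cmd_fua, bool with_fix)
{
	enum sketch_rw rw = SKETCH_RW_UNSET;

	if (queue_fua) {
		if (cmd_fua)
			rw = SKETCH_WRITE_FUA;
		else if (!queue_flush)
			rw = SKETCH_WRITE_FUA;
		else if (with_fix)
			rw = SKETCH_WRITE;	/* the branch the patch adds */
		/* without the fix, rw stays unset on this path */
	} else {
		rw = SKETCH_WRITE;
	}

	return rw;
}
```

In this model, calling sketch_select_write_rw(true, true, false, false) leaves rw unset, which matches the "rw = 0" symptom described above, while the same call with with_fix set to true yields a plain write. The other three WCE/DPOFUA combinations get a write flag either way.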