Bug in fua code

Ming,

I recently discovered a bug in the FUA code - a recent bcachefs change exposed
it - and my best guess is it's related to your recent changes to blk-flush.c.

What I'm seeing is that if all writes are issued as FUA writes, within a short
period of time the request queue gets stuck - writes are on the queue but they
aren't being issued or completed. This is with an AHCI device - so no blk-mq,
and it's emulating FUA with flushes.

You ought to be able to reproduce this yourself by changing
generic_make_request() to make all writes FUA, and then just doing O_DIRECT
writes with dd or something. I suspect that if there are non-FUA flushes being
issued, they'll end up kicking the queue and keeping things from getting stuck.
In my testing I'm only seeing things get completely stuck when testing bcachefs
in multi-device mode, with no metadata or journal IO to the device in question,
just FUA data writes.
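
For the userspace half of that reproduction, something along these lines could
stand in for dd. It is only a rough sketch: write_blocks() and its parameters
are hypothetical names, the 4096-byte alignment is an assumption about the
device's logical block size, and the kernel-side change that forces every write
to be FUA is not shown here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original report): write `count` blocks
 * of `block_size` bytes to `path`. Returns the number of blocks written,
 * which is less than `count` if a write fails or is short, or -1 if the
 * buffer or file could not be set up. Pass O_DIRECT in `extra_flags` to
 * bypass the page cache.
 */
long write_blocks(const char *path, int extra_flags,
                  size_t block_size, long count)
{
    void *buf = NULL;
    long done = 0;
    int fd;

    /* O_DIRECT needs an aligned buffer; 4096 is an assumed safe alignment. */
    if (posix_memalign(&buf, 4096, block_size) != 0)
        return -1;
    memset(buf, 0xab, block_size);

    fd = open(path, O_WRONLY | O_CREAT | extra_flags, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    while (done < count) {
        ssize_t n = write(fd, buf, block_size);
        if (n != (ssize_t)block_size)
            break;      /* short write or error: stop and report progress */
        done++;
    }

    close(fd);
    free(buf);
    return done;
}
```

Calling write_blocks("/path/to/testfile", O_DIRECT, 4096, 100000) on a file
that lives on the affected device should generate roughly the same kind of
load as dd with oflag=direct. The path and the counts are placeholders.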

After things get stuck, with kgdb I'm seeing a request on the request queue that
has flush_data_end_io for its endio function. I'm still trying to figure out how
the flush machinery is supposed to work, so I don't know what else you'd want to
know.

Much appreciated if you could take a look.