On Mon, Jun 20, 2016 at 3:31 PM, Jan Kara <jack@xxxxxxx> wrote: > Hi, > > On Fri 17-06-16 12:04:05, Tejun Heo wrote: >> 43d1c0eb7e11 ("block: detach bdev inode from its wb in >> __blkdev_put()") detached bdev inode from its wb as the bdev inode may >> outlive the underlying bdi and thus the wb. This is accomplished by >> invoking inode_detach_wb() from __blkdev_put(); however, while the >> inode can't be dirtied by the time control reaches there, that doesn't >> guarantee that writeback isn't already in progress on the inode. This >> can lead to the inode being disassociated from its wb while writeback >> operation is in flight causing oopses like the following. > > <snip> > >> Fix it by flushing the wb dwork before detaching the inode. Combined >> with the fact that the inode can no longer be dirtied, this guarantees >> that no writeback operation can be in flight or initiated. > > Sorry for the late reply but now when thinking about the patch I don't > think it is quite right. Writeback can happen from other contexts than just > the worker one (e.g. kswapd can do writeback, or in some out-of-memory > situations we may punt to doing writeback directly instead of calling the > worker, or sync(2) calls fdatawrite() for block device inode directly when > iterating through blockdev superblock). So flushing the workqueue IMHO is > not covering 100% of cases. > > I wanted to suggest to use inode_wait_for_writeback() which will make sure > writeback is done with the inode. However we effectively already do this > by calling bdev_write_inode() which calls writeback_single_inode() in > WB_SYNC_ALL mode. So by the time that call completes we are sure writeback > code is not looking at the inode. *However* what I think is happening is > that sync_blockdev() writes all the dirty pages, bdev_write_inode() writes > the inode and clears all dirty bits, however the inode still stays in the > b_dirty / b_io list of the wb because it has I_DIRTY_TIME set. Subsequently > once flusher work runs, it finds the inode, looks at it and boom. And the > problem seems to be that write_inode_now(inode, true) does not result in > I_DIRTY_TIME being cleared. > > Attached patch should fix this issue - it is compile-tested only. Dmitry, > can you check whether this patch fixes the issue for you as well? I can't directly test it because crash happened very infrequently. If Tejun/Ted agree that it is the right way to fix it, then I can patch it, restart the fuzzer and leave it running for a while. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html