On Thu, Jun 21, 2012 at 10:25:02AM -0400, Alan Stern wrote:
> On Thu, 21 Jun 2012, Dave Chinner wrote:
>
> > > > As it is, I think that invalidate_partition() is doing something
> > > > somewhat insane for a block device that has been removed - you can't
> > > > write to it so fsync_bdev() is useless.
> > >
> > > That depends. If by "removed" you mean physically disconnected from
> > > the computer, then yes. But if "removed" means merely unregistered
> > > from the device core then writes can still succeed.
> > > invalidate_partition() doesn't know which has happened.
> >
> > Which means the lower layers probably need to pass that distinction
> > up to the invalidation function.
>
> I don't think that information is passed anywhere in the kernel. And
> in any case, it's not really important. When a device is unregistered,
> the upper layers shouldn't care about the reason why.

Then why have filesystem developers been asking for notifications
from the block layer that the device has been disconnected for the
past couple of LSF summits? :)

Because we'd much prefer to know that part of the filesystem has
just disappeared and can't be used, rather than get back errors
every time we try to send an IO to that region of the filesystem.
IO errors can be transient - disconnected block devices are not -
and so being able to tell the difference is important to handling
storage errors in a robust manner.

Think about BTRFS - knowing that a leg of an internal mirror has
been pulled out means it can select the other leg for all its
metadata IO rather than just getting IO errors from the missing leg,
and that it can perhaps allocate a region on another device to
mirror all new metadata and avoid the problem altogether.

IOWs, there are plenty of good reasons for knowing that a device has
been disconnected at the higher layers of the storage stack....

> > > > And another question - why doesn't having an active filesystem on a
> > > > block device (i.e. an active reference to the gendisk) prevent the
> > > > block device from being removed from underneath it?
> > >
> > > References prevent data structures from being deallocated, not from
> > > being unregistered (or as James Bottomley likes to call it, "removed
> > > from visibility").
> >
> > Except the unregister path appears to assume that a valid block
> > device is available when it is unregistered.
>
> It may very well be available during the unregistration procedure.
> There's nothing wrong with assuming it is -- if it isn't, I/O attempts
> will simply fail.

It's clear that it isn't available, and you're assuming that IO
attempts are possible and that they will fail. If that assumption
was always valid, then we wouldn't have got this bug report....

> > That seems to me like
> > there is a bad assumption being made in this error handling path...
>
> No; a bad assumption would be if the code assumed the device was
> available _after_ the unregistration call had completed.

It's known to be unavailable *during* the unregistration call, and
that code is assuming it is available. When a device is forcibly
unplugged from underneath an active filesystem, there is no
guarantee that the filesystem can extract itself from the mess that
this leaves behind, and assuming that it can is just wrong...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
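
The distinction argued for above, between an I/O error that may be transient and a disconnection that is permanent, can be modelled in a few lines. What follows is only a user-space sketch in Python, not kernel code and not how BTRFS is implemented. Every name in it (Mirror, MirrorLeg, on_io_error, on_device_disconnected, pick_leg) is hypothetical, and it assumes the lower layer can deliver a "device disconnected" notification, which is exactly the facility being requested in this thread.

```python
from dataclasses import dataclass
from enum import Enum, auto


class LegState(Enum):
    ONLINE = auto()  # device present; I/O may still fail transiently
    GONE = auto()    # device disconnected; it will not come back on its own


@dataclass
class MirrorLeg:
    name: str
    state: LegState = LegState.ONLINE
    io_errors: int = 0


class Mirror:
    """Toy mirror used only for illustration; all names are hypothetical."""

    def __init__(self, legs):
        self.legs = list(legs)

    def _find(self, name):
        for leg in self.legs:
            if leg.name == name:
                return leg
        raise KeyError(name)

    def on_io_error(self, name):
        # A plain I/O error may be transient, so only count it.
        self._find(name).io_errors += 1

    def on_device_disconnected(self, name):
        # Assumed notification from the lower layer: the leg is gone for good.
        self._find(name).state = LegState.GONE

    def pick_leg(self):
        # Return the first leg that has not been disconnected, or None.
        for leg in self.legs:
            if leg.state is LegState.ONLINE:
                return leg
        return None


mirror = Mirror([MirrorLeg("leg0"), MirrorLeg("leg1")])
mirror.on_io_error("leg0")
print(mirror.pick_leg().name)  # leg0: an I/O error alone does not retire the leg
mirror.on_device_disconnected("leg0")
print(mirror.pick_leg().name)  # leg1: I/O moves to the surviving leg
```

Without the disconnect notification, the only signal available to the upper layer is the error counter, and it has to guess from that whether retrying the same leg is worthwhile.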