On Thu, Apr 27, 2023 at 10:20:28AM +0800, Ming Lei wrote:
> Hello Guys,
>
> I got one report in which buffered write IO hangs in balance_dirty_pages,
> after one nvme block device is unplugged physically, then umount can't
> succeed.

The bug here is that the device unplug code has not told the
filesystem that it's gone away permanently.

This is the same problem we've been having for the past 15 years -
when a block device goes away permanently it leaves the filesystem
and everything else dependent on the block device completely unaware
that they are unable to function anymore. IOWs, the block device
remove path is relying on -unreliable side effects- of filesystem IO
error handling to produce what we'd call "correct behaviour".

The block device needs to be shutting down the filesystem when it
has some sort of fatal, unrecoverable error like this (e.g. hot
unplug). We have the XFS_IOC_GOINGDOWN ioctl for telling the
filesystem it can't function anymore. This ioctl
(_IOR('X',125,__u32)) has also been replicated into ext4, f2fs and
CIFS and it gets exercised heavily by fstests. Hence this isn't XFS
specific functionality, nor is it untested functionality.

The ioctl should be lifted to the VFS as FS_IOC_SHUTDOWN and a
super_operations method added to trigger a filesystem shutdown.
That way, if there's a superblock associated with the block device,
the block device removal code could simply call
sb->s_op->shutdown(sb, REASON) if it exists rather than
sync_filesystem(sb).

This way we won't have to spend another two decades listening to
people complain about how applications and filesystems hang when
they pull the storage device out from under them and the filesystem
didn't do something that made it notice before the system hung....

> So far only observed on ext4 FS, not see it on XFS.

Pure dumb luck - a journal IO failed on XFS (probably during the
sync_filesystem() call) and that shut the filesystem down.

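The shutdown that XFS fell into by accident here can also be
requested explicitly from userspace through the goingdown ioctl
described above. A minimal sketch, assuming a Linux host: the names
GOINGDOWN_IOCTL, GOING_FLAGS_* and fs_goingdown() are made up for
illustration, only the _IOR('X',125,__u32) encoding comes from this
mail, and the flag values are what I'd expect to find in xfs_fs.h,
so check them against your own headers.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Same encoding as quoted above: _IOR('X', 125, __u32).
 * uint32_t stands in for __u32 here (both are 32 bits wide).
 * GOINGDOWN_IOCTL is an illustrative name, not a kernel header macro.
 */
#define GOINGDOWN_IOCTL _IOR('X', 125, uint32_t)

/* Assumed flag values (mirroring XFS_FSOP_GOING_FLAGS_*); verify locally. */
#define GOING_FLAGS_DEFAULT    0x0 /* flush data and log, then shut down */
#define GOING_FLAGS_LOGFLUSH   0x1 /* flush the log only, then shut down */
#define GOING_FLAGS_NOLOGFLUSH 0x2 /* shut down now, flush nothing */

/*
 * Ask the filesystem mounted at 'mountpoint' to shut itself down.
 * Returns 0 on success or a negative errno value on failure.
 */
int fs_goingdown(const char *mountpoint, uint32_t flags)
{
	int fd, err = 0;

	fd = open(mountpoint, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* The ioctl takes a pointer to the 32-bit flags word. */
	if (ioctl(fd, GOINGDOWN_IOCTL, &flags) < 0)
		err = -errno;

	close(fd);
	return err;
}
```

Something like fs_goingdown("/mnt/scratch", GOING_FLAGS_NOLOGFLUSH)
(the mount point is a placeholder) needs CAP_SYS_ADMIN, and the
filesystem then stays shut down until it is unmounted and mounted
again, so only try it on a scratch filesystem.
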
> I guess it isn't
> related with disk type, and not tried such test on other type of disks yet,
> but will do.

It can happen on any block device based storage that gets pulled
from under any filesystem without warning.

> Seems like dirty pages aren't cleaned after ext4 bio is failed in this
> situation?

Yes, because the filesystem wasn't shut down on device removal to
tell it that it's allowed to toss away dirty pages as they cannot be
cleaned via the IO path....

-Dave.
-- 
Dave Chinner
dchinner@xxxxxxxxxx