Changes since v1: [1]

1/ Dropped the patches that were merged for v4.4-rc2

2/ Introduce a new super-block flag that filesystems can use to error out
   early when there is no longer a backing device available.  Use it to
   prevent a spurious warning triggered by ext4 on surprise removal. (Dave)

3/ Include the unmap_partition implementation initially posted here [2].

[1]: https://lists.01.org/pipermail/linux-nvdimm/2015-November/002876.html
[2]: https://lists.01.org/pipermail/linux-nvdimm/2015-November/002922.html

Testing this patch set reveals that xfs needs more XFS_FORCED_SHUTDOWN
checks, especially in the unmount path.  Currently we deadlock here on
umount after block device removal:

 INFO: task umount:2187 blocked for more than 120 seconds.
       Tainted: G           O    4.4.0-rc2+ #1953
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 umount          D ffff8800d2fbfd70     0  2187   2095 0x00000080
  ffff8800d2fbfd70 ffffffff81f94f98 ffff88031fc97bd8 ffff88030af5ad80
  ffff8800db71db00 ffff8800d2fc0000 ffff8800db8dbde0 ffff8800d93b6708
  ffff8800d93b6760 ffff8800d93b66d8 ffff8800d2fbfd88 ffffffff818f0695
 Call Trace:
  [<ffffffff818f0695>] schedule+0x35/0x80
  [<ffffffffa01e134e>] xfs_ail_push_all_sync+0xbe/0x110 [xfs]
  [<ffffffff810ecc30>] ? wait_woken+0x80/0x80
  [<ffffffffa01c8d91>] xfs_unmountfs+0x81/0x1b0 [xfs]
  [<ffffffffa01c991b>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
  [<ffffffffa01cbf30>] xfs_fs_put_super+0x30/0x90 [xfs]
  [<ffffffff81247eca>] generic_shutdown_super+0x6a/0xf0

Earlier in this trace xfs has already performed:

 XFS (pmem0m): xfs_do_force_shutdown(0x2) called from line 1197 of file fs/xfs/xfs_log.c.

...but xfs_log_work_queue() continues to run periodically.

---

The motivation for these lifetime fixes is to prevent crashes and
mapping leaks when using dax.  Most of the safety guarantees in this
series come from the protection afforded by blk_queue_enter +
blk_queue_exit.
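
As a rough illustration of the enter/exit bracket, here is a
single-threaded userspace model.  It is not kernel code: every toy_*
name is hypothetical, the counter stands in for whatever reference the
real queue takes, and the blocking and synchronization performed by the
real teardown path are not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct toy_queue {
	int  usage;	/* callers currently between enter and exit */
	bool dying;	/* set once teardown has started */
};

/* Model of the "enter" side: refuse new users once teardown has begun. */
int toy_queue_enter(struct toy_queue *q)
{
	if (q->dying)
		return -ENODEV;
	q->usage++;
	return 0;
}

/* Model of the "exit" side: drop the reference taken by toy_queue_enter(). */
void toy_queue_exit(struct toy_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/*
 * Model of the teardown side: mark the queue dying, then report whether
 * no user is left inside an enter/exit section.  The sketch only reports
 * the state; it does not wait for users to drain.
 */
bool toy_queue_cleanup(struct toy_queue *q)
{
	q->dying = true;
	return q->usage == 0;
}

/*
 * Hypothetical dax-style access: the media is only touched while the
 * queue reference is held, so it cannot go away underneath the caller.
 */
int toy_access(struct toy_queue *q, const char *media, char *out)
{
	int rc = toy_queue_enter(q);

	if (rc)
		return rc;
	*out = media[0];
	toy_queue_exit(q);
	return 0;
}
```

In this model, any toy_access() issued after toy_queue_cleanup() fails
with -ENODEV instead of touching the media, and cleanup can tell when an
access is still in flight.
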
After a successful blk_queue_enter we can issue any block device
operations we want without needing to worry about the block layer
infrastructure for the device being torn down.  blk_queue_enter is
chosen for this "is bdev alive?" check over SB_I_BDI_DEAD or error
returns from get_blocks() because it synchronizes with
blk_cleanup_queue.  SB_I_BDI_DEAD is there to let a file system error
out early before getting -ENODEV from the block layer, but it's optional
and asynchronous.

---

Dan Williams (7):
      pmem, dax: clean up clear_pmem()
      dax: increase granularity of dax_clear_blocks() operations
      dax: guarantee page aligned results from bdev_direct_access()
      dax: fix lifetime of in-kernel dax mappings with dax_map_atomic()
      fs: notify superblocks of backing-device death
      ext4: skip inode dirty when backing device is gone
      mm, dax: unmap dax mappings at bdev shutdown

 arch/x86/include/asm/pmem.h  |    7 -
 block/genhd.c                |   93 +++++++++++++++--
 drivers/block/brd.c          |    3 -
 drivers/nvdimm/pmem.c        |    3 -
 drivers/s390/block/dcssblk.c |    6 -
 fs/block_dev.c               |   73 ++++++++++++--
 fs/dax.c                     |  224 +++++++++++++++++++++++++-----------------
 fs/fs-writeback.c            |    3 +
 include/linux/blkdev.h       |   17 +++
 include/linux/fs.h           |    3 +
 include/linux/genhd.h        |    1
 11 files changed, 304 insertions(+), 129 deletions(-)