On Fri, Jan 29, 2016 at 10:01:13PM -0800, Dan Williams wrote: > On Fri, Jan 29, 2016 at 9:28 PM, Matthew Wilcox <willy@xxxxxxxxxxxxxxx> wrote: > > If we store the PFN of the underlying page instead, we don't have this > > problem. Instead, we have a different problem; of the device going > > away under us. I'm trying to find the code which tears down PTEs when > > the device goes away, and I'm not seeing it. What do we do about user > > mappings of the device? > > I deferred the dax tear down code until next cycle as Al rightly > pointed out some needed re-works: > > https://lists.01.org/pipermail/linux-nvdimm/2016-January/003995.html Thanks; I eventually found it in my email somewhere over the Pacific. I did probably 70% of the work needed to switch the radix tree over to storing PFNs instead of sectors. It seems viable, though it's a big change from where we are today: fs/dax.c | 415 +++++++++++++++++++++++---------------------- include/linux/dax.h | 3 +- include/linux/pfn_t.h | 33 +++- include/linux/radix-tree.h | 9 - 4 files changed, 236 insertions(+), 224 deletions(-) I'll try and get that finished off this week. One concrete and easily-separable piece is that dax_clear_blocks() has the wrong signature. It currently takes an inode & block as parameters; it has no way of finding out the correct block device. It's only two callers are filesystems (ext2 and xfs). Those filesystems should be passing the block_device instead of the inode. But without the inode, we can't convert a block number to a sector number, so we also need to pass the sector number, not the block number. It still has type sector_t, annoyingly. @@ -63,12 +238,11 @@ static void dax_unmap_atomic(struct block_device *bdev, * and hence this means the stack from this point must follow GFP_NOFS * semantics for all operations. */ -int dax_clear_blocks(struct inode *inode, sector_t block, long _size) +int dax_clear_blocks(struct block_device *bdev, sector_t sector, long size) { - struct block_device *bdev = inode->i_sb->s_bdev; struct blk_dax_ctl dax = { - .sector = block << (inode->i_blkbits - 9), - .size = _size, + .sector = sector, + .size = size, }; might_sleep(); but I haven't looked at doing the conversion of xfs or ext2 to use that new interface. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html