On Tue, Mar 29, 2022 at 11:01 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
>
> > @@ -1892,6 +1893,8 @@ xfs_free_buftarg(
> >       list_lru_destroy(&btp->bt_lru);
> >
> >       blkdev_issue_flush(btp->bt_bdev);
> > +     if (btp->bt_daxdev)
> > +             dax_unregister_holder(btp->bt_daxdev, btp->bt_mount);
> >       fs_put_dax(btp->bt_daxdev);
> >
> >       kmem_free(btp);
> > @@ -1939,6 +1942,7 @@ xfs_alloc_buftarg(
> >       struct block_device     *bdev)
> >  {
> >       xfs_buftarg_t           *btp;
> > +     int                     error;
> >
> >       btp = kmem_zalloc(sizeof(*btp), KM_NOFS);
> >
> > @@ -1946,6 +1950,14 @@ xfs_alloc_buftarg(
> >       btp->bt_dev = bdev->bd_dev;
> >       btp->bt_bdev = bdev;
> >       btp->bt_daxdev = fs_dax_get_by_bdev(bdev, &btp->bt_dax_part_off);
> > +     if (btp->bt_daxdev) {
> > +             error = dax_register_holder(btp->bt_daxdev, mp,
> > +                             &xfs_dax_holder_operations);
> > +             if (error) {
> > +                     xfs_err(mp, "DAX device already in use?!");
> > +                     goto error_free;
> > +             }
> > +     }
>
> It seems to me that just passing the holder and holder ops to
> fs_dax_get_by_bdev and the holder to dax_unregister_holder would
> significantly simplify the interface here.
>
> Dan, what do you think?

Yes, makes sense, just like the optional holder arguments to
blkdev_get_by_*().

> > +#if IS_ENABLED(CONFIG_MEMORY_FAILURE) && IS_ENABLED(CONFIG_FS_DAX)
>
> No real need for the IS_ENABLED.  Also any reason to even build this
> file if the options are not set?  It seems like
> xfs_dax_holder_operations should just be defined to NULL and the
> whole file not supported if we can't support the functionality.
>
> Dan: not for this series, but is there any reason not to require
> MEMORY_FAILURE for DAX to start with?

Given that DAX ties some storage semantics to memory and storage
supports EIO I can see an argument to require memory_failure() for
DAX, and especially for DAX on CXL where hotplug is supported it will
be necessary. Linux currently has no facility to consult PCI drivers
about removal actions, so the only recourse for a force removed CXL
device is mass memory_failure().

> > +
> > +     ddev_start = mp->m_ddev_targp->bt_dax_part_off;
> > +     ddev_end = ddev_start +
> > +             (mp->m_ddev_targp->bt_bdev->bd_nr_sectors << SECTOR_SHIFT) - 1;
>
> This should use bdev_nr_bytes.
>
> But didn't we say we don't want to support notifications on partitioned
> devices and thus don't actually need all this?

Right.
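
For illustration only, here is a minimal user-space sketch of the byte-range arithmetic behind the bdev_nr_bytes remark above. It is not kernel code: `struct fake_bdev`, `fake_bdev_nr_bytes()` and `ddev_end()` are hypothetical stand-ins, and the sketch assumes only that the helper returns the sector count shifted by SECTOR_SHIFT (512-byte sectors).

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

/* Hypothetical user-space stand-in for struct block_device. */
struct fake_bdev {
	uint64_t nr_sectors;	/* device size in 512-byte sectors */
};

/* Stand-in for a bdev_nr_bytes()-style helper: device size in bytes. */
static uint64_t fake_bdev_nr_bytes(const struct fake_bdev *bdev)
{
	return bdev->nr_sectors << SECTOR_SHIFT;
}

/*
 * Inclusive offset of the last byte of the data device, given the
 * offset at which it starts inside the DAX device (part_off).
 */
static uint64_t ddev_end(uint64_t part_off, const struct fake_bdev *bdev)
{
	return part_off + fake_bdev_nr_bytes(bdev) - 1;
}
```

Going through a size-in-bytes helper keeps the open-coded shift out of the caller. If notifications on partitioned devices are not supported, as agreed above, the start offset is always zero and this calculation is not needed at all.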