On Tue, May 29 2018 at 3:51pm -0400,
Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> wrote:

> Currently the code in dm_dax_direct_access() only checks whether the target
> type has a direct_access() operation defined, not whether the underlying
> block devices all support DAX. This latter property can be seen by looking
> at whether we set the QUEUE_FLAG_DAX request queue flag when creating the
> DM device.

Wait... I thought DAX support was all or nothing?

> This is problematic if we have, for example, a dm-linear device made up of
> a PMEM namespace in fsdax mode followed by a ramdisk from BRD.
> QUEUE_FLAG_DAX won't be set on the dm-linear device's request queue, but
> we have a working direct_access() entry point and the first member of the
> dm-linear set *does* support DAX.

If you don't have a uniformly capable device then it is very dangerous
to advertise that the entire device has a certain capability. That
completely bit me in the past with discard (because for every IO I
wasn't then checking if the destination device supported discards).

It is all well and good that you're adding that check here. But what I
don't like is how you're saying QUEUE_FLAG_DAX implies direct_access()
operation exists.. yet for raw PMEM namespaces we just discussed how
that is a lie.

SO this type of change showcases how the QUEUE_FLAG_DAX doesn't _really_
imply direct_access() exists.

> This allows the user to create a filesystem on the dm-linear device, and
> then mount it with DAX. The filesystem's bdev_dax_supported() test will
> pass because it'll operate on the first member of the dm-linear device,
> which happens to be a fsdax PMEM namespace.
>
> All DAX I/O will then fail to that dm-linear device because the lack of
> QUEUE_FLAG_DAX prevents fs_dax_get_by_bdev() from working. This means that
> the struct dax_device isn't ever set in the filesystem, so
> dax_direct_access() will always return -EOPNOTSUPP.

Now you've lost me... these past 2 paragraphs.
Why can a user mount it in DAX mode? Because bdev_dax_supported() only
accesses the first portion (which happens to have DAX capabilities?)

Isn't this exactly why you should be checking for QUEUE_FLAG_DAX in the
caller (bdev_dax_supported)? Why not use bdev_get_queue() and verify
QUEUE_FLAG_DAX is set in there?

> By failing out of dm_dax_direct_access() if QUEUE_FLAG_DAX isn't set we let
> the filesystem know we don't support DAX at mount time. The filesystem
> will then silently fall back and remove the dax mount option, causing it to
> work properly.

This shouldn't be needed. Again, QUEUE_FLAG_DAX wasn't set.. so don't
allow code to falsely try operations that should've been gated by the
fact it wasn't set.

SO Nack on this patch.. until/unless I'm corrected ;)

Thanks,
Mike

> Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
> ---
>  drivers/md/dm.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 0a7b0107ca78..9728433362d1 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1050,14 +1050,13 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
>
>  	if (!ti)
>  		goto out;
> -	if (!ti->type->direct_access)
> +	if (!blk_queue_dax(md->queue))
>  		goto out;
>  	len = max_io_len(sector, ti) / PAGE_SECTORS;
>  	if (len < 1)
>  		goto out;
>  	nr_pages = min(len, nr_pages);
> -	if (ti->type->direct_access)
> -		ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
> +	ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
>
>  out:
>  	dm_put_live_table(md, srcu_idx);
> --
> 2.14.3
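
For illustration only, here is a minimal userspace sketch of the point
argued above: a capability flag on a stacked device should be all or
nothing, and the mount-time test should be gated on that flag rather
than on whatever the first member happens to support. Every name below
(model_member, model_stacked_dev, and the model_* functions) is a
hypothetical stand-in, not kernel code. In the kernel the corresponding
pieces would be bdev_get_queue() and blk_queue_dax(), and how exactly
they would slot into bdev_dax_supported() is an assumption here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for illustration; not kernel types. */
struct model_member {
	const char *name;
	bool dax_capable;	/* fsdax PMEM namespace: true, BRD ramdisk: false */
};

struct model_stacked_dev {
	const struct model_member *members;
	size_t nr_members;
	bool queue_flag_dax;	/* models QUEUE_FLAG_DAX on the stacked queue */
};

/* All or nothing: the flag is set only if every member is DAX capable. */
static void model_set_queue_flag(struct model_stacked_dev *dev)
{
	size_t i;

	assert(dev != NULL);
	dev->queue_flag_dax = dev->nr_members > 0;
	for (i = 0; i < dev->nr_members; i++) {
		if (!dev->members[i].dax_capable)
			dev->queue_flag_dax = false;
	}
}

/* Models a mount-time test that only probes the start of the device. */
static bool model_supported_first_member(const struct model_stacked_dev *dev)
{
	return dev->nr_members > 0 && dev->members[0].dax_capable;
}

/* Models the same test gated on the queue flag before anything else. */
static bool model_supported_gated(const struct model_stacked_dev *dev)
{
	return dev->queue_flag_dax && model_supported_first_member(dev);
}
```

With a PMEM member followed by a ramdisk member, the first-member test
in this model passes while the gated test fails, which is the mismatch
described in the quoted changelog.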