Re: [PATCH v2 4/7] dm: prevent DAX mounts if not supported

On Tue, May 29 2018 at  3:51pm -0400,
Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx> wrote:

> Currently the code in dm_dax_direct_access() only checks whether the target
> type has a direct_access() operation defined, not whether the underlying
> block devices all support DAX.  This latter property can be seen by looking
> at whether we set the QUEUE_FLAG_DAX request queue flag when creating the
> DM device.

Wait... I thought DAX support was all or nothing?

> This is problematic if we have, for example, a dm-linear device made up of
> a PMEM namespace in fsdax mode followed by a ramdisk from BRD.
> QUEUE_FLAG_DAX won't be set on the dm-linear device's request queue, but
> we have a working direct_access() entry point and the first member of the
> dm-linear set *does* support DAX.

If you don't have a uniformly capable device then it is very dangerous
to advertise that the entire device has a certain capability.  That
completely bit me in the past with discard (because for every IO I
wasn't then checking if the destination device supported discards).
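
To illustrate the all-or-nothing rule, here is a minimal user-space sketch
in C.  It is not kernel code: struct model_member and
model_all_members_support_dax() are hypothetical names made up for this
example.  It only shows that a stacked device should advertise a
capability when every underlying member has it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of one underlying device; not a kernel type. */
struct model_member {
	const char *name;
	bool supports_dax;
};

/*
 * Return true only if every member supports DAX.  An empty set of
 * members advertises nothing.
 */
static bool model_all_members_support_dax(const struct model_member *members,
					  size_t n)
{
	size_t i;

	assert(members != NULL || n == 0);
	if (n == 0)
		return false;
	for (i = 0; i < n; i++) {
		/* A single incapable member disqualifies the whole device. */
		if (!members[i].supports_dax)
			return false;
	}
	return true;
}
```

With a fsdax PMEM namespace followed by a BRD ramdisk, this model returns
false, which matches QUEUE_FLAG_DAX not being set on the dm-linear queue.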

It is all well and good that you're adding that check here.  But what I
don't like is how you're saying QUEUE_FLAG_DAX implies direct_access()
operation exists.. yet for raw PMEM namespaces we just discussed how
that is a lie.

So this type of change showcases how QUEUE_FLAG_DAX doesn't _really_
imply direct_access() exists.

> This allows the user to create a filesystem on the dm-linear device, and
> then mount it with DAX.  The filesystem's bdev_dax_supported() test will
> pass because it'll operate on the first member of the dm-linear device,
> which happens to be a fsdax PMEM namespace.
> 
> All DAX I/O will then fail to that dm-linear device because the lack of
> QUEUE_FLAG_DAX prevents fs_dax_get_by_bdev() from working.  This means that
> the struct dax_device isn't ever set in the filesystem, so
> dax_direct_access() will always return -EOPNOTSUPP.

Now you've lost me with these past 2 paragraphs.  Why can a user mount it
in DAX mode?  Because bdev_dax_supported() only accesses the first
portion (which happens to have DAX capabilities)?

Isn't this exactly why you should be checking for QUEUE_FLAG_DAX in the
caller (bdev_dax_supported)?  Why not use bdev_get_queue() and verify
QUEUE_FLAG_DAX is set in there?
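
A rough user-space sketch in C of gating in the caller.  Everything here
(struct model_queue, struct model_bdev, MODEL_QUEUE_FLAG_DAX,
model_bdev_dax_supported()) is a hypothetical stand-in, not the real
kernel API.  The point is only the ordering: check the queue flag first,
and probe direct access only if the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the request queue DAX flag bit. */
#define MODEL_QUEUE_FLAG_DAX (1UL << 0)

struct model_queue {
	unsigned long flags;
};

struct model_bdev {
	struct model_queue *queue;
	/* Returns the number of pages mapped, or a negative error. */
	long (*direct_access)(const struct model_bdev *bdev);
};

/* Stand-in for a direct access probe that succeeds on one page. */
static long model_direct_access_ok(const struct model_bdev *bdev)
{
	(void)bdev;
	return 1;
}

/* Model of a mount-time support check that is gated on the queue flag. */
static bool model_bdev_dax_supported(const struct model_bdev *bdev)
{
	assert(bdev != NULL);

	/* No flag on the queue: refuse before touching direct access. */
	if (!bdev->queue || !(bdev->queue->flags & MODEL_QUEUE_FLAG_DAX))
		return false;

	/* The flag alone is not proof that the operation exists. */
	if (!bdev->direct_access)
		return false;

	return bdev->direct_access(bdev) > 0;
}
```

In this model a device with a working direct access entry point but no
queue flag is rejected at the first test, so the lower layer never needs
its own flag check.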

> By failing out of dm_dax_direct_access() if QUEUE_FLAG_DAX isn't set we let
> the filesystem know we don't support DAX at mount time.  The filesystem
> will then silently fall back and remove the dax mount option, causing it to
> work properly.

This shouldn't be needed.  Again, QUEUE_FLAG_DAX wasn't set, so don't
allow code to falsely try operations that should've been gated by the
fact that it wasn't set.

So Nack on this patch, until/unless I'm corrected ;)

Thanks,
Mike


> Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Fixes: commit 545ed20e6df6 ("dm: add infrastructure for DAX support")
> ---
>  drivers/md/dm.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 0a7b0107ca78..9728433362d1 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1050,14 +1050,13 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
>  
>  	if (!ti)
>  		goto out;
> -	if (!ti->type->direct_access)
> +	if (!blk_queue_dax(md->queue))
>  		goto out;
>  	len = max_io_len(sector, ti) / PAGE_SECTORS;
>  	if (len < 1)
>  		goto out;
>  	nr_pages = min(len, nr_pages);
> -	if (ti->type->direct_access)
> -		ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
> +	ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
>  
>   out:
>  	dm_put_live_table(md, srcu_idx);
> -- 
> 2.14.3
> 