Re: [RFC PATCH v3 2/4] dax: Check for data cache aliasing at runtime

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-02-02 14:41, Dan Williams wrote:
Mathieu Desnoyers wrote:
On 2024-02-02 12:37, Dan Williams wrote:
Mathieu Desnoyers wrote:
[...]


The alternative route I intend to take is to audit all callers
of alloc_dax() and make sure they all save the alloc_dax() return
value in a struct dax_device * local variable first for the sake
of checking for IS_ERR(). This will leave the xyz->dax_dev pointer
initialized to NULL in the error case and simplify the rest of
error checking.

I could maybe get on board with that, but it needs a comment somewhere
about the asymmetric subtlety.

Is this "somewhere" at every alloc_dax() call site, or do you have
something else in mind ?

At least kill_dax() should mention the asymmetry I think.

Here is what I intend to add:

 * Note, because alloc_dax() returns an ERR_PTR() on error, callers
 * typically store its result into a local variable in order to check
 * the result. Therefore, care must be taken to populate the struct
 * device dax_dev field make sure the dax_dev is not leaked.


The real question is what to do about device-dax. I *think* it is not
affected by cpu_dcache aliasing because it never accesses user mappings
through a kernel alias. I doubt device-dax is in use on these platforms,
but we might need another fixup for that if someone screams about the
alloc_dax() behavior change making them lose device-dax access.

By "device-dax", I understand you mean drivers/dax/Kconfig:DEV_DAX.

Based on your analysis, is alloc_dax() still the right spot where
to place this runtime check ? Which call sites are responsible
for invoking alloc_dax() for device-dax ?

That is in devm_create_dev_dax().

If we know which call sites do not intend to use the kernel linear
mapping, we could introduce a flag (or a new variant of the alloc_dax()
API) that would either enforce or skip the check.

Hmmm, it looks like there is already a natural flag for that. If
alloc_dax() is passed a NULL operations pointer it means there are no
kernel usages of the aliased mapping. That actually fits rather nicely.

Good, I was reaching the same conclusion when I received your reply.
I'll do that. It ends up being:

        /*
         * Unavailable on architectures with virtually aliased data caches,
         * except for device-dax (NULL operations pointer), which does
         * not use aliased mappings from the kernel.
         */
        if (ops && cpu_dcache_is_aliasing())
                return ERR_PTR(-EOPNOTSUPP);


[..]
@@ -804,6 +808,15 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
    	if (!IS_ENABLED(CONFIG_FUSE_DAX))
    		return 0;
+ dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
+	if (IS_ERR(dax_dev)) {
+		int rc = PTR_ERR(dax_dev);
+
+		if (rc == -EOPNOTSUPP)
+			return 0;
+		return rc;
+	}

What is gained by moving this allocation here ?

The gain is to fail early in virtio_fs_setup_dax() since the fundamental
dependency of alloc_dax() success is not met. For example why let the
setup progress to devm_memremap_pages() when alloc_dax() is going to
return ERR_PTR(-EOPNOTSUPP).

What I don't know is whether there is a dependency requiring to do
devm_request_mem_region(), devm_kzalloc(), devm_memremap_pages()
before calling alloc_dax() ?

Those 3 calls are used to populate:

          fs->window_phys_addr = (phys_addr_t) cache_reg.addr;
          fs->window_len = (phys_addr_t) cache_reg.len;

and then alloc_dax() takes "fs" as private data parameter. So it's
unclear to me whether we can swap the invocation order. I suspect
that it is not an issue because it is only used to populate
dax_dev->private, but I prefer to confirm this with you just to be
on the safe side.

Thanks for that. All of those need to be done before the fs goes live
later in virtio_device_ready(), but before that point nothing should be
calling into virtio_fs_dax_ops, so as far as I can see it is safe to
change the order.

Sounds good, I'll do that.

I will soon be ready to send out a RFC v4, which is still only
compiled-tested. Do you happen to have some kind of test suite
you can use to automate some of the runtime testing ?

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux