Re: [RFC 1/1] fix NULL mnt [was Re: apparmor NULL pointer dereference on resume [efivarfs]]

James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> · Sun, 16 Mar 2025 10:26:12 -0400

On Sun, 2025-03-16 at 07:46 +0100, Christian Brauner wrote:
> On Sat, Mar 15, 2025 at 02:41:43PM -0400, James Bottomley wrote:
[...]
> > However, there's another problem: the mntput after kernel_file_open
> > may synchronously call cleanup_mnt() (and thus deactivate_super())
> > if the open fails because it's marked MNT_INTERNAL, which is caused
> > by SB_KERNMOUNT.  I fixed this just by not passing the SB_KERNMOUNT
> > flag, which feels a bit hacky.
> 
> It actually isn't. We know that vfs_kern_mount() will always
> resurface the single superblock that's exposed to userspace because
> we've just taken a reference to it earlier in efivarfs_pm_notify().
> So that SB_KERNMOUNT flag is ignored because no new superblock is
> allocated. It would only matter if we'd end up allocating a new
> superblock which we never do.

I agree with the above: fc->sb_flags never propagates to the existing
superblock.  However, nothing propagates the superblock flags back to
fc->sb_flags either.  The check in vfs_create_mount() is on fc-
>sb_flags.  Since the code is a bit hard to follow I added a printk on
the path.mnt flags and sure enough it comes back with MNT_INTERNAL when
SB_KERNMOUNT is set.

> And if we did it would be a bug because the superblock we allocate
> could be reused at any time if a userspace task mounts efivarfs
> before efivarfs_pm_notify() has destroyed it (or the respective
> workqueue). But that superblock would then have SB_KERNMOUNT for
> something that's not supposed to be one.

True, but the flags don't propagate to the superblock, so no bug.

> And whether or not that helper mount has MNT_INTERNAL is immaterial
> to what you're doing here afaict.

I think the problem is the call chain mntput() -> mntput_no_expire()
which directly calls cleanup_mnt() -> deactivate_super() if that flag
is set.  Though I don't see that kernel_file_open() could ever fail
except for some catastrophic reason like out of memory, so perhaps it
isn't worth quibbling about.

> So not passing the SB_KERNMOUNT flag is the right thing (see devtmpfs
> as well). You could slap a comment in here explaining that we never
> allocate a new superblock so it's clear to people not familiar with
> this particular code.

OK, so you agree that the code as written looks correct? Even if we
don't necessarily quite agree on why.

Regards,

James