Re: [RFC 1/1] fix NULL mnt [was Re: apparmor NULL pointer dereference on resume [efivarfs]]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 2025-03-15 at 11:04 +0100, Christian Brauner wrote:
[...]
> Since efivars uses a single global superblock and we know that sfi-
> >sb is still alive (After all we've just pinned it above.)
> vfs_kern_mount() will reuse the same superblock.
> 
> There's two cases to consider:
> 
> (1) vfs_kern_mount() was successful. In this case path->mnt will hold
> an active superblock reference that will be released asynchronously
> via __fput(). That is safe and correct.
> 
> (2) vfs_kern_mount() fails. That's an issue because you need to call
> deactivate_super() which will have a similar deadlock problem.
> If efivarfs_pm_notify() now holds the last reference to the
> superblock then deactivate_super() super will put that last
> reference and call efivarfs_kill_super() which in turn will wait for
> efivarfs_pm_notify() to finish. => deadlock
> 
> So in the error case you need to offload the call to
> deactivate_super() to a workqueue.

OK, got that (although it did make my head explode a bit ... this is
certainly subtle stuff).  To do the delayed work for the
deactivate_super(), I hijacked the superblock destroy_work structure
which I think is safe because by the time the work structure is
executed, we own it and so it can be reused for the actual destroy_work
in deactivate_super().

However, there's another problem: the mntput after kernel_file_open may
synchronously call cleanup_mnt() (and thus deactivate_super()) if the
open fails because it's marked MNT_INTERNAL, which is caused by
SB_KERNMOUNT.  I fixed this just by not passing the SB_KERNMOUNT flag,
which feels a bit hacky.

I've put together everything at the bottom, however, I can't test the
error legs of this because trying to trigger and unmount and hybernate
at exactly the right point is pretty much impossible.  The rest seems
to work as advertised, although I would like a tested-by from the
apparmour people because I do run apparmour in my debian test rig but
don't see the problem.

Regards,

James

---

diff --git a/fs/efivarfs/super.c b/fs/efivarfs/super.c
index 6eae8cf655c1..2d826e98066b 100644
--- a/fs/efivarfs/super.c
+++ b/fs/efivarfs/super.c
@@ -474,12 +474,25 @@ static int efivarfs_check_missing(efi_char16_t *name16, efi_guid_t vendor,
 	return err;
 }
 
+static void efivarfs_deactivate_super_work(struct work_struct *work)
+{
+	struct super_block *s = container_of(work, struct super_block,
+					     destroy_work);
+	/*
+	 * note: here s->destroy_work is free for reuse (which
+	 * will happen in deactivate_super)
+	 */
+	deactivate_super(s);
+}
+
+static struct file_system_type efivarfs_type;
+
 static int efivarfs_pm_notify(struct notifier_block *nb, unsigned long action,
 			      void *ptr)
 {
 	struct efivarfs_fs_info *sfi = container_of(nb, struct efivarfs_fs_info,
 						    pm_nb);
-	struct path path = { .mnt = NULL, .dentry = sfi->sb->s_root, };
+	struct path path;
 	struct efivarfs_ctx ectx = {
 		.ctx = {
 			.actor	= efivarfs_actor,
@@ -487,6 +500,7 @@ static int efivarfs_pm_notify(struct notifier_block *nb, unsigned long action,
 		.sb = sfi->sb,
 	};
 	struct file *file;
+	struct super_block *s = sfi->sb;
 	static bool rescan_done = true;
 
 	if (action == PM_HIBERNATION_PREPARE) {
@@ -499,11 +513,39 @@ static int efivarfs_pm_notify(struct notifier_block *nb, unsigned long action,
 	if (rescan_done)
 		return NOTIFY_DONE;
 
+	/* ensure single superblock is alive and pin it */
+	if (!atomic_inc_not_zero(&s->s_active))
+		return NOTIFY_DONE;
+
 	pr_info("efivarfs: resyncing variable state\n");
 
-	/* O_NOATIME is required to prevent oops on NULL mnt */
+	path.dentry = sfi->sb->s_root;
+
+	/* do not add SB_KERNMOUNT which causes MNT_INTERNAL, see below */
+	path.mnt = vfs_kern_mount(&efivarfs_type, 0,
+				  efivarfs_type.name, NULL);
+	if (IS_ERR(path.mnt)) {
+		pr_err("efivarfs: internal mount failed\n");
+		/*
+		 * We may be the last pinner of the superblock but
+		 * calling efivarfs_kill_sb from within the notifier
+		 * here would deadlock trying to unregister it
+		 */
+		INIT_WORK(&s->destroy_work, efivarfs_deactivate_super_work);
+		schedule_work(&s->destroy_work);
+		return PTR_ERR(path.mnt);
+	}
+
+	/* path.mnt now has pin on superblock, so this must be above one */
+	atomic_dec(&s->s_active);
+
 	file = kernel_file_open(&path, O_RDONLY | O_DIRECTORY | O_NOATIME,
 				current_cred());
+	/*
+	 * safe even if last put because no MNT_INTERNAL means this
+	 * will do delayed deactivate_super and not deadlock
+	 */
+	mntput(path.mnt);
 	if (IS_ERR(file))
 		return NOTIFY_DONE;
 







[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux