Re: [PATCH 2/2] efivarfs: add variable resync after hibernation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 28/02/2025 16:49, James Bottomley wrote:
On Fri, 2025-02-28 at 16:44 +0000, Jon Hunter wrote:
Hi James,

On 07/01/2025 21:31, James Bottomley wrote:
Hibernation allows other OSs to boot and thus the variable state
might
be altered by the time the hibernation image is resumed.  Resync
the
variable state by looping over all the dentries and update the size
(in case of alteration) delete any which no-longer exist.  Finally,
loop over all efi variables creating any which don't have
corresponding dentries.

Signed-off-by: James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
---
   fs/efivarfs/internal.h |   3 +-
   fs/efivarfs/super.c    | 151
++++++++++++++++++++++++++++++++++++++++-
   fs/efivarfs/vars.c     |   5 +-
   3 files changed, 155 insertions(+), 4 deletions(-)

...
+static int efivarfs_pm_notify(struct notifier_block *nb, unsigned
long action,
+			      void *ptr)
+{
+	struct efivarfs_fs_info *sfi = container_of(nb, struct
efivarfs_fs_info,
+						    pm_nb);
+	struct path path = { .mnt = NULL, .dentry = sfi->sb-
s_root, };
+	struct efivarfs_ctx ectx = {
+		.ctx = {
+			.actor	= efivarfs_actor,
+		},
+		.sb = sfi->sb,
+	};
+	struct file *file;
+	static bool rescan_done = true;
+
+	if (action == PM_HIBERNATION_PREPARE) {
+		rescan_done = false;
+		return NOTIFY_OK;
+	} else if (action != PM_POST_HIBERNATION) {
+		return NOTIFY_DONE;
+	}
+
+	if (rescan_done)
+		return NOTIFY_DONE;
+
+	pr_info("efivarfs: resyncing variable state\n");
+
+	/* O_NOATIME is required to prevent oops on NULL mnt */
+	file = kernel_file_open(&path, O_RDONLY | O_DIRECTORY |
O_NOATIME,
+				current_cred());
+	if (!file)
+		return NOTIFY_DONE;
+
+	rescan_done = true;
+
+	/*
+	 * First loop over the directory and verify each entry
exists,
+	 * removing it if it doesn't
+	 */
+	file->f_pos = 2;	/* skip . and .. */
+	do {
+		ectx.dentry = NULL;
+		iterate_dir(file, &ectx.ctx);
+		if (ectx.dentry) {
+			pr_info("efivarfs: removing variable
%pd\n",
+				ectx.dentry);
+			simple_recursive_removal(ectx.dentry,
NULL);
+			dput(ectx.dentry);
+		}
+	} while (ectx.dentry);
+	fput(file);
+
+	/*
+	 * then loop over variables, creating them if there's no
matching
+	 * dentry
+	 */
+	efivar_init(efivarfs_check_missing, sfi->sb, false);
+
+	return NOTIFY_OK;
+}


With the current mainline I have observed the following crash when
testing suspend on one of our Tegra devices ...

rtcwake: wakeup from "mem" using /dev/rtc0 at Fri Feb 28 16:25:55
2025
[  246.593485] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000068
[  246.602601] Mem abort info:
[  246.602603]   ESR = 0x0000000096000004
[  246.602605]   EC = 0x25: DABT (current EL), IL = 32 bits
[  246.602608]   SET = 0, FnV = 0
[  246.602610]   EA = 0, S1PTW = 0
[  246.602612]   FSC = 0x04: level 0 translation fault
[  246.602615] Data abort info:
[  246.602617]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  246.634959]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  246.634961]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  246.634964] user pgtable: 4k pages, 48-bit VAs,
pgdp=0000000105205000
[  246.634967] [0000000000000068] pgd=0000000000000000,
p4d=0000000000000000
[  246.634974] Internal error: Oops: 0000000096000004 [#1] PREEMPT
SMP
[  246.665796] Modules linked in: qrtr bridge stp llc usb_f_ncm
usb_f_mass_storage usb_f_acm u_serial usb_f_rndis u_ether
libcomposite tegra_drm btusb btrtl drm_dp_aux_bus btintel cec nvme
btmtk nvme_core btbcm drm_display_helper bluetoot
h snd_soc_tegra210_admaif drm_client_lib snd_soc_tegra210_dmic
snd_soc_tegra186_asrc snd_soc_tegra210_mixer snd_soc_tegra_pcm
snd_soc_tegra210_mvc snd_soc_tegra210_ope snd_soc_tegra210_adx
snd_soc_tegra210_amx snd_soc_tegra210_sfc snd_soc
_tegra210_i2s drm_kms_helper ecdh_generic tegra_se ucsi_ccg ecc
snd_hda_codec_hdmi typec_ucsi snd_soc_tegra_audio_graph_card
snd_soc_tegra210_ahub rfkill tegra210_adma snd_hda_tegra
crypto_engine snd_soc_audio_graph_card typec snd_hda_cod
ec snd_soc_simple_card_utils tegra_aconnect arm_dsu_pmu
snd_soc_rt5640 snd_hda_core tegra_xudc ramoops phy_tegra194_p2u
snd_soc_rl6231 at24 pcie_tegra194 tegra_bpmp_thermal host1x
reed_solomon lm90 pwm_tegra ina3221 pwm_fan fuse drm backl
ight dm_mod ip_tables x_tables ipv6
[  246.756182] CPU: 9 UID: 0 PID: 1255 Comm: rtcwake Not tainted
6.14.0-rc4-g8d538b296d56 #61
[  246.764677] Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer
Kit/Jetson, BIOS 00.0.0-dev-main_88214_5a0f5_a213e 02/26/2025
[  246.776569] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[  246.783718] pc : efivarfs_pm_notify+0x48/0x180
[  246.788285] lr : blocking_notifier_call_chain_robust+0x78/0xe4
[  246.794286] sp : ffff800085cebb60
[  246.797684] x29: ffff800085cebb60 x28: ffff000093f1b480 x27:
0000000000000000
[  246.805021] x26: 0000000000000004 x25: ffff8000828d3638 x24:
0000000000000003
[  246.812355] x23: 0000000000000000 x22: 0000000000000005 x21:
0000000000000006
[  246.819694] x20: ffff000087f698c0 x19: 0000000000000003 x18:
000000007f19bcda
[  246.827029] x17: 00000000c42545a5 x16: 000000000000001c x15:
000000001709e0a9
[  246.834372] x14: 00000000ac6b3a37 x13: 18286bf36c021b08 x12:
00000039694cf81c
[  246.841713] x11: 00000000f1e0faad x10: 0000000000000001 x9 :
000000004ff99d57
[  246.849046] x8 : 00000000bb51f9d6 x7 : 00000001f4d4185c x6 :
00000001f4d4185c
[  246.856382] x5 : ffff800085cebb58 x4 : ffff800080952930 x3 :
ffff8000804cba44
[  246.863713] x2 : 0000000000000000 x1 : 0000000000000003 x0 :
ffff8000804cbf84
[  246.871045] Call trace:
[  246.873550]  efivarfs_pm_notify+0x48/0x180 (P)
[  246.878119]  blocking_notifier_call_chain_robust+0x78/0xe4
[  246.883753]  pm_notifier_call_chain_robust+0x28/0x48
[  246.888852]  pm_suspend+0x138/0x1c8
[  246.892438]  state_store+0x8c/0xfc
[  246.895931]  kobj_attr_store+0x18/0x2c
[  246.899791]  sysfs_kf_write+0x44/0x54
[  246.903553]  kernfs_fop_write_iter+0x118/0x1a8
[  246.908113]  vfs_write+0x2b0/0x35c
[  246.911608]  ksys_write+0x68/0xfc
[  246.915013]  __arm64_sys_write+0x1c/0x28
[  246.919038]  invoke_syscall+0x48/0x110
[  246.922897]  el0_svc_common.constprop.0+0x40/0xe8
[  246.927731]  do_el0_svc+0x20/0x2c
[  246.931127]  el0_svc+0x30/0xd0
[  246.934265]  el0t_64_sync_handler+0x144/0x168
[  246.938737]  el0t_64_sync+0x198/0x19c
[  246.942505] Code: f9400682 f90027ff a906ffe2 f100043f (f9403440)
[  246.948767] ---[ end trace 0000000000000000 ]---


Bisect is pointing to this commit. I had a quick look at this and the
following fixes it for me ...

Yes, this occurs because the notifier can be triggered before the
superblock information is filled.  Ard fixed this:

https://web.git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git/commit/?h=urgent&id=cb6ae457bc6af58c84a7854df5e7e32ba1c6a715


Thanks for the quick response. I can confirm that this fixes it for me. Feel free to add my ...

Tested-by: Jon Hunter <jonathanh@xxxxxxxxxx>

Cheers
Jon

--
nvpublic





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux