Re: [PATCH 2/2] efivarfs: add variable resync after hibernation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2025-02-28 at 16:44 +0000, Jon Hunter wrote:
> Hi James,
> 
> On 07/01/2025 21:31, James Bottomley wrote:
> > Hibernation allows other OSs to boot and thus the variable state
> > might
> > be altered by the time the hibernation image is resumed.  Resync
> > the
> > variable state by looping over all the dentries and update the size
> > (in case of alteration) delete any which no-longer exist.  Finally,
> > loop over all efi variables creating any which don't have
> > corresponding dentries.
> > 
> > Signed-off-by: James Bottomley
> > <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
> > ---
> >   fs/efivarfs/internal.h |   3 +-
> >   fs/efivarfs/super.c    | 151
> > ++++++++++++++++++++++++++++++++++++++++-
> >   fs/efivarfs/vars.c     |   5 +-
> >   3 files changed, 155 insertions(+), 4 deletions(-)
> 
> ...
>   
> > +static int efivarfs_pm_notify(struct notifier_block *nb, unsigned
> > long action,
> > +			      void *ptr)
> > +{
> > +	struct efivarfs_fs_info *sfi = container_of(nb, struct
> > efivarfs_fs_info,
> > +						    pm_nb);
> > +	struct path path = { .mnt = NULL, .dentry = sfi->sb-
> > >s_root, };
> > +	struct efivarfs_ctx ectx = {
> > +		.ctx = {
> > +			.actor	= efivarfs_actor,
> > +		},
> > +		.sb = sfi->sb,
> > +	};
> > +	struct file *file;
> > +	static bool rescan_done = true;
> > +
> > +	if (action == PM_HIBERNATION_PREPARE) {
> > +		rescan_done = false;
> > +		return NOTIFY_OK;
> > +	} else if (action != PM_POST_HIBERNATION) {
> > +		return NOTIFY_DONE;
> > +	}
> > +
> > +	if (rescan_done)
> > +		return NOTIFY_DONE;
> > +
> > +	pr_info("efivarfs: resyncing variable state\n");
> > +
> > +	/* O_NOATIME is required to prevent oops on NULL mnt */
> > +	file = kernel_file_open(&path, O_RDONLY | O_DIRECTORY |
> > O_NOATIME,
> > +				current_cred());
> > +	if (!file)
> > +		return NOTIFY_DONE;
> > +
> > +	rescan_done = true;
> > +
> > +	/*
> > +	 * First loop over the directory and verify each entry
> > exists,
> > +	 * removing it if it doesn't
> > +	 */
> > +	file->f_pos = 2;	/* skip . and .. */
> > +	do {
> > +		ectx.dentry = NULL;
> > +		iterate_dir(file, &ectx.ctx);
> > +		if (ectx.dentry) {
> > +			pr_info("efivarfs: removing variable
> > %pd\n",
> > +				ectx.dentry);
> > +			simple_recursive_removal(ectx.dentry,
> > NULL);
> > +			dput(ectx.dentry);
> > +		}
> > +	} while (ectx.dentry);
> > +	fput(file);
> > +
> > +	/*
> > +	 * then loop over variables, creating them if there's no
> > matching
> > +	 * dentry
> > +	 */
> > +	efivar_init(efivarfs_check_missing, sfi->sb, false);
> > +
> > +	return NOTIFY_OK;
> > +}
> 
> 
> With the current mainline I have observed the following crash when
> testing suspend on one of our Tegra devices ...
> 
> rtcwake: wakeup from "mem" using /dev/rtc0 at Fri Feb 28 16:25:55
> 2025
> [  246.593485] Unable to handle kernel NULL pointer dereference at
> virtual address 0000000000000068
> [  246.602601] Mem abort info:
> [  246.602603]   ESR = 0x0000000096000004
> [  246.602605]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  246.602608]   SET = 0, FnV = 0
> [  246.602610]   EA = 0, S1PTW = 0
> [  246.602612]   FSC = 0x04: level 0 translation fault
> [  246.602615] Data abort info:
> [  246.602617]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [  246.634959]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [  246.634961]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [  246.634964] user pgtable: 4k pages, 48-bit VAs,
> pgdp=0000000105205000
> [  246.634967] [0000000000000068] pgd=0000000000000000,
> p4d=0000000000000000
> [  246.634974] Internal error: Oops: 0000000096000004 [#1] PREEMPT
> SMP
> [  246.665796] Modules linked in: qrtr bridge stp llc usb_f_ncm
> usb_f_mass_storage usb_f_acm u_serial usb_f_rndis u_ether
> libcomposite tegra_drm btusb btrtl drm_dp_aux_bus btintel cec nvme
> btmtk nvme_core btbcm drm_display_helper bluetoot
> h snd_soc_tegra210_admaif drm_client_lib snd_soc_tegra210_dmic
> snd_soc_tegra186_asrc snd_soc_tegra210_mixer snd_soc_tegra_pcm
> snd_soc_tegra210_mvc snd_soc_tegra210_ope snd_soc_tegra210_adx
> snd_soc_tegra210_amx snd_soc_tegra210_sfc snd_soc
> _tegra210_i2s drm_kms_helper ecdh_generic tegra_se ucsi_ccg ecc
> snd_hda_codec_hdmi typec_ucsi snd_soc_tegra_audio_graph_card
> snd_soc_tegra210_ahub rfkill tegra210_adma snd_hda_tegra
> crypto_engine snd_soc_audio_graph_card typec snd_hda_cod
> ec snd_soc_simple_card_utils tegra_aconnect arm_dsu_pmu
> snd_soc_rt5640 snd_hda_core tegra_xudc ramoops phy_tegra194_p2u
> snd_soc_rl6231 at24 pcie_tegra194 tegra_bpmp_thermal host1x
> reed_solomon lm90 pwm_tegra ina3221 pwm_fan fuse drm backl
> ight dm_mod ip_tables x_tables ipv6
> [  246.756182] CPU: 9 UID: 0 PID: 1255 Comm: rtcwake Not tainted
> 6.14.0-rc4-g8d538b296d56 #61
> [  246.764677] Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer
> Kit/Jetson, BIOS 00.0.0-dev-main_88214_5a0f5_a213e 02/26/2025
> [  246.776569] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS
> BTYPE=--)
> [  246.783718] pc : efivarfs_pm_notify+0x48/0x180
> [  246.788285] lr : blocking_notifier_call_chain_robust+0x78/0xe4
> [  246.794286] sp : ffff800085cebb60
> [  246.797684] x29: ffff800085cebb60 x28: ffff000093f1b480 x27:
> 0000000000000000
> [  246.805021] x26: 0000000000000004 x25: ffff8000828d3638 x24:
> 0000000000000003
> [  246.812355] x23: 0000000000000000 x22: 0000000000000005 x21:
> 0000000000000006
> [  246.819694] x20: ffff000087f698c0 x19: 0000000000000003 x18:
> 000000007f19bcda
> [  246.827029] x17: 00000000c42545a5 x16: 000000000000001c x15:
> 000000001709e0a9
> [  246.834372] x14: 00000000ac6b3a37 x13: 18286bf36c021b08 x12:
> 00000039694cf81c
> [  246.841713] x11: 00000000f1e0faad x10: 0000000000000001 x9 :
> 000000004ff99d57
> [  246.849046] x8 : 00000000bb51f9d6 x7 : 00000001f4d4185c x6 :
> 00000001f4d4185c
> [  246.856382] x5 : ffff800085cebb58 x4 : ffff800080952930 x3 :
> ffff8000804cba44
> [  246.863713] x2 : 0000000000000000 x1 : 0000000000000003 x0 :
> ffff8000804cbf84
> [  246.871045] Call trace:
> [  246.873550]  efivarfs_pm_notify+0x48/0x180 (P)
> [  246.878119]  blocking_notifier_call_chain_robust+0x78/0xe4
> [  246.883753]  pm_notifier_call_chain_robust+0x28/0x48
> [  246.888852]  pm_suspend+0x138/0x1c8
> [  246.892438]  state_store+0x8c/0xfc
> [  246.895931]  kobj_attr_store+0x18/0x2c
> [  246.899791]  sysfs_kf_write+0x44/0x54
> [  246.903553]  kernfs_fop_write_iter+0x118/0x1a8
> [  246.908113]  vfs_write+0x2b0/0x35c
> [  246.911608]  ksys_write+0x68/0xfc
> [  246.915013]  __arm64_sys_write+0x1c/0x28
> [  246.919038]  invoke_syscall+0x48/0x110
> [  246.922897]  el0_svc_common.constprop.0+0x40/0xe8
> [  246.927731]  do_el0_svc+0x20/0x2c
> [  246.931127]  el0_svc+0x30/0xd0
> [  246.934265]  el0t_64_sync_handler+0x144/0x168
> [  246.938737]  el0t_64_sync+0x198/0x19c
> [  246.942505] Code: f9400682 f90027ff a906ffe2 f100043f (f9403440)
> [  246.948767] ---[ end trace 0000000000000000 ]---
> 
> 
> Bisect is pointing to this commit. I had a quick look at this and the
> following fixes it for me ...

Yes, this occurs because the notifier can be triggered before the
superblock information is filled.  Ard fixed this:

https://web.git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git/commit/?h=urgent&id=cb6ae457bc6af58c84a7854df5e7e32ba1c6a715

Regards,

James






[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux