Looks like TPM. CCing the proper people.

On Mon, Oct 14, 2024 at 12:46:26AM +0000, bugzilla-daemon@xxxxxxxxxx wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=219383
>
>             Bug ID: 219383
>            Summary: System reboot on S3 sleep/wakeup test
>            Product: Platform Specific/Hardware
>            Version: 2.5
>           Hardware: All
>                 OS: Linux
>             Status: NEW
>           Severity: normal
>           Priority: P3
>          Component: x86-64
>           Assignee: platform_x86_64@xxxxxxxxxxxxxxxxxxxx
>           Reporter: mikeseohyungjin@xxxxxxxxx
>         Regression: No
>
> I work on LG laptops, and I have run several LG PCs with Ubuntu. As you
> may know, most LG laptops have Intel SoCs.
> I found a critical issue: the system reboots on S3 sleep/wakeup.
>
> Environment:
> - PC BIOS : Phoenix Technologies
> - Intel Jasperlake or Intel Lunarlake
> - OS Ubuntu 22.04(Jasperlake), 24.04.1(Lunarlake)
> - linux kernel version 6.x.0(Jasperlake) or up-to-date 6.11(Lunarlake)
>
> Symptom:
>
> When running an aging script like the one below, the system reboots.
> -------------------------
> #!/bin/bash
> <snip>
> for (( i=1; i<=10000 ; i++ ))
> sudo rtcwake -m mem -s 10 >> ${LOG} 2>&1
> <snip>
> -------------------------
> The script works as follows:
> 1. waits 10 secs
> 2. echo mem > /sys/power/state
> 3. waits 10 secs again and wakes the system up, as if the power button
>    had been pressed.
>
>
> My analysis:
>
> I reproduced the issue several times and found that the BIOS side triggers
> the system reboot.
> | pm_suspend() | syscore_suspend() | acpi_suspend_enter() | ... | < BIOS > |
> ...| acpi_suspend_enter() | syscore_resume() | ...|
>
> Debugging in the BIOS showed that TPM2 can generate a cold reset when it
> detects something wrong after the TPM resumes.
> In the BIOS code, if there are active PCR banks that are not supported by the
> Platform mask, it is supposed to update the TPM allocations and reboot the
> machine.
>
> This means that something on the Linux kernel side can affect TPM operation
> when going to sleep.
> So I debugged and traced the TPM-related functions, such as
> tpm_chip_start(), whenever the symptom appeared.
>
> In the normal case, tpm_chip_start() is called once, like below:
> tpm_pm_suspend()-> tpm_chip_start().
> In the failing case, it is additionally called via:
> hwrng_fillfn ->
> rng_get_data ->
> tpm_hwrng_read ->
> tpm_get_random ->
> tpm_find_get_ops ->
> tpm_try_get_ops ->
> tpm_chip_start ->
>
> I found that when hwrng_fillfn(), which belongs to the hardware random number
> generator, runs during system sleep, it can cause the system to reboot.
> To verify this, I tested a custom kernel that includes the patch below.
>
> -----------------------
> From 373e92bb6d471c5fb42bacb97a4caf5375df5522 Mon Sep 17 00:00:00 2001
> From: mike Seo <mikeseohyungjin@xxxxxxxxx>
> Date: Thu, 10 Oct 2024 14:04:57 +0900
> Subject: [PATCH] test_patch
>
> test_patch for reboot while sleep/wakeup
>
> Signed-off-by: mike Seo <mikeseohyungjin@xxxxxxxxx>
> ---
>  drivers/char/hw_random/core.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index 57c51efa5..d3f0059a4 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -25,6 +25,7 @@
>  #include <linux/slab.h>
>  #include <linux/string.h>
>  #include <linux/uaccess.h>
> +#include <linux/suspend.h>
>
>  #define RNG_MODULE_NAME "hw_random"
>
> @@ -469,6 +470,22 @@ static struct attribute *rng_dev_attrs[] = {
>
>  ATTRIBUTE_GROUPS(rng_dev);
>
> +
> +static int hwrng_pm_notification(struct notifier_block *nb, unsigned long action, void *data)
> +{
> +
> +	switch (action) {
> +	case PM_SUSPEND_PREPARE:
> +		is_suspend_prepare = 1;
> +		break;
> +	default:
> +		is_suspend_prepare = 0;
> +		break;
> +	}
> +	return 0;
> +}
> +
> +static struct notifier_block pm_notifier = { .notifier_call = hwrng_pm_notification };
>  static int hwrng_fillfn(void *unused)
>  {
>  	size_t entropy, entropy_credit = 0; /* in 1/1024 of a bit */
> @@ -478,6 +495,9 @@ static int hwrng_fillfn(void *unused)
>  		unsigned short quality;
>  		struct hwrng *rng;
>
> +		while (is_suspend_prepare)
> +			msleep(500);
> +
>  		rng = get_current_rng();
>  		if (IS_ERR(rng) || !rng)
>  			break;
> @@ -549,6 +569,7 @@ int hwrng_register(struct hwrng *rng)
>  		goto out_unlock;
>  	}
>  	mutex_unlock(&rng_mutex);
> +	WARN_ON(register_pm_notifier(&pm_notifier));
>  	return 0;
>  out_unlock:
>  	mutex_unlock(&rng_mutex);
> --
> 2.43.0
> ------------------------
>
> With this patch, the S3 sleep/wake aging test passed more than 10000
> iterations.
>
> Could you make a proper patch for this issue and merge it?
>
> Thank you,
> Mike
>
> --
> You may reply to this email to add a comment.
>
> You are receiving this mail because:
> You are watching the assignee of the bug.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette