> On Dec 8, 2020, at 18:17, Jarkko Sakkinen <jarkko@xxxxxxxxxx> wrote: > > On Mon, Dec 07, 2020 at 12:42:53PM +0800, Kai-Heng Feng wrote: >> Hi Jarkko, >> >> A user report that the system can only do S3 once. Subsequent S3 fails after commit a3fbfae82b4c ("tpm: take TPM chip power gating out of tpm_transmit()"). >> >> Dmesg with the issue, collected under 5.10-rc2: >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1891502/comments/14 >> >> Dmesg without the issue, collected under 5.0.0-rc8: >> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1891502/comments/16 >> >> Full bug report here: >> https://bugs.launchpad.net/bugs/1891502 >> >> Kai-Heng > > Relevant part: > > > [80601.620149] tpm tpm0: Error (28) sending savestate before suspend > [80601.620165] PM: __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x90 returns 28 > [80601.620172] PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x20 returns 28 > [80601.620178] PM: Device 00:01 failed to suspend: error 28 > > Looking at this there are two issues: > > A. TPM_ORD_SAVESTATE command failing, this a new regression. > B. When tpm_pm_suspend() fails, it should not fail the whole suspend > procedure. And it returns the TPM error code back to the upper > layers when it does so, which makes no sense. This is an old > issue revealed by A. > > Let's look at tpm_pm_suspend(): > > /* > * We are about to suspend. Save the TPM state > * so that it can be restored. > */ > int tpm_pm_suspend(struct device *dev) > { > struct tpm_chip *chip = dev_get_drvdata(dev); > int rc = 0; > > if (!chip) > return -ENODEV; > > if (chip->flags & TPM_CHIP_FLAG_ALWAYS_POWERED) > goto suspended; > > if ((chip->flags & TPM_CHIP_FLAG_FIRMWARE_POWER_MANAGED) && > !pm_suspend_via_firmware()) > goto suspended; > > if (!tpm_chip_start(chip)) { > if (chip->flags & TPM_CHIP_FLAG_TPM2) > tpm2_shutdown(chip, TPM2_SU_STATE); > else > rc = tpm1_pm_suspend(chip, tpm_suspend_pcr); > > tpm_chip_stop(chip); > } > > suspended: > return rc; > } > EXPORT_SYMBOL_GPL(tpm_pm_suspend); > > I would modify this into: > > /* > * We are about to suspend. Save the TPM state > * so that it can be restored. > */ > int tpm_pm_suspend(struct device *dev) > { > struct tpm_chip *chip = dev_get_drvdata(dev); > int rc = 0; > > if (!chip) > return -ENODEV; > > if (chip->flags & TPM_CHIP_FLAG_ALWAYS_POWERED) > goto suspended; > > if ((chip->flags & TPM_CHIP_FLAG_FIRMWARE_POWER_MANAGED) && > !pm_suspend_via_firmware()) > goto suspended; > > if (!tpm_chip_start(chip)) { > if (chip->flags & TPM_CHIP_FLAG_TPM2) > tpm2_shutdown(chip, TPM2_SU_STATE); > else > tpm1_pm_suspend(chip, tpm_suspend_pcr); > > tpm_chip_stop(chip); > } > > suspended: > return rc; > } > EXPORT_SYMBOL_GPL(tpm_pm_suspend); > > I.e. it's a good idea to put something into klog but that should not > fail the whole suspend procedure. TPM is essentially opt-in feature. > > Of course issue A needs to be also sorted out but would this work as > a quick initial fix? I can queue a patch for this. Is it possible to > try out this fix for if I drop a patch? Yes, possible test result from affected user. I had to cut those code and do a diff side by side to find what changed. Hopefully next time I can get one from `git diff`... Kai-Heng > > /Jarkko