On Wed, Feb 03, 2021 at 12:27:40AM +0200, Jarkko Sakkinen wrote: > On Tue, Feb 02, 2021 at 09:58:24AM -0800, James Bottomley wrote: > > On Tue, 2021-02-02 at 11:26 -0600, Serge E. Hallyn wrote: > > > On Tue, Feb 02, 2021 at 05:33:17PM +0200, jarkko@xxxxxxxxxx wrote: > > > > From: Jarkko Sakkinen <jarkko@xxxxxxxxxx> > > > > > > > > An unexpected status from TPM chip is not irrecovable failure of > > > > the > > > > kernel. It's only undesirable situation. Thus, change the WARN_ONCE > > > > instance inside tpm_tis_status() to pr_warn_once(). > > > > > > > > In addition: print the status in the log message because it is > > > > actually > > > > useful information lacking from the existing log message. > > > > > > > > Suggested-by: Guenter Roeck <linux@xxxxxxxxxxxx> > > > > Cc: stable@xxxxxxxxxxxxxxx > > > > Fixes: 6f4f57f0b909 ("tpm: ibmvtpm: fix error return code in > > > > tpm_ibmvtpm_probe()") > > > > Signed-off-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx> > > > > --- > > > > drivers/char/tpm/tpm_tis_core.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/char/tpm/tpm_tis_core.c > > > > b/drivers/char/tpm/tpm_tis_core.c > > > > index 431919d5f48a..21f67c6366cb 100644 > > > > --- a/drivers/char/tpm/tpm_tis_core.c > > > > +++ b/drivers/char/tpm/tpm_tis_core.c > > > > @@ -202,7 +202,7 @@ static u8 tpm_tis_status(struct tpm_chip *chip) > > > > * acquired. Usually because tpm_try_get_ops() hasn't > > > > * been called before doing a TPM operation. > > > > */ > > > > - WARN_ONCE(1, "TPM returned invalid status\n"); > > > > + pr_warn_once("TPM returned invalid status: 0x%x\n", > > > > status); > > > > return 0; > > > > } > > > > > > Actually in this case I don't understand why _once, especially based > > > on the comment. Would ratelimited not be better? So we can see if > > > it happens repeatedly? Even better would be if we could see when it > > > next gave a valid status after an invalid one. > > > > The reason was that we're trying to catch and kill paths to the status > > where the locality is incorrect. If you do some operation that finds > > an incorrect path the likelihood is you'll exercise it more than once, > > but all we need to identify it is the call trace from a single access. > > The symptom the user space process sees is a TPM timeout, but we still > > have the in-kernel trace to tell us why. > > I don't agree with this reasoning. This warn could spun off also from chip > not following TCG spec. The patch does provide the status code, which is > always useful information. I.e. WARN() implies usually that there is something wrong in the kernel state risking its stability which *is not* case here. Thus, it's best to make the status code visible, not the stack trace, and make rest of the conclusions from that. /Jarkko