On Fri, Aug 28, 2020 at 7:18 PM Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
>
> On Thu, Aug 27, 2020 at 11:24:45AM -0400, Jason Andryuk wrote:
> > James Bottomley wrote:
> > >On Wed, 2020-04-15 at 15:45 -0700, Omar Sandoval wrote:
> > >> From: Omar Sandoval <osandov@xxxxxx>
> > >>
> > >> We've encountered a particular model of STMicroelectronics TPM that
> > >> transiently returns a bad value in the status register. This causes
> > >> the kernel to believe that the TPM is ready to receive a command when
> > >> it actually isn't, which in turn causes the send to time out in
> > >> get_burstcount(). In testing, reading the status register one extra
> > >> time convinces the TPM to return a valid value.
> > >
> > >Interesting, I've got a very early upgradeable nuvoton that seems to be
> > >behaving like this.
> > >
> > >> Signed-off-by: Omar Sandoval <osandov@xxxxxx>
> > >> ---
> > >>  drivers/char/tpm/tpm_tis_core.c | 12 ++++++++++++
> > >>  1 file changed, 12 insertions(+)
> > >>
> > >> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> > >> index 27c6ca031e23..277a21027fc7 100644
> > >> --- a/drivers/char/tpm/tpm_tis_core.c
> > >> +++ b/drivers/char/tpm/tpm_tis_core.c
> > >> @@ -238,6 +238,18 @@ static u8 tpm_tis_status(struct tpm_chip *chip)
> > >>  	rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
> > >>  	if (rc < 0)
> > >>  		return 0;
> > >> +	/*
> > >> +	 * Some STMicroelectronics TPMs have a bug where the status register is
> > >> +	 * sometimes bogus (all 1s) if read immediately after the access
> > >> +	 * register is written to. Bits 0, 1, and 5 are always supposed to read
> > >> +	 * as 0, so this is clearly invalid. Reading the register a second time
> > >> +	 * returns a valid value.
> > >> +	 */
> > >> +	if (unlikely(status == 0xff)) {
> > >> +		rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
> > >> +		if (rc < 0)
> > >> +			return 0;
> > >> +	}
> > >
> > >You theorize that your case is fixed by the second read, but what if it
> > >isn't and the second read also returns 0xff? Shouldn't we have a line
> > >here saying
> > >
> > >if (unlikely(status == 0xff))
> > >	status = 0;
> > >
> > >So if we get a second 0xff we just pretend the thing isn't ready?
> >
> > Thanks for the fix, Omar!
> >
> > I tried the patch and it helps with STM TPM2 issues where commands fail
> > with the kernel reporting:
> > tpm tpm0: Unable to read burstcount
> > tpm tpm0: tpm_try_transmit: send(): error -16
> >
> > My testing was with 5.4, and I'd like to see this CC-ed stable.
> >
> > When trying to diagnose the issue before finding this patch, I found it
> > was timing sensitive. I was seeing failures in the OpenXT installer.
> > The system is basically idle when issuing TPM commands which frequently
> > failed. The same hardware booted into a Fedora Live USB image didn't
> > have any TPM command failures. One notable difference between the two
> > is Fedora is CONFIG_PREEMPT=y and OpenXT is CONFIG_PREEMPT_NONE=y.
> > Switching OpenXT to PREEMPT=y helped some, but there were still some
> > issues with commands failing. The second interesting thing was running tpm
> > commands in OpenXT under trace-cmd let them succeed. I guess that was enough
> > to throw the timing off.
> >
> > Anyway, I'd like to see this patch applied, please.
> >
> > Thanks,
> > Jason
>
> There was v2 sent after this:
>
> https://patchwork.kernel.org/patch/11492125/

Thanks! That one didn't come up in a search for STM on lore.kernel.org.

> Unfortunately it lacks changelog. What was changed between v1 and v2?
> Why v3 has not been sent yet? I see a discussion with no final
> conclusion.

Looks like v2 added James's suggestion with a comment (sorry the
formatting is off):

+	/*
+	 * The status is somehow still bad. This hasn't been observed in
+	 * practice, but clear it just in case so that it doesn't appear
+	 * to be ready.
+	 */
+	if (unlikely(status == 0xff))
+		status = 0;

But, yes, the decision on the alternate approach is unresolved.

Thanks again,
Jason
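
P.S. For anyone skimming the thread, here is a rough userspace sketch of the
combined behaviour (v1's re-read plus the v2 fallback). It is not the driver
code: fake_tpm, fake_read8() and fake_status() are made-up names that only
stand in for the register read and for tpm_tis_status(), and the exact
placement of the second 0xff check in v2 is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the status register: replays canned values. */
struct fake_tpm {
	const uint8_t *values;	/* values returned by successive reads */
	size_t count;		/* number of entries in values */
	size_t pos;		/* index of the next value to return */
};

/* Hypothetical read helper: 0 on success, -1 once the values run out. */
static int fake_read8(struct fake_tpm *tpm, uint8_t *out)
{
	if (tpm->pos >= tpm->count)
		return -1;
	*out = tpm->values[tpm->pos++];
	return 0;
}

/* Sketch of the status logic discussed above, outside the kernel. */
static uint8_t fake_status(struct fake_tpm *tpm)
{
	uint8_t status;

	if (fake_read8(tpm, &status) < 0)
		return 0;

	/* All 1s cannot be a valid status, so read the register once more. */
	if (status == 0xff) {
		if (fake_read8(tpm, &status) < 0)
			return 0;
		/* Still bad: report "not ready" instead of a bogus value. */
		if (status == 0xff)
			status = 0;
	}

	return status;
}
```

With this shape, a transient 0xff costs one extra read, and a persistent 0xff
is reported as 0 so the caller never sees the chip as ready.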