On Fri, Aug 28, 2020 at 08:12:55PM -0400, Jason Andryuk wrote:
> On Fri, Aug 28, 2020 at 7:18 PM Jarkko Sakkinen
> <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
> >
> > On Thu, Aug 27, 2020 at 11:24:45AM -0400, Jason Andryuk wrote:
> > > James Bottomley wrote:
> > > >On Wed, 2020-04-15 at 15:45 -0700, Omar Sandoval wrote:
> > > >> From: Omar Sandoval <osandov@xxxxxx>
> > > >>
> > > >> We've encountered a particular model of STMicroelectronics TPM that
> > > >> transiently returns a bad value in the status register. This causes
> > > >> the kernel to believe that the TPM is ready to receive a command when
> > > >> it actually isn't, which in turn causes the send to time out in
> > > >> get_burstcount(). In testing, reading the status register one extra
> > > >> time convinces the TPM to return a valid value.
> > > >
> > > >Interesting, I've got a very early upgradeable nuvoton that seems to be
> > > >behaving like this.
> > > >
> > > >> Signed-off-by: Omar Sandoval <osandov@xxxxxx>
> > > >> ---
> > > >>  drivers/char/tpm/tpm_tis_core.c | 12 ++++++++++++
> > > >>  1 file changed, 12 insertions(+)
> > > >>
> > > >> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
> > > >> index 27c6ca031e23..277a21027fc7 100644
> > > >> --- a/drivers/char/tpm/tpm_tis_core.c
> > > >> +++ b/drivers/char/tpm/tpm_tis_core.c
> > > >> @@ -238,6 +238,18 @@ static u8 tpm_tis_status(struct tpm_chip *chip)
> > > >>  	rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
> > > >>  	if (rc < 0)
> > > >>  		return 0;
> > > >> +	/*
> > > >> +	 * Some STMicroelectronics TPMs have a bug where the status register is
> > > >> +	 * sometimes bogus (all 1s) if read immediately after the access
> > > >> +	 * register is written to. Bits 0, 1, and 5 are always supposed to read
> > > >> +	 * as 0, so this is clearly invalid. Reading the register a second time
> > > >> +	 * returns a valid value.
> > > >> +	 */
> > > >> +	if (unlikely(status == 0xff)) {
> > > >> +		rc = tpm_tis_read8(priv, TPM_STS(priv->locality), &status);
> > > >> +		if (rc < 0)
> > > >> +			return 0;
> > > >> +	}
> > > >
> > > >You theorize that your case is fixed by the second read, but what if it
> > > >isn't and the second read also returns 0xff? Shouldn't we have a line
> > > >here saying
> > > >
> > > >if (unlikely(status == 0xff))
> > > >	status = 0;
> > > >
> > > >So if we get a second 0xff we just pretend the thing isn't ready?
> > >
> > > Thanks for the fix, Omar!
> > >
> > > I tried the patch and it helps with STM TPM2 issues where commands fail
> > > with the kernel reporting:
> > > tpm tpm0: Unable to read burstcount
> > > tpm tpm0: tpm_try_transmit: send(): error -16
> > >
> > > My testing was with 5.4, and I'd like to see this CC-ed stable.
> > >
> > > When trying to diagnose the issue before finding this patch, I found it
> > > was timing sensitive. I was seeing failures in the OpenXT installer.
> > > The system is basically idle when issuing TPM commands which frequently
> > > failed. The same hardware booted into a Fedora Live USB image didn't
> > > have any TPM command failures. One notable difference between the two
> > > is Fedora is CONFIG_PREEMPT=y and OpenXT is CONFIG_PREEMPT_NONE=y.
> > > Switching OpenXT to PREEMPT=y helped some, but there were still some
> > > issues with commands failing. The second interesting thing was running tpm
> > > commands in OpenXT under trace-cmd let them succeed. I guess that was enough
> > > to throw the timing off.
> > >
> > > Anyway, I'd like to see this patch applied, please.
> > >
> > > Thanks,
> > > Jason
> >
> > There was v2 sent after this:
> >
> > https://patchwork.kernel.org/patch/11492125/
>
> Thanks! That one didn't come up in a search for STM on lore.kernel.org.
>
> > Unfortunately it lacks changelog. What was changed between v1 and v2?
> > Why v3 has not been sent yet? I see a discussion with no final
> > conclusion.
>
> Looks like v2 added James's suggestion with a comment (sorry the
> formating is off):
>
> + /*
> + * The status is somehow still bad. This hasn't been observed in
> + * practice, but clear it just in case so that it doesn't appear
> + * to be ready.
> + */
> + if (unlikely(status == 0xff))
> + status = 0;
>
> But, yes, the decision on the alternate approach is unresolved.
>
> Thanks again,
> Jason

I'm happy to apply this patch as soon as there is either v3 or some
resolution to v2 discussion.

/Jarkko
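
For anyone who wants to poke at the retry logic outside the kernel, here is a
rough userspace sketch of the v2 behaviour as described in this thread: read
the status once, re-read if it comes back as 0xff, and report "not ready" (0)
if it is still 0xff. Everything named fake_* is made up for illustration; it
only stands in for tpm_tis_read8() and tpm_tis_status() and is not driver
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the TPM: replays a scripted list of status reads. */
struct fake_tpm {
	const uint8_t *reads;	/* values returned by successive status reads */
	size_t count;		/* number of scripted values */
	size_t pos;		/* index of the next value to return */
};

/* Stand-in for tpm_tis_read8(): 0 on success, negative on error. */
static int fake_read8(struct fake_tpm *dev, uint8_t *out)
{
	assert(dev != NULL && out != NULL);
	if (dev->pos >= dev->count)
		return -1;	/* script exhausted: model a bus error */
	*out = dev->reads[dev->pos++];
	return 0;
}

/* Mirrors the v2 logic from the thread: retry once on 0xff, then give up. */
static uint8_t fake_status(struct fake_tpm *dev)
{
	uint8_t status;

	if (fake_read8(dev, &status) < 0)
		return 0;
	if (status == 0xff) {
		/* All 1s is invalid, so read the register a second time. */
		if (fake_read8(dev, &status) < 0)
			return 0;
		/* Still bad: clear it so the chip does not appear ready. */
		if (status == 0xff)
			status = 0;
	}
	return status;
}
```

Feeding it the sequence { 0xff, 0x90 } yields 0x90 from the second read, while
{ 0xff, 0xff } yields 0, which is the case James asked about.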