Re: [PATCH] tpm_tis: work around status register bug in STMicroelectronics TPM

Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx> · Mon, 20 Apr 2020 23:46:41 +0300

On Fri, Apr 17, 2020 at 05:12:28PM -0700, James Bottomley wrote:
> On Sat, 2020-04-18 at 02:55 +0300, Jarkko Sakkinen wrote:
> > On Thu, Apr 16, 2020 at 11:02:51AM -0700, James Bottomley wrote:
> > > On Wed, 2020-04-15 at 17:24 -0700, Omar Sandoval wrote:
> > > > On Wed, Apr 15, 2020 at 05:16:05PM -0700, Omar Sandoval wrote:
> > > > > > On Wed, Apr 15, 2020 at 04:51:39PM -0700, James Bottomley wrote:
> > > > > > On Wed, 2020-04-15 at 15:45 -0700, Omar Sandoval wrote:
> > > > > > > From: Omar Sandoval <osandov@xxxxxx>
> > > > > > > 
> > > > > > > > We've encountered a particular model of STMicroelectronics TPM that
> > > > > > > > transiently returns a bad value in the status register. This causes
> > > > > > > > the kernel to believe that the TPM is ready to receive a command when
> > > > > > > > it actually isn't, which in turn causes the send to time out in
> > > > > > > > get_burstcount(). In testing, reading the status register one extra
> > > > > > > > time convinces the TPM to return a valid value.
> > > > > > 
> > > > > > > Interesting, I've got a very early upgradeable nuvoton that seems to be
> > > > > > > behaving like this.
> > > > > 
> > > > > > I'll attach the userspace reproducer I used to figure this out. I'd be
> > > > > > interested to see if it times out on your TPM, too. Note that it bangs
> > > > > > on /dev/mem and assumes that the MMIO address is 0xfed40000. That seems
> > > > > > to be the hard-coded address for x86 in the kernel, but just to be safe
> > > > > > you might want to check `grep MSFT0101 /proc/iomem`.
> > > > 
> > > > Forgot to attach it, of course...
> > > 
> > > 
> > > Thanks!  You facebook guys run with interesting kernel options ... I
> > > eventually had to disable CONFIG_STRICT_DEVMEM and rebuild my kernel to
> > > get it to run.
> > > 
> > > However, the bad news is that this isn't my problem; it seems to be
> > > more timeout related.  I get the same symptoms: logs full of
> > > 
> > > [14570.626594] tpm tpm0: tpm_try_transmit: tpm_send: error -62
> > > 
> > > and the TPM won't recover until the box is reset.  To get my TPM to be
> > > usable, I have to fiddle our default timeouts like this:
> > > 
> > > --- a/drivers/char/tpm/tpm.h
> > > +++ b/drivers/char/tpm/tpm.h
> > > @@ -41,8 +41,8 @@ enum tpm_timeout {
> > >         TPM_TIMEOUT_RETRY = 100, /* msecs */
> > >         TPM_TIMEOUT_RANGE_US = 300,     /* usecs */
> > >         TPM_TIMEOUT_POLL = 1,   /* msecs */
> > > -       TPM_TIMEOUT_USECS_MIN = 100,      /* usecs */
> > > -       TPM_TIMEOUT_USECS_MAX = 500      /* usecs */
> > > +       TPM_TIMEOUT_USECS_MIN = 750,      /* usecs */
> > > +       TPM_TIMEOUT_USECS_MAX = 1000,      /* usecs */
> > >  };
> > > 
> > > But I think the problem is unique to my nuvoton because there haven't
> > > been any other reports of problems like this ... and with these
> > > timeouts my system functions normally in spite of me being a heavy TPM
> > > user.
> > 
> > What downsides would there be to increasing these a bit?
> 
> PCR writes would take longer, meaning IMA initialization would become
> slower.

Does it matter?

/Jarkko
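
For reference, the workaround described in the original patch description
(read the status register one extra time so that a transient bad value is
not acted on) can be sketched in userspace C against a simulated register.
This is only an illustration under stated assumptions, not the patch
itself: struct fake_tpm, fake_read_status(), status_says_ready() and
read_status_with_workaround() are hypothetical names, the glitch is
modelled as a fixed number of bad reads, and the two bit masks are assumed
to follow the usual TIS status layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed status bits (hypothetical names, TIS-style layout). */
#define STS_VALID         0x80u
#define STS_COMMAND_READY 0x40u

/* Simulated TPM: returns a bad status for the next few reads only. */
struct fake_tpm {
	uint8_t real_status;        /* what the device should report */
	uint8_t glitch_status;      /* transient bad value */
	unsigned glitch_reads_left; /* how many reads still glitch */
	unsigned reads;             /* total register reads issued */
};

/* One raw read of the simulated status register. */
static uint8_t fake_read_status(struct fake_tpm *t)
{
	t->reads++;
	if (t->glitch_reads_left > 0) {
		t->glitch_reads_left--;
		return t->glitch_status;
	}
	return t->real_status;
}

/* True if the status claims the TPM is ready to receive a command. */
static bool status_says_ready(uint8_t status)
{
	return (status & STS_COMMAND_READY) != 0;
}

/*
 * Workaround sketch: discard the first read and trust the second.
 * This costs one extra register access per status check.
 */
static uint8_t read_status_with_workaround(struct fake_tpm *t)
{
	(void)fake_read_status(t);
	return fake_read_status(t);
}
```

With a single glitching read, a plain fake_read_status() call reports
"command ready" even though the simulated device is not, while
read_status_with_workaround() returns the real status. The sketch says
nothing about the separate timeout problem discussed above.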