> On Oct 17, 2020, at 10:09 PM, Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
>
> On Fri, Oct 16, 2020 at 11:11:37PM -0700, Hao Wu wrote:
>>> On Oct 1, 2020, at 4:04 PM, Jarkko Sakkinen <jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
>>>
>>> On Thu, Oct 01, 2020 at 11:32:59AM -0700, James Bottomley wrote:
>>>> On Thu, 2020-10-01 at 14:15 -0400, Nayna wrote:
>>>>> On 10/1/20 12:53 AM, James Bottomley wrote:
>>>>>> On Thu, 2020-10-01 at 04:50 +0300, Jarkko Sakkinen wrote:
>>>>>>> On Wed, Sep 30, 2020 at 03:31:20PM -0700, James Bottomley wrote:
>>>>>>>> On Thu, 2020-10-01 at 00:09 +0300, Jarkko Sakkinen wrote:
>>>> [...]
>>>>>>>>> I also wonder if we could adjust the frequency dynamically.
>>>>>>>>> I.e. start with optimistic value and lower it until finding
>>>>>>>>> the sweet spot.
>>>>>>>>
>>>>>>>> The problem is the way this crashes: the TPM seems to be
>>>>>>>> unrecoverable. If it were recoverable without a hard reset of
>>>>>>>> the entire machine, we could certainly play around with it. I
>>>>>>>> can try alternative mechanisms to see if anything's viable, but
>>>>>>>> to all intents and purposes, it looks like my TPM simply stops
>>>>>>>> responding to the TIS interface.
>>>>>>>
>>>>>>> A quickly scraped idea probably with some holes in it but I was
>>>>>>> thinking something like
>>>>>>>
>>>>>>> 1. Initially set slow value for latency, this could be the
>>>>>>> original 15 ms.
>>>>>>> 2. Use this to read TPM_PT_VENDOR_STRING_*.
>>>>>>> 3. Lookup based vendor string from a fixup table a latency that
>>>>>>> works
>>>>>>> (the fallback latency could be the existing latency).
>>>>>>
>>>>>> Well, yes, that was sort of what I was thinking of doing for the
>>>>>> Atmel ... except I was thinking of using the TIS VID (16 byte
>>>>>> assigned vendor ID) which means we can get the information to set
>>>>>> the timeout before we have to do any TPM operations.
>>>>>
>>>>> I wonder if the timeout issue exists for all TPM commands for the
>>>>> same manufacturer.
>>>>> For example, does the ATMEL TPM also crash when
>>>>> extending PCRs ?
>>>>>
>>>>> In addition to defining a per TPM vendor based lookup table for
>>>>> timeout, would it be a good idea to also define a Kconfig/boot param
>>>>> option to allow timeout setting. This will enable to set the timeout
>>>>> based on the specific use.
>>>>
>>>> I don't think we need go that far (yet). The timing change has been in
>>>> upstream since:
>>>>
>>>> commit 424eaf910c329ab06ad03a527ef45dcf6a328f00
>>>> Author: Nayna Jain <nayna@xxxxxxxxxxxxxxxxxx>
>>>> Date: Wed May 16 01:51:25 2018 -0400
>>>>
>>>> tpm: reduce polling time to usecs for even finer granularity
>>>>
>>>> Which was in the released kernel 4.18: over two years ago. In all that
>>>> time we've discovered two problems: mine which looks to be an artifact
>>>> of an experimental upgrade process in a new nuvoton and the Atmel.
>>>> That means pretty much every other TPM simply works with the existing
>>>> timings
>>>>
>>>>> I was also thinking how will we decide the lookup table values for
>>>>> each vendor ?
>>>>
>>>> I wasn't thinking we would. I was thinking I'd do a simple exception
>>>> for the Atmel and nothing else. I don't think my Nuvoton is in any way
>>>> characteristic. Indeed my pluggable TPM rainbow bridge system works
>>>> just fine with a Nuvoton and the current timings.
>>>>
>>>> We can add additional exceptions if they actually turn up.
>>>
>>> I'd add a table and fallback.
>>>
>>
>> Hi folks,
>>
>> I want to follow up this a bit and check whether we reached a consensus
>> on how to fix the timeout issue for Atmel chip.
>>
>> Should we revert the changes or introduce the lookup table for chips.
>>
>> Is there anything I can help from Rubrik side.
>>
>> Thanks
>> Hao
>
> There is nothing to revert as the previous was not applied but I'm
> of course ready to review any new attempts.
>

Hi Jarkko,

By "revert" I meant reverting the timeout value changes by applying the
patch I proposed, because the timeout value under discussion does cause
issues.

Why don't we apply the patch first, and then improve performance in a way
that does not break TPMs?

Hao
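
For reference, the "table and fallback" approach suggested earlier in the
thread could look roughly like the sketch below. It is a standalone
user-space illustration, not the proposed patch and not existing kernel
code: the struct, the function names and the latency values are
hypothetical placeholders, and the Atmel vendor ID 0x1114 is an assumption
that should be checked against the TIS DID/VID register of the affected
chip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixup entry: TIS vendor ID -> polling latency in usecs. */
struct tpm_latency_fixup {
	uint16_t vid;
	unsigned int latency_us;
};

/* Assumed values, for illustration only. */
#define TPM_VID_ATMEL_ASSUMED	0x1114	/* assumed Atmel vendor ID */
#define TPM_LATENCY_DEFAULT_US	100	/* placeholder for the current fast timing */
#define TPM_LATENCY_SLOW_US	15000	/* the "original 15 ms" from the thread */

/* Only chips known to misbehave get an entry; everything else falls back. */
static const struct tpm_latency_fixup latency_fixups[] = {
	{ TPM_VID_ATMEL_ASSUMED, TPM_LATENCY_SLOW_US },
};

/* Return the latency for a vendor ID, or the default if no fixup exists. */
unsigned int tpm_latency_for_vid(uint16_t vid)
{
	size_t i;

	for (i = 0; i < sizeof(latency_fixups) / sizeof(latency_fixups[0]); i++)
		if (latency_fixups[i].vid == vid)
			return latency_fixups[i].latency_us;

	return TPM_LATENCY_DEFAULT_US;
}

/* Quick self-check of the lookup and the fallback path. */
void tpm_latency_selfcheck(void)
{
	assert(tpm_latency_for_vid(TPM_VID_ATMEL_ASSUMED) == TPM_LATENCY_SLOW_US);
	assert(tpm_latency_for_vid(0x1050) == TPM_LATENCY_DEFAULT_US);
}
```

Keying the table on the vendor ID rather than the vendor string follows
James's point that the ID can be read before any TPM command is issued, so
the slow timing would already be in place for the first operation.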