On Thu, 2020-10-01 at 14:15 -0400, Nayna wrote:
> On 10/1/20 12:53 AM, James Bottomley wrote:
> > On Thu, 2020-10-01 at 04:50 +0300, Jarkko Sakkinen wrote:
> > > On Wed, Sep 30, 2020 at 03:31:20PM -0700, James Bottomley wrote:
> > > > On Thu, 2020-10-01 at 00:09 +0300, Jarkko Sakkinen wrote:
[...]
> > > > > I also wonder if we could adjust the frequency dynamically.
> > > > > I.e. start with optimistic value and lower it until finding
> > > > > the sweet spot.
> > > >
> > > > The problem is the way this crashes: the TPM seems to be
> > > > unrecoverable. If it were recoverable without a hard reset of
> > > > the entire machine, we could certainly play around with it. I
> > > > can try alternative mechanisms to see if anything's viable, but
> > > > to all intents and purposes, it looks like my TPM simply stops
> > > > responding to the TIS interface.
> > >
> > > A quickly scraped idea probably with some holes in it but I was
> > > thinking something like
> > >
> > > 1. Initially set slow value for latency, this could be the
> > >    original 15 ms.
> > > 2. Use this to read TPM_PT_VENDOR_STRING_*.
> > > 3. Lookup based vendor string from a fixup table a latency that
> > >    works (the fallback latency could be the existing latency).
> >
> > Well, yes, that was sort of what I was thinking of doing for the
> > Atmel ... except I was thinking of using the TIS VID (16 byte
> > assigned vendor ID) which means we can get the information to set
> > the timeout before we have to do any TPM operations.
>
> I wonder if the timeout issue exists for all TPM commands for the
> same manufacturer. For example, does the ATMEL TPM also crash when
> extending PCRs ?
>
> In addition to defining a per TPM vendor based lookup table for
> timeout, would it be a good idea to also define a Kconfig/boot param
> option to allow timeout setting. This will enable to set the timeout
> based on the specific use.

I don't think we need go that far (yet).
The timing change has been in upstream since:

commit 424eaf910c329ab06ad03a527ef45dcf6a328f00
Author: Nayna Jain <nayna@xxxxxxxxxxxxxxxxxx>
Date:   Wed May 16 01:51:25 2018 -0400

    tpm: reduce polling time to usecs for even finer granularity

Which was in the released kernel 4.18: over two years ago. In all that
time we've discovered two problems: mine, which looks to be an artifact
of an experimental upgrade process in a new Nuvoton, and the Atmel.
That means pretty much every other TPM simply works with the existing
timings.

> I was also thinking how will we decide the lookup table values for
> each vendor ?

I wasn't thinking we would. I was thinking I'd do a simple exception
for the Atmel and nothing else. I don't think my Nuvoton is in any way
characteristic. Indeed my pluggable TPM rainbow bridge system works
just fine with a Nuvoton and the current timings.

We can add additional exceptions if they actually turn up.

James
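
For illustration only, the "simple exception keyed on the TIS vendor ID"
idea above could look roughly like the sketch below. This is not the
actual driver code: the struct and function names are hypothetical, the
timing numbers are placeholders (the 15 ms figure is taken from the
thread, the default window is an assumed stand-in for "the current
timings"), and the Atmel vendor ID value should be checked against the
hardware before relying on it. The sketch also assumes the vendor ID
sits in the low 16 bits of the TIS DID/VID register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical polling window, in microseconds. */
struct tpm_poll_timing {
	unsigned int min_us;
	unsigned int max_us;
};

/* One exception entry, keyed on the TIS vendor ID. */
struct tpm_timing_quirk {
	uint16_t vid;
	struct tpm_poll_timing timing;
};

/* ASSUMPTION: believed to be Atmel's vendor ID; verify before use. */
#define EXAMPLE_VID_ATMEL 0x1114

/* ASSUMPTION: placeholder for the current (fast) polling window. */
static const struct tpm_poll_timing default_timing = { 100, 500 };

/* Only known exceptions are listed; everything else uses the default. */
static const struct tpm_timing_quirk timing_quirks[] = {
	/* Placeholder slow window near the original 15 ms latency. */
	{ EXAMPLE_VID_ATMEL, { 14000, 15000 } },
};

/* ASSUMPTION: vendor ID is the low 16 bits of the DID/VID register. */
static uint16_t vid_from_did_vid(uint32_t did_vid)
{
	return (uint16_t)(did_vid & 0xffff);
}

/* Return the quirk timing for a vendor, or the default if none matches. */
static struct tpm_poll_timing tpm_timing_for_vid(uint16_t vid)
{
	size_t i;

	for (i = 0; i < sizeof(timing_quirks) / sizeof(timing_quirks[0]); i++) {
		if (timing_quirks[i].vid == vid)
			return timing_quirks[i].timing;
	}
	return default_timing;
}
```

Because the lookup needs only a register read, it can run before any TPM
command is sent, and adding a further exception later is a one-line
change to the table.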