On Wed, 2020-09-30 at 17:01 -0700, Jerry Snitselaar wrote:
> James Bottomley @ 2020-09-30 16:03 MST:
>
> > On Wed, 2020-09-30 at 14:19 -0700, Jerry Snitselaar wrote:
> > > James Bottomley @ 2020-09-29 15:32 MST:
> > >
> > > > The current release locality code seems to be based on the
> > > > misunderstanding that the TPM interrupts when a locality is
> > > > released: it doesn't, only when the locality is acquired.
> > > >
> > > > Furthermore, there seems to be no point in waiting for the
> > > > locality to be released.  All it does is penalize the last TPM
> > > > user.  However, if there's no next TPM user, this is a
> > > > pointless wait and if there is a next TPM user, they'll pay the
> > > > penalty waiting for the new locality (or possibly not if it's
> > > > the same as the old locality).
> > > >
> > > > Fix the code by making release_locality as simple write to
> > > > release with no waiting for completion.
> > [...]
> > > My recollection is that this was added because there were some
> > > chips that took so long to release locality that a subsequent
> > > request_locality call was seeing the locality as already active,
> > > moving on, and then the locality was getting released out from
> > > under the user.
> >
> > Well, I could simply dump the interrupt code, which can never work
> > and we could always poll.
> >
> > However, there also appears to be a bug in our locality requesting
> > code.  We write the request and wait for the grant, but a grant
> > should be signalled by not only the ACCESS_ACTIVE_LOCALITY being 1
> > but also the ACCESS_REQUEST_USE going to 0.  As you say, if we're
> > slow to relinquish, ACCESS_ACTIVE_LOCALITY could already be 1 and
> > we'd think we were granted, but ACCESS_REQUEST_USE should stay 1
> > until the TPM actually grants the next request.
> >
> > If I code up a fix is there any chance you still have access to a
> > problem TPM?  Mine all seem to grant and release localities fairly
> > instantaneously.
> >
> > James
>
> Sorry, I seemed to make a mess of it. I don't have access to a system
> where it occurred, but cc'ing Laurent since he reported the problem
> and might still have access to the system.
>
> I'd say fix up the check for locality request to look at
> ACCESS_REQUEST_USE, and go with this patch to clean up locality
> release. Hopefully Laurent still has access and can test. I do have a
> laptop now where I should be able to test the other bits in your
> patchset since this is one of the models that hit interrupt storm
> problem when Stefan's 2 patches were originally applied. Lenovo
> applied a fix to their bios, but this should still have the older one
> version that has the issue. I'm on PTO this week, but I will try to
> spend some time in the next couple days reproducing and then trying
> your patches.

Thanks.  I think the patch to fix the access request is very simple ...
it just checks that the request bit has gone to zero, so I've attached
it below.  It seems to work fine for me.

James

---

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 0a86cf392466..5e56e8c67791 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -168,7 +168,8 @@ static bool check_locality(struct tpm_chip *chip, int l)
 	if (rc < 0)
 		return false;
 
-	if ((access & (TPM_ACCESS_ACTIVE_LOCALITY | TPM_ACCESS_VALID)) ==
+	if ((access & (TPM_ACCESS_ACTIVE_LOCALITY | TPM_ACCESS_VALID
+		       | TPM_ACCESS_REQUEST_USE)) ==
 	    (TPM_ACCESS_ACTIVE_LOCALITY | TPM_ACCESS_VALID)) {
 		priv->locality = l;
 		return true;
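
For anyone who wants to see the effect of the mask change outside the
driver, here is a minimal userspace sketch of the old and new checks.
It is not the driver code: the function names granted_old() and
granted_new() are made up for illustration, and the bit values are what
I believe tpm_tis_core.h uses for the access register, so treat them as
assumptions and check the header.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Access register bits. Values are assumed to match the kernel's
 * tpm_tis_core.h; verify against the header before relying on them.
 */
enum {
	ACCESS_VALID           = 0x80,
	ACCESS_ACTIVE_LOCALITY = 0x20,
	ACCESS_REQUEST_USE     = 0x02,
};

/* Old check (hypothetical name): only VALID and ACTIVE_LOCALITY must be set. */
static bool granted_old(uint8_t access)
{
	const uint8_t want = ACCESS_ACTIVE_LOCALITY | ACCESS_VALID;

	return (access & want) == want;
}

/*
 * New check (hypothetical name): REQUEST_USE is added to the mask but not
 * to the expected value, so it must read back as 0 for the grant to count.
 */
static bool granted_new(uint8_t access)
{
	const uint8_t mask = ACCESS_ACTIVE_LOCALITY | ACCESS_VALID |
			     ACCESS_REQUEST_USE;
	const uint8_t want = ACCESS_ACTIVE_LOCALITY | ACCESS_VALID;

	return (access & mask) == want;
}
```

With a register value of 0xa2 (valid, active, request still pending),
granted_old() reports a grant while granted_new() does not, which is
the slow-relinquish case described above.  With 0xa0 both agree that
the locality has been granted.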