On Thu, 2018-02-01 at 20:12 +0000, Mario.Limonciello@xxxxxxxx wrote:
> > -----Original Message-----
> > From: Paul Menzel [mailto:pmenzel@xxxxxxxxxxxxx]
> > Sent: Thursday, February 1, 2018 1:17 PM
> > To: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
> > Cc: linux-integrity <linux-integrity@xxxxxxxxxxxxxxx>; Limonciello, Mario
> > <Mario_Limonciello@xxxxxxxx>; regressions@xxxxxxxxxxxxx; Alexander Steffen
> > <Alexander.Steffen@xxxxxxxxxxxx>
> > Subject: Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
> >
> > [resend with regressions@ address fixed, sorry]
> >
> > On 01.02.2018 at 20:16, Paul Menzel wrote:
> > > Dear James,
> > >
> > > On 01.02.2018 at 16:24, James Bottomley wrote:
> > > > On Thu, 2018-02-01 at 12:42 +0000, James Bottomley wrote:
> > > > > On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:
> > > > > > On 02/01/18 13:16, James Bottomley wrote:
> > > > > > > Embarrassingly enough, I'm just on my way to do a TPM
> > > > > > > talk at FOSDEM.
> > > > > > > I installed my shiny new 4.15 kernel on the 'plane and
> > > > > > > this is what I got after I arrived this morning:
> > > > > > >
> > > > > > > jejb@jarvis:~> dmesg | grep -i tpm
> > > > > > > [ 0.000000] ACPI: TPM2 0x0000000079446CC0 000034 (v03 Tpm2Tabl 00000001 AMI 00000000)
> > > > > > > [ 1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
> > > > > > > [ 1.608863] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > > > > [ 1.640052] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > > > > [ 1.691215] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > > > > [ 1.782377] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > > > > [ 1.953539] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > > > > [ 2.284701] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > > > > [ 2.935743] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > > > > [ 4.216236] tpm tpm0: TPM self test failed
> > > > > > > [ 4.236829] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > > > > > >
> > > > > > > The error is TPM_RC_TESTING, which means it looks like we
> > > > > > > don't wait long enough for the selftests to complete. I
> > > > > > > get this all the time booting with 4.15. Fortunately I
> > > > > > > have a 4.13 backup kernel which is fine (otherwise I'd be
> > > > > > > a bit hosed since all my keys now require a TPM).
> > > > > > >
> > > > > > > I'll debug on the train; my current suspicion is that the
> > > > > > > TPM_LONG duration might be a bit short for this chip (A
> > > > > > > nuvoton 6xx in a dell XPS-13).
> > > > > >
> > > > > > Please join the thread [1], where I reported the same
> > > > > > problem for the Dell XPS 13 9360.
> > > > > > Unfortunately, no
> > > > > > solution was found, especially, as I did not use the TPM.
> > > > > > Other owners of that system unfortunately didn’t have time
> > > > > > to report back if it work for them, so the “conclusion”
> > > > > > kind of was, that my TPM was broken, and had to be tested.
> > > > >
> > > > > OK, I'll try to find a fix. It's clearly a marginal problem
> > > > > since I've booted most -rc kernels without issue, so there's
> > > > > some slight timing change in 4.15 that triggered it. It
> > > > > could also be a shutdown issue. Any NV ram stuff deferred to
> > > > > start up would take a variable amount of time.
> > > > >
> > > > > You'd almost think it's some sort of TPM self protest: the
> > > > > more stuff I use it for the more problems it seems to create.
> > > > > I'm definitely motivated to fix it because without a TPM I
> > > > > can't actually do much with my laptop.
> > > >
> > > > OK, I investigated but now my TPM has returned to normal (as in
> > > > it passes the selftest immediately). There's clearly something
> > > > that makes it return TPM_RC_TESTING to every self test probe
> > > > for seconds at a time, but I don't know what it is. Sending a
> > > > different command seems to cause the problem to clear (Managed
> > > > to reproduce once with the patch to verify), so this is my
> > > > proposed fix. It's clearly nonsensical to detach the driver
> > > > because the self test still returns TPM_RC_TESTING,
> > > > so convert that return to a TPM_RC_SUCCESS on timeout. It
> > > > prints a warning message so we'll see it in the logs if it
> > > > causes problems. Given that this seems to be some type of
> > > > internal TPM issue, I don't believe changing the timings would
> > > > work.
> > >
> > > Maybe Mario can confirm this issue too, now that Linux 4.15 is
> > > released. Maybe he also has a way to get the Nuvoton people
> > > involved.
>
> James,
>
> Did you actually experiment with changing the timings?

No, I already said: waiting 2s for a device driver init is already too
great a burden on the boot sequence. I don't honestly think waiting
longer would help either ... 2s is a huge amount of time so there's
something else going on with the TPM.

James

> I was told that TPMs that are FIPS validated (such as that in the XPS
> 13) may take longer for the self tests to run.
>
> > > > ---
> > > >
> > > > diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
> > > > index f40d20671a78..3e1b062d8888 100644
> > > > --- a/drivers/char/tpm/tpm2-cmd.c
> > > > +++ b/drivers/char/tpm/tpm2-cmd.c
> > > > @@ -872,6 +872,17 @@ static int tpm2_do_selftest(struct tpm_chip *chip)
> > > >  		/* wait longer the next round */
> > > >  		delay_msec *= 2;
> > > >  	}
> > > > +	if (rc == TPM2_RC_TESTING) {
> > > > +		/*
> > > > +		 * A return of RC_TESTING means the TPM is still
> > > > +		 * running self tests. If one fails it will go into
> > > > +		 * failure mode and return RC_FAILED to every command,
> > > > +		 * so treat a still in testing return as a success
> > > > +		 * rather than causing a driver detach.
> > > > +		 */
> > > > +		dev_err(&chip->dev,"TPM: Still in testing mode after %dms, continuing\n", delay_msec);
> > > > +		rc = TPM2_RC_SUCCESS;
> > > > +	}
> > > >  	return rc;
> > > >  }
> > >
> > > Alexander replied the following in the other thread. No idea if
> > > you read it yet.
> > >
> > > > The list of "A TPM error (2314) occurred continue selftest" is
> > > > caused by my commit 125a2210541079e8e7c69e629ad06cabed788f8c
> > > > ("tpm: React correctly to RC_TESTING from TPM 2.0 self tests")
> > > > [1]. 2314 is TPM_RC_TESTING, so the TPM tells us that
> > > > self-tests are still running in the background. This problem
> > > > was not visible in previous versions, since it (incorrectly)
> > > > ignored TPM_RC_TESTING.
> > >
> > > Maybe the commit should be reverted for now until this has
> > > cleared up for the Dell XPS 13 9360(?) to adhere to Linux’ no
> > > regression policy.
> > >
> > >
> > > Kind regards,
> > >
> > > Paul
> > >
> > >
> > > PS: Alexander will also be at FOSDEM and mentioned your talk [2].
> > >
> > >
> > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=125a2210541079e8e7c69e629ad06cabed788f8
> > > [2] https://lists.01.org/pipermail/tpm2/2018-January/000486.html
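
The probe sequence discussed in this thread can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the kernel code: fake_tpm, fake_selftest and poll_selftest are hypothetical names, the 20 ms starting delay is inferred from the gaps between the quoted dmesg timestamps rather than stated by anyone in the thread, and the 2000 ms budget is the 2s figure from the reply. The tolerate_testing flag mimics the proposed patch, which turns a final TPM2_RC_TESTING into TPM2_RC_SUCCESS.

```c
/* Response codes as named in the thread: 2314 decimal == 0x90A. */
#define TPM2_RC_SUCCESS 0x0000
#define TPM2_RC_TESTING 0x090A

/* Hypothetical stand-in for a TPM that stays in testing mode for busy_ms. */
struct fake_tpm {
	unsigned int busy_ms;    /* how long the chip keeps answering TESTING */
	unsigned int elapsed_ms; /* simulated time spent sleeping so far */
	unsigned int probes;     /* number of self-test commands sent */
};

/* Hypothetical probe: answers TESTING until busy_ms has elapsed. */
int fake_selftest(struct fake_tpm *tpm)
{
	tpm->probes++;
	return tpm->elapsed_ms < tpm->busy_ms ? TPM2_RC_TESTING
					      : TPM2_RC_SUCCESS;
}

/*
 * Poll with a doubling delay until the time budget is used up.
 * The 20 ms starting delay is an assumption taken from the log gaps.
 * If tolerate_testing is non-zero, a chip that is still testing when
 * the budget runs out is reported as success, as the patch proposes.
 */
int poll_selftest(struct fake_tpm *tpm, int budget_ms, int tolerate_testing)
{
	unsigned int delay_ms = 20;
	int rc = TPM2_RC_TESTING;

	while (budget_ms > 0) {
		rc = fake_selftest(tpm);
		if (rc != TPM2_RC_TESTING)
			break;

		tpm->elapsed_ms += delay_ms; /* stands in for sleeping */
		budget_ms -= (int)delay_ms;

		/* wait longer the next round */
		delay_ms *= 2;
	}

	if (rc == TPM2_RC_TESTING && tolerate_testing)
		rc = TPM2_RC_SUCCESS;

	return rc;
}
```

With a chip that never leaves testing mode, a 2000 ms budget and a 20 ms starting delay yield seven probes (20 + 40 + 80 + 160 + 320 + 640 + 1280 = 2540 ms of sleeping), which lines up with the seven "continue selftest" errors in the quoted log. Calling poll_selftest() with tolerate_testing set to 1 returns TPM2_RC_SUCCESS in that same situation instead of failing.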