Resending the following email in plaintext.

----

Hi James,

Thanks for following up.

We have actually tried changing
TPM_TIMEOUT_USECS_MIN / TPM_TIMEOUT_USECS_MAX
according to https://patchwork.kernel.org/patch/10520247/
It does not solve the problem for the Atmel chip. The crashing chips
are not experimental; the crash happens commonly in the production
systems we and our customers are using. It is widely seen in
Cisco 220 / 240 systems, which use Atmel chips.

Thanks
Hao

> On Sep 26, 2020, at 3:57 PM, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sat, 2020-09-26 at 15:31 -0700, Hao Wu wrote:
>> Since kernel 4.14, we fixed the TPM sleep logic
>> from msleep to usleep_range, so that the TPM
>> sleeps exactly with TPM_TIMEOUT (=5ms) afterward.
>> Before that fix, msleep(5) actually sleeps for
>> around 15ms.
>> The fix is
>> https://github.com/torvalds/linux/commit/9f3fc7bcddcb51234e23494531f93ab60475e1c3
>>
>> That fix uncovered that the TPM_TIMEOUT was not properly
>> set previously. We recently found the TPM driver in kernel 4.14+
>> (including 5.9-rc4) crashes Atmel TPM chips with
>> too frequent TPM queries.
>
> I've seen this with my nuvoton too ... although it seems to be because
> my chip is somewhat experimental (SW upgrade from 1.2 to 2.0). The
> problem with your patch is it reintroduces the massive delays that
> msleep has ... that's why usleep was used. The patch I use locally to
> fix this keeps usleep, can you try it (attached).
>
> James
>
> ---
>
> From d40a8c7691a72de28ea66a78bd177db36a79710a Mon Sep 17 00:00:00 2001
> From: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
> Date: Wed, 11 Jul 2018 10:11:14 -0700
> Subject: [PATCH] tpm.h: increase poll timings to fix tpm_tis regression
>
> tpm_tis regressed recently to the point where the TPM being driven by
> it falls off the bus and cannot be contacted after some hours of use.
> This is the failure trace:
>
> jejb@jarvis:~> dmesg|grep tpm
> [ 3.282605] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
> [14566.626614] tpm tpm0: Operation Timed out
> [14566.626621] tpm tpm0: tpm2_load_context: failed with a system error -62
> [14568.626607] tpm tpm0: tpm_try_transmit: tpm_send: error -62
> [14570.626594] tpm tpm0: tpm_try_transmit: tpm_send: error -62
> [14570.626605] tpm tpm0: tpm2_load_context: failed with a system error -62
> [14572.626526] tpm tpm0: tpm_try_transmit: tpm_send: error -62
> [14577.710441] tpm tpm0: tpm_try_transmit: tpm_send: error -62
> ...
>
> The problem is caused by a change that caused us to poke the TPM far
> more often to see if it's ready. Apparently something about the bus
> its on and the TPM means that it crashes or falls off the bus if you
> poke it too often and once this happens, only a reboot will recover
> it.
>
> The fix I've come up with is to adjust the timings so the TPM no
> longer falls of the bus. Obviously, this fix works for my Nuvoton
> NPCT6xxx but that's the only TPM I've tested it with.
>
> Fixes: 424eaf910c32 tpm: reduce polling time to usecs for even finer granularity
> Signed-off-by: James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>
> ---
>  drivers/char/tpm/tpm.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
> index 947d1db0a5cc..e4f4b98418ab 100644
> --- a/drivers/char/tpm/tpm.h
> +++ b/drivers/char/tpm/tpm.h
> @@ -41,8 +41,8 @@ enum tpm_timeout {
>  	TPM_TIMEOUT_RETRY = 100, /* msecs */
>  	TPM_TIMEOUT_RANGE_US = 300, /* usecs */
>  	TPM_TIMEOUT_POLL = 1, /* msecs */
> -	TPM_TIMEOUT_USECS_MIN = 100, /* usecs */
> -	TPM_TIMEOUT_USECS_MAX = 500 /* usecs */
> +	TPM_TIMEOUT_USECS_MIN = 750, /* usecs */
> +	TPM_TIMEOUT_USECS_MAX = 1000, /* usecs */
>  };
>
>  /* TPM addresses */
> --
> 2.26.2
>