> On Jul 9, 2021, at 12:23 PM, Hao Wu <hao.wu@xxxxxxxxxx> wrote:
>
>> On Jul 9, 2021, at 10:47 AM, Jarkko Sakkinen <jarkko@xxxxxxxxxx> wrote:
>>
>> On Thu, Jul 08, 2021 at 09:40:28PM -0700, Hao Wu wrote:
>>> The Atmel TPM 1.2 chips crash with error
>>> `tpm_try_transmit: send(): error -62` since kernel 4.14.
>>> It is observed from the kernel log after running `tpm_sealdata -z`.
>>> The error thrown from the command is as follows
>>> ```
>>> $ tpm_sealdata -z
>>> Tspi_Key_LoadKey failed: 0x00001087 - layer=tddl,
>>> code=0087 (135), I/O error
>>> ```
>>>
>>> The issue was reproduced with the following Atmel TPM chip:
>>> ```
>>> $ tpm_version
>>> T0 TPM 1.2 Version Info:
>>> Chip Version: 1.2.66.1
>>> Spec Level: 2
>>> Errata Revision: 3
>>> TPM Vendor ID: ATML
>>> TPM Version: 01010000
>>> Manufacturer Info: 41544d4c
>>> ```
>>>
>>> The root cause of the issue is that the TPM calls to msleep()
>>> were replaced with usleep_range() [1], which reduces
>>> the actual timeout. Via experiments, it is observed that
>>> the original msleep(5) actually sleeps for 15ms.
>>> Because of a known timeout issue in the Atmel TPM 1.2 chip,
>>> a timeout shorter than 15ms can cause the error described above.
>>>
>>> Later changes in kernel 4.16 [2] and 4.18 [3, 4] further
>>> reduced the timeout to less than 1ms. Experiments show that
>>> the problematic timeout in the latest kernel is the one
>>> for `wait_for_tpm_stat`.
>>>
>>> To fix it, the patch reverts the timeout of `wait_for_tpm_stat`
>>> to 15ms for all Atmel TPM 1.2 chips, but leaves it untouched
>>> for Atmel TPM 2.0 chips and chips from other vendors.
>>> As explained above, the chosen 15ms timeout is
>>> the actual timeout before this issue was introduced,
>>> thus the old value is used here.
>>> Particularly, TPM_ATML_TIMEOUT_WAIT_STAT_MIN is set to 14700us,
>>> TPM_ATML_TIMEOUT_WAIT_STAT_MAX is set to 15000us according to
>>> the existing TPM_TIMEOUT_RANGE_US (300us).
>>> The fix has been tested in the system with the affected Atmel chip
>>> with no issues observed after boot up.
>>>
>>> References:
>>> [1] 9f3fc7bcddcb tpm: replace msleep() with usleep_range() in TPM
>>>     1.2/2.0 generic drivers
>>> [2] cf151a9a44d5 tpm: reduce tpm polling delay in tpm_tis_core
>>> [3] 59f5a6b07f64 tpm: reduce poll sleep time in tpm_transmit()
>>> [4] 424eaf910c32 tpm: reduce polling time to usecs for even finer
>>>     granularity
>>>
>>> Fixes: 9f3fc7bcddcb ("tpm: replace msleep() with usleep_range() in TPM 1.2/2.0 generic drivers")
>>> Link: https://patchwork.kernel.org/project/linux-integrity/patch/20200926223150.109645-1-hao.wu@xxxxxxxxxx/
>>> Signed-off-by: Hao Wu <hao.wu@xxxxxxxxxx>
>>> ---
>>> This version (v2) has the following changes on top of the last (v1):
>>> - follow the existing way to define two timeouts (min and max)
>>>   for ATMEL chip, thus keep the exact timeout logic for
>>>   non-ATMEL chips.
>>> - limit the timeout increase to only ATMEL TPM 1.2 chips,
>>>   because it is not an issue for TPM 2.0 chips yet.
>>>
>>> Test Plan:
>>> - Run fixed kernel with ATMEL TPM chips and see crash has been fixed.
>>> - Run fixed kernel with non-ATMEL TPM chips, and confirm
>>>   the timeout has not been changed.
>>>
>>> drivers/char/tpm/tpm.h          |  6 ++++--
>>> drivers/char/tpm/tpm_tis_core.c | 23 +++++++++++++++++++++--
>>> include/linux/tpm.h             |  3 +++
>>> 3 files changed, 28 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
>>> index 283f78211c3a..6de1b44c4aab 100644
>>> --- a/drivers/char/tpm/tpm.h
>>> +++ b/drivers/char/tpm/tpm.h
>>> @@ -41,8 +41,10 @@ enum tpm_timeout {
>>>          TPM_TIMEOUT_RETRY = 100, /* msecs */
>>>          TPM_TIMEOUT_RANGE_US = 300, /* usecs */
>>>          TPM_TIMEOUT_POLL = 1, /* msecs */
>>> -        TPM_TIMEOUT_USECS_MIN = 100, /* usecs */
>>> -        TPM_TIMEOUT_USECS_MAX = 500 /* usecs */
>>> +        TPM_TIMEOUT_USECS_MIN = 100, /* usecs */
>>> +        TPM_TIMEOUT_USECS_MAX = 500, /* usecs */
>>> +        TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700, /* usecs */
>>> +        TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000 /* usecs */
>>>  };
>>>
>>>  /* TPM addresses */
>>> diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
>>> index 55b9d3965ae1..ae27d66fdd94 100644
>>> --- a/drivers/char/tpm/tpm_tis_core.c
>>> +++ b/drivers/char/tpm/tpm_tis_core.c
>>> @@ -80,8 +80,17 @@ static int wait_for_tpm_stat(struct tpm_chip *chip, u8 mask,
>>>                  }
>>>          } else {
>>>                  do {
>>> -                        usleep_range(TPM_TIMEOUT_USECS_MIN,
>>> -                                     TPM_TIMEOUT_USECS_MAX);
>>> +                        /* this code path could be executed before
>>> +                         * timeouts initialized in chip instance.
>>> +                         */
>>> +                        if (chip->timeout_wait_stat_min &&
>>> +                            chip->timeout_wait_stat_max)
>>> +                                usleep_range(chip->timeout_wait_stat_min,
>>> +                                             chip->timeout_wait_stat_max);
>>> +                        else
>>> +                                usleep_range(TPM_TIMEOUT_USECS_MIN,
>>> +                                             TPM_TIMEOUT_USECS_MAX);
>>
>> This starts to look otherwise fine but you don't need this condition.
>> Just initialize variables to TPM_TIMEOUT_USECS_{MIN, MAX} for non-Atmel.
> Not sure I got your point or not. We have discussed this question a few rounds before,
> I answered you about this.
> This check is required because `wait_for_tpm_stat` runs before the
> initialization I added in `tpm_tis_core_init`:
> ```
> + chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
> + chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
> ```
> so we need the condition as a fallback to avoid a crash at system startup.
>
> Let me know if this makes sense. If needed, I can confirm it again.

I double checked this, and found that the current init lines in `tpm_tis_core_init` actually run before this code path now. Maybe it was an issue in one of my old revisions and I had the wrong impression.
The condition seems OK to remove in the current revision. What I am not fully sure about is whether the behavior is consistent across other TPM 1.2 chips and TPM 2.0 chips.
Should we still keep the condition for robustness, or ship without it?

>> /Jarkko
>
> Hao

Hao
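
For reference, the two options discussed in this thread can be sketched in userspace C. This is only an illustration under stated assumptions, not the driver code: `struct fake_chip`, `struct sleep_range`, `init_wait_stat_range()` and `pick_wait_stat_range()` are hypothetical names, and the `is_atmel_tpm12` flag stands in for whatever vendor/version check the real patch uses. Only the constants and the two `timeout_wait_stat_*` field names come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values taken from the enum in the quoted patch (all in usecs). */
enum {
	TPM_TIMEOUT_USECS_MIN = 100,
	TPM_TIMEOUT_USECS_MAX = 500,
	TPM_ATML_TIMEOUT_WAIT_STAT_MIN = 14700,
	TPM_ATML_TIMEOUT_WAIT_STAT_MAX = 15000
};

/* Hypothetical stand-in for the two fields the patch adds to the chip. */
struct fake_chip {
	unsigned int timeout_wait_stat_min;
	unsigned int timeout_wait_stat_max;
};

/* Hypothetical helper type: the range that would go to usleep_range(). */
struct sleep_range {
	unsigned int min_us;
	unsigned int max_us;
};

/*
 * Suggested approach: always initialize the fields to the generic
 * defaults, then override them for Atmel TPM 1.2 only.
 */
static void init_wait_stat_range(struct fake_chip *chip, bool is_atmel_tpm12)
{
	assert(chip != NULL);
	chip->timeout_wait_stat_min = TPM_TIMEOUT_USECS_MIN;
	chip->timeout_wait_stat_max = TPM_TIMEOUT_USECS_MAX;
	if (is_atmel_tpm12) {
		chip->timeout_wait_stat_min = TPM_ATML_TIMEOUT_WAIT_STAT_MIN;
		chip->timeout_wait_stat_max = TPM_ATML_TIMEOUT_WAIT_STAT_MAX;
	}
}

/*
 * Defensive variant from the quoted hunk: fall back to the generic
 * defaults if the fields are still zero (not yet initialized).
 */
static struct sleep_range pick_wait_stat_range(const struct fake_chip *chip)
{
	struct sleep_range r = { TPM_TIMEOUT_USECS_MIN, TPM_TIMEOUT_USECS_MAX };

	assert(chip != NULL);
	if (chip->timeout_wait_stat_min && chip->timeout_wait_stat_max) {
		r.min_us = chip->timeout_wait_stat_min;
		r.max_us = chip->timeout_wait_stat_max;
	}
	return r;
}
```

If initialization is guaranteed to run before the first poll, both variants give the same range and the condition is redundant. The fallback only matters for a chip whose fields are still zero.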