On Tue, Mar 12, 2019 at 01:04:58PM -0400, Mimi Zohar wrote:
> On Mon, 2019-03-11 at 16:54 -0700, Calvin Owens wrote:
> > We're having lots of problems with TPM commands timing out, and we're
> > seeing these problems across lots of different hardware (both v1/v2).
> >
> > I instrumented the driver to collect latency data, but I wasn't able to
> > find any specific timeout to fix: it seems like many of them are too
> > aggressive. So I tried replacing all the timeout logic with a single
> > universal long timeout, and found that makes our TPMs 100% reliable.
> >
> > Given that this timeout logic is very complex, problematic, and appears
> > to serve no real purpose, I propose simply deleting all of it.
>
> Normally before sending such a massive change like this, included in
> the bug report or patch description, there would be some indication as
> to which kernel introduced a regression. Has this always been a
> problem? Is this something new? How new?

Also: is the problem in timeouts, durations, or both? It doesn't make
sense to fix something that isn't broken...

/Jarkko