Re: [PATCH] tpm: WARN_ONCE() -> pr_warn_once() in tpm_tis_status()

Jarkko Sakkinen <jarkko@xxxxxxxxxx> · Wed, 3 Feb 2021 02:01:00 +0200

On Tue, Feb 02, 2021 at 03:00:34PM -0800, James Bottomley wrote:
> On Wed, 2021-02-03 at 00:27 +0200, Jarkko Sakkinen wrote:
> > On Tue, Feb 02, 2021 at 09:58:24AM -0800, James Bottomley wrote:
> > > On Tue, 2021-02-02 at 11:26 -0600, Serge E. Hallyn wrote:
> [...]
> > > > 
> > > > Actually in this case I don't understand why _once, especially
> > > > based on the comment.  Would ratelimited not be better?  So we
> > > > can see if it happens repeatedly?  Even better would be if we
> > > > could see when it next gave a valid status after an invalid one.
> > > 
> > > The reason was that we're trying to catch and kill paths to the
> > > status where the locality is incorrect.  If you do some operation
> > > that finds an incorrect path the likelihood is you'll exercise it
> > > more than once, but all we need to identify it is the call trace
> > > from a single access.  The symptom the user space process sees is a
> > > TPM timeout, but we still have the in-kernel trace to tell us why.
> > 
> > I don't agree with this reasoning. This warning could also be triggered
> > by a chip not following the TCG spec.
> 
> If it doesn't follow this basic part of the spec, the chip is unusable
> by us anyway because we need the status to proceed with command
> handling.
> 
> >  The patch does provide the status code, which is always useful
> > information.
> 
> In the wrong locality the bus is not connected, so the status likely
> reads 0xff.
> The most useful thing to know is what path triggered the condition
> because the most likely cause is coding error.
> 
> James

I tend to agree for now. Let's focus on collecting the fixes. Thanks.

/Jarkko
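
For readers following the _once versus ratelimited question raised earlier
in the thread, here is a minimal userspace sketch of the two reporting
policies. It is an illustration only, not kernel code: warn_once_sim(),
warn_ratelimited_sim(), run_simulation() and the interval/burst values are
hypothetical stand-ins, and the sketch does not model the call trace that
WARN_ONCE() emits.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counters so the caller can see how many messages each policy emitted. */
static int once_count;
static int ratelimited_count;

/* "Once" policy: report the first invalid status, stay silent afterwards. */
static void warn_once_sim(unsigned char status)
{
	static bool warned;

	if (warned)
		return;
	warned = true;
	once_count++;
	fprintf(stderr, "once: invalid status 0x%02x\n", status);
}

/*
 * "Ratelimited" policy: allow at most `burst` reports per window of
 * `interval` ticks. Both values are assumptions chosen for the demo.
 */
static void warn_ratelimited_sim(unsigned long now, unsigned char status)
{
	static unsigned long window_start;
	static int printed;
	const unsigned long interval = 5;
	const int burst = 2;

	if (now - window_start >= interval) {
		/* A new window begins: reset the budget. */
		window_start = now;
		printed = 0;
	}
	if (printed >= burst)
		return;
	printed++;
	ratelimited_count++;
	fprintf(stderr, "ratelimited: invalid status 0x%02x at tick %lu\n",
		status, now);
}

/* Feed the same invalid status to both policies on every tick. */
static void run_simulation(unsigned long ticks)
{
	unsigned long t;

	for (t = 0; t < ticks; t++) {
		warn_once_sim(0xff);
		warn_ratelimited_sim(t, 0xff);
	}
}
```

Calling run_simulation(20) makes the trade-off visible: the once policy
reports a single event, which is enough when the goal is to identify one
offending path, while the ratelimited policy keeps reporting in bounded
bursts, which shows whether the condition recurs.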