Re: [PATCH 1/2] tpm, tpm_tis: Handle interrupt storm

On Tue, May 23, 2023 at 12:14:28PM +0300, Péter Ujfalusi wrote:
> On 23/05/2023 10:44, Lukas Wunner wrote:
> > On Tue, May 23, 2023 at 09:48:23AM +0300, Péter Ujfalusi wrote:
> >> On 22/05/2023 17:31, Lino Sanfilippo wrote:
> > [...]
> >> This looked promising, however it looks like the UPX-i11 needs the DMI
> >> quirk.
> > 
> > Why is that?  Is there a fundamental problem with the patch or is it
> > a specific issue with that device?
> 
> The flood is not detected (if there is a flood at all), interrupt stops
> working after about 200 interrupts - in the latest boot at 118th.

You've got a variant of the "never asserted interrupt".

That condition is currently tested only once on probe in tpm_tis_core_init().
The solution would be to disable interrupts whenever they're not (or no
longer) asserted.

However, that's a distinct issue from the one addressed by the present
patch, which deals with a "never *de*asserted interrupt".
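
To make the "never deasserted" case concrete, here is a minimal userspace
sketch of how such a storm could be detected.  It is not the code from the
patch: struct storm_detector, storm_note_unhandled(), the threshold and the
window are hypothetical names and values chosen purely for illustration.
The idea is to count unhandled interrupts that arrive back to back and to
fall back to polling once the count exceeds a limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning values, not taken from the patch. */
#define STORM_THRESHOLD		1000u	/* unhandled IRQs tolerated in a row */
#define STORM_WINDOW_TICKS	1ul	/* max gap (in ticks) between two of them */

struct storm_detector {
	unsigned long last_tick;	/* tick of the previous unhandled IRQ */
	unsigned int unhandled;		/* back-to-back unhandled IRQs so far */
	bool polling;			/* set once we gave up on interrupts */
};

/*
 * Record one unhandled interrupt seen at time "now" (in ticks).
 * Returns true exactly once: on the call that triggers the fallback
 * to polling.
 */
bool storm_note_unhandled(struct storm_detector *d, unsigned long now)
{
	assert(d != NULL);

	if (d->polling)
		return false;

	/* A long enough gap means this is not a storm: start over. */
	if (now - d->last_tick > STORM_WINDOW_TICKS)
		d->unhandled = 0;
	d->last_tick = now;

	if (++d->unhandled > STORM_THRESHOLD) {
		d->polling = true;
		return true;
	}
	return false;
}
```

A "never asserted" interrupt would never reach such a counter at all,
because the handler is simply not invoked, which is why it needs a separate
check.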


> >>> +	dev_err(&chip->dev, HW_ERR
> >>> +		"TPM interrupt storm detected, polling instead\n");
> >>
> >> Should this be dev_warn or even dev_info level?
> > 
> > The corresponding message emitted in tpm_tis_core_init() for
> > an interrupt that's *never* asserted uses dev_err(), so using
> > dev_err() here as well serves consistency:
> > 
> > 	dev_err(&chip->dev, FW_BUG
> > 		"TPM interrupt not working, polling instead\n");
> > 
> > That way the same severity is used both for the never asserted and
> > the never deasserted interrupt case.
> 
> Oh, OK.
> Is there anything the user can do to have a ERROR less boot?

You're right that the user can't do anything about it and that
toning the message down to KERN_WARNING or even KERN_NOTICE severity
may be appropriate.

However, the above-quoted message for the never asserted interrupt
in tpm_tis_core_init() should then likewise be toned down to the
same severity.

I'm wondering why that message uses FW_BUG.  That doesn't make any
sense to me.  It's typically not a firmware bug, but a hardware issue,
e.g. an interrupt pin may erroneously not be connected or may be
connected to ground.  Lino used HW_ERR, which seems more appropriate
to me.
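
For illustration, the severity and the FW_BUG / HW_ERR prefix are
independent of each other: the prefix is just a string literal glued onto
the message, whereas the severity is chosen by the dev_err() / dev_warn() /
dev_notice() variant.  The userspace sketch below mimics that.  The two
prefix strings and the numeric levels match the kernel's definitions as far
as I know; format_msg(), has_prefix() and the "<N>" rendering are
hypothetical helpers for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Prefix strings as defined in the kernel's printk header. */
#define FW_BUG	"[Firmware Bug]: "
#define HW_ERR	"[Hardware Error]: "

/* Numeric severities corresponding to KERN_ERR, KERN_WARNING, KERN_NOTICE. */
enum loglevel {
	LOGLEVEL_ERR = 3,
	LOGLEVEL_WARNING = 4,
	LOGLEVEL_NOTICE = 5,
};

/* Hypothetical helper: render "<level>message" into buf. */
int format_msg(char *buf, size_t len, enum loglevel level, const char *msg)
{
	assert(buf != NULL && msg != NULL);
	return snprintf(buf, len, "<%d>%s", (int)level, msg);
}

/* Hypothetical helper: does msg start with the given prefix? */
int has_prefix(const char *msg, const char *prefix)
{
	return strncmp(msg, prefix, strlen(prefix)) == 0;
}
```

With this, format_msg(buf, sizeof(buf), LOGLEVEL_WARNING, HW_ERR "TPM
interrupt storm detected, polling instead\n") keeps the hardware-error
prefix while lowering the severity, which is the combination discussed
above.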


> >>>  	rc = tpm_tis_write32(priv, TPM_INT_STATUS(priv->locality), interrupt);
> >>>  	tpm_tis_relinquish_locality(chip, 0);
> >>>  	if (rc < 0)
> >>> -		return IRQ_NONE;
> >>> +		goto unhandled;
> >>
> >> This is more like an error than just unhandled IRQ. Yes, it was ignored,
> >> probably because it is common?
> > 
> > The interrupt may be shared and then it's not an error.
> 
> but this is tpm_tis_write32() failing, no? If it is shared interrupt and
> we return IRQ_HANDLED unconditionally then I think the core will think
> that the interrupt was for this device and it was handled.

No.  The IRQ_HANDLED versus IRQ_NONE return values are merely used
for book-keeping of spurious interrupts.  If IRQ_HANDLED is returned,
the other handlers will still be invoked.  It is not discernible whether
a shared interrupt was asserted by a single device or by multiple devices,
so all handlers need to be called.
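
A simplified userspace model of that behavior, in case it helps.  This is
only a sketch of the principle and not the kernel's actual code: struct
irq_action, struct irq_line, dispatch_shared_irq() and the two sample
handlers are stand-ins invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's irqreturn_t values. */
enum irqreturn { IRQ_NONE = 0, IRQ_HANDLED = 1 };

typedef enum irqreturn (*irq_handler_fn)(void *dev_id);

struct irq_action {
	irq_handler_fn handler;
	void *dev_id;
	unsigned int calls;		/* how often this handler was invoked */
};

struct irq_line {
	struct irq_action *actions;	/* all handlers sharing the line */
	size_t nr_actions;
	unsigned int irqs_unhandled;	/* book-keeping for spurious IRQs */
};

/* Sample handler of a device that claims every interrupt. */
enum irqreturn always_handled(void *dev_id)
{
	(void)dev_id;
	return IRQ_HANDLED;
}

/* Sample handler of a device that did not raise the interrupt. */
enum irqreturn never_mine(void *dev_id)
{
	(void)dev_id;
	return IRQ_NONE;
}

/*
 * Invoke every handler on the line, regardless of what earlier handlers
 * returned.  The combined result only feeds the spurious-IRQ counter.
 */
enum irqreturn dispatch_shared_irq(struct irq_line *line)
{
	int retval = IRQ_NONE;
	size_t i;

	assert(line != NULL);

	for (i = 0; i < line->nr_actions; i++) {
		struct irq_action *a = &line->actions[i];

		a->calls++;
		retval |= a->handler(a->dev_id);
	}

	if (retval == IRQ_NONE)
		line->irqs_unhandled++;

	return (enum irqreturn)retval;
}
```

With always_handled() registered first and never_mine() second, both
handlers run on every dispatch, and the unhandled counter only increases
when no handler at all claims the interrupt.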

Thanks,

Lukas


