On Tue, Nov 24, 2020 at 05:27:30AM +0200, Jarkko Sakkinen wrote: > On Thu, Nov 19, 2020 at 03:42:35PM +0100, Hans de Goede wrote: > > Hi, > > > > On 11/19/20 7:36 AM, Jerry Snitselaar wrote: > > > > > > Matthew Garrett @ 2020-10-15 15:39 MST: > > > > > >> On Thu, Oct 15, 2020 at 2:44 PM Jerry Snitselaar <jsnitsel@xxxxxxxxxx> wrote: > > >>> > > >>> There is a misconfiguration in the bios of the gpio pin used for the > > >>> interrupt in the T490s. When interrupts are enabled in the tpm_tis > > >>> driver code this results in an interrupt storm. This was initially > > >>> reported when we attempted to enable the interrupt code in the tpm_tis > > >>> driver, which previously wasn't setting a flag to enable it. Due to > > >>> the reports of the interrupt storm that code was reverted and we went back > > >>> to polling instead of using interrupts. Now that we know the T490s problem > > >>> is a firmware issue, add code to check if the system is a T490s and > > >>> disable interrupts if that is the case. This will allow us to enable > > >>> interrupts for everyone else. If the user has a fixed bios they can > > >>> force the enabling of interrupts with tpm_tis.interrupts=1 on the > > >>> kernel command line. > > >> > > >> I think an implication of this is that systems haven't been > > >> well-tested with interrupts enabled. In general when we've found a > > >> firmware issue in one place it ends up happening elsewhere as well, so > > >> it wouldn't surprise me if there are other machines that will also be > > >> unhappy with interrupts enabled. Would it be possible to automatically > > >> detect this case (eg, if we get more than a certain number of > > >> interrupts in a certain timeframe immediately after enabling the > > >> interrupt) and automatically fall back to polling in that case? It > > >> would also mean that users with fixed firmware wouldn't need to pass a > > >> parameter. > > > > > > I believe Matthew is correct here. I found another system today > > > with completely different vendor for both the system and the tpm chip. > > > In addition another Lenovo model, the L490, has the issue. > > > > > > This initial attempt at a solution like Matthew suggested works on > > > the system I found today, but I imagine it is all sorts of wrong. > > > In the 2 systems where I've seen it, there are about 100000 interrupts > > > in around 1.5 seconds, and then the irq code shuts down the interrupt > > > because they aren't being handled. > > > > Is that with your patch? The IRQ should be silenced as soon as > > devm_free_irq(chip->dev.parent, priv->irq, chip); is called. > > > > Depending on if we can get your storm-detection to work or not, > > we might also choose to just never try to use the IRQ (at least on > > x86 systems). AFAIK the TPM is never used for high-throughput stuff > > so the polling overhead should not be a big deal (and I'm getting the feeling > > that Windows always polls). > > > > Regards, > > > > Hans > > Yeah, this is what I've been wondering for a while. Why could not we > just strip off IRQ code? Why does it matter? And we DO NOT use interrupts in tpm_crb and nobody has ever complained. /Jarkko