On Sun Dec 29 19, Dan Williams wrote:
On Sat, Dec 28, 2019 at 9:17 AM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
On Sat, Dec 28, 2019 at 7:15 AM Jarkko Sakkinen
<jarkko.sakkinen@xxxxxxxxxxxxxxx> wrote:
>
> On Fri, Dec 27, 2019 at 08:11:50AM +0200, Jarkko Sakkinen wrote:
> > Dan, please also test the branch and tell if other patches are needed.
> > I'm a bit blind with this as I don't have direct access to the faulting
> > hardware. Thanks. [*]
> >
> > [*] https://lkml.org/lkml/2019/12/27/12
>
> Given that:
>
> 1. I cannot reproduce the bug locally.
> 2. Neither of the patches has any appropriate tags (tested-by and
> reviewed-by). [*]
>
> I'm sorry but how am I expected to include these patches?
Thanks for the branch, I'll get it tested on the failing hardware.
Might be a few days due to holiday lag.

This looked like the wrong revert to me, and testing confirms that
this does not fix the problem.

As I mentioned in the original report [1] the commit that bisect flagged was:

    5b359c7c4372 tpm_tis_core: Turn on the TPM before probing IRQ's

That commit moved tpm_chip_start() before irq probing. Commit
21df4a8b6018 "tpm_tis: reserve chip for duration of tpm_tis_core_init"
does not appear to change anything in that regard.

Perhaps this hardware has always had broken interrupts and needs to be
quirked off? I'm trying an experiment with tpm_tis_core.interrupts=0
workaround.

Hi Dan,
Just to make sure I understand correctly: are you saying you still have
the screaming interrupt with the flag commit reverted, or that it is
polling instead of using interrupts [2]? Was that testing with both
commits reverted, or just the flag commit? What kernel were you
running before you saw the issue with 5.3 stable? On that kernel you
weren't seeing the polling message, and interrupts were working? Are
you able to boot a 5.0 kernel on the system? It would be interesting
to see how it was behaving before the power gating changes.

I think it would be using polling, because of how the code behaves around
that flag. It looks like without the flag being enabled by Stefan's commit,
TPM_GLOBAL_INT_ENABLE will never get cleared, because tpm_tis_probe_irq_single
expects tpm_tis_send to clear it if there is a problem, and without the
flag being set that whole section of code is skipped.
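To make the flag dependency described above concrete, here is a small
user-space model of it. This is only a sketch and not the driver code: every
model_* name and MODEL_FLAG_IRQ are hypothetical stand-ins, and the logic is
reduced to the one point being made, namely that the send path only verifies
(and, on failure, disables) interrupts when the IRQ flag is already set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TPM_CHIP_FLAG_IRQ; not the kernel's value. */
#define MODEL_FLAG_IRQ 0x1u

struct model_chip {
	unsigned int flags;
	bool global_int_enable; /* models the TPM_GLOBAL_INT_ENABLE bit */
	bool irq_tested;
	bool irq_fires;         /* does the hardware deliver the test interrupt? */
};

/* Models the cleanup path: turn interrupts off and fall back to polling. */
static void model_disable_interrupts(struct model_chip *chip)
{
	assert(chip != NULL);
	chip->global_int_enable = false;
	chip->flags &= ~MODEL_FLAG_IRQ;
}

/* Models the send path: the IRQ check only runs when the flag is set. */
static void model_send(struct model_chip *chip)
{
	assert(chip != NULL);
	if (!(chip->flags & MODEL_FLAG_IRQ) || chip->irq_tested)
		return; /* the whole verification section is skipped */

	if (chip->irq_fires)
		chip->irq_tested = true; /* an IRQ handler would record this */
	if (!chip->irq_tested)
		model_disable_interrupts(chip);
	chip->irq_tested = true;
}

/*
 * Models the probe: enable interrupts, then rely on the send path to undo
 * that if no interrupt arrives. Returns whether the enable bit is still set.
 */
static bool model_probe_irq(struct model_chip *chip, bool flag_set_before_probe)
{
	assert(chip != NULL);
	chip->global_int_enable = true;
	chip->irq_tested = false;
	if (flag_set_before_probe)
		chip->flags |= MODEL_FLAG_IRQ;
	model_send(chip);
	return chip->global_int_enable;
}
```

In this model, a chip whose interrupt never fires ends up with the enable bit
cleared only when the flag is set before probing; with the flag unset, the
bit stays set even though nothing ever verified the interrupt.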
Unfortunately I'm having no luck tracking down a system where I can actually
test and debug this interrupt code.

Reverting the following commits should get you to a point where it is using
polling at least. This will bring back Christian's problem with tpm_get_timeouts
failing, but that can be solved by wrapping the call in tpm_chip_start/tpm_chip_stop.

Christian, were you having any issues with interrupts? Your system was going into
this code as well.

21df4a8b6018 | 2019-12-17 | tpm_tis: reserve chip for duration of tpm_tis_core_init (Jerry Snitselaar)
1ea32c83c699 | 2019-09-02 | tpm_tis_core: Set TPM_CHIP_FLAG_IRQ before probing for interrupts (Stefan Berger)
5b359c7c4372 | 2019-09-02 | tpm_tis_core: Turn on the TPM before probing IRQ's (Stefan Berger)
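For the tpm_get_timeouts wrapping mentioned above, the shape I have in mind is
roughly the following. This is a user-space sketch under stated assumptions,
not a patch: the sketch_* functions are hypothetical stubs standing in for
tpm_chip_start, tpm_chip_stop and tpm_get_timeouts, and the stub simply
assumes that the timeout query fails whenever the chip has not been started.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the chip state; only tracks whether it is started. */
struct sketch_chip {
	bool started;
};

/* Stand-in for tpm_chip_start(): returns 0 on success. */
static int sketch_chip_start(struct sketch_chip *chip)
{
	assert(chip != NULL);
	chip->started = true;
	return 0;
}

/* Stand-in for tpm_chip_stop(). */
static void sketch_chip_stop(struct sketch_chip *chip)
{
	assert(chip != NULL);
	chip->started = false;
}

/* Stand-in for tpm_get_timeouts(): assumed to fail on a stopped chip. */
static int sketch_get_timeouts(struct sketch_chip *chip)
{
	assert(chip != NULL);
	return chip->started ? 0 : -1;
}

/* Bracket the timeout query with start/stop so the chip is powered for it. */
static int sketch_get_timeouts_wrapped(struct sketch_chip *chip)
{
	int rc;

	rc = sketch_chip_start(chip);
	if (rc)
		return rc;

	rc = sketch_get_timeouts(chip);
	sketch_chip_stop(chip); /* always stop once start has succeeded */
	return rc;
}
```

The stop call runs whether or not the query succeeds, so the chip is left in
the same state it was found in.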
Jarkko, another problem has been reported that appears to have shown up around the time of the gating changes:

[ 4.098104] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
[ 5.138572] ima: Error Communicating to TPM chip
[ 5.143727] ima: Error Communicating to TPM chip
[ 5.148881] ima: Error Communicating to TPM chip
[ 5.154037] ima: Error Communicating to TPM chip
[ 5.159209] ima: Error Communicating to TPM chip
[ 5.164370] ima: Error Communicating to TPM chip
[ 5.169517] ima: Error Communicating to TPM chip
[ 5.174673] ima: Error Communicating to TPM chip

I've located a system where it occurs, so I'll try to bisect and figure out what is going wrong.

Regards,
Jerry

[1]: https://lore.kernel.org/linux-integrity/CAA9_cmeLnHK4y+usQaWo72nUG3RNsripuZnS-koY4XTRC+mwJA@xxxxxxxxxxxxxx/
[2]: https://lore.kernel.org/linux-integrity/CAPcyv4iepQup4bwMuWzq6r5gdx83hgYckUWFF7yF=rszjz3dtQ@xxxxxxxxxxxxxx/