On Tue, Feb 7, 2023 at 11:13 PM Jarkko Sakkinen <jarkko@xxxxxxxxxx> wrote:
>
> On Wed, Feb 08, 2023 at 04:13:16AM +0200, Jarkko Sakkinen wrote:
> > On Thu, Feb 02, 2023 at 07:57:37AM -0500, James Bottomley wrote:
> > > On Thu, 2023-02-02 at 11:28 +0100, Linux kernel regression tracking
> > > (Thorsten Leemhuis) wrote:
> > > [...]
> > > > So it's a firmware problem, but apparently one that Linux only
> > > > triggers since 6.1.
> > > >
> > > > Jason, could the hwrng changes have anything to do with this?
> > > >
> > > > A bisection really would be helpful, but I guess that is not easy as
> > > > the problem apparently only shows up after some time...
> > >
> > > the problem description says the fTPM causes system stutter when it
> > > writes to NVRAM. Since an fTPM is a proprietary implementation, we
> > > don't know what it does. The ms TPM implementation definitely doesn't
> > > trigger NV writes on rng requests, but it is plausible this fTPM does
> > > ... particularly if they have a time based input to the DRNG. Even if
> > > this speculation is true, there's not much we can do about it, since
> > > it's a firmware bug and AMD should have delivered the BIOS update that
> > > fixes it.
> > >
> > > The way to test this would be to set the config option
> > >
> > > CONFIG_HW_RANDOM_TPM=n
> > >
> > > and see if the stutter goes away. I suppose if someone could quantify
> > > the bad bioses, we could warn, but that's about it.
> > >
> > > James
> > >
> >
> > And e.g. I do not have a Ryzen CPU so pretty hard to answer such question.
>
> ... about hwrng

Well, the options here are basically:

a) Do nothing, and just expect people to update their BIOSes, since an
   update is available.

b) Do nothing, and expect people with broken BIOSes to `echo blacklist
   tpm >> /etc/modprobesomethingsomething`.

c) Figure out how to identify the buggy BIOS and disable the TPM's rng
   with a quirk in this case.

d) Figure out how to dynamically detect TPM rng latency, and warn about it.

e) Figure out how to dynamically detect TPM rng latency, and disable it.

I think given that a firmware update *is* available, (a) is fine. And
the generic workaround remains (b). But if you want to be really nice,
(c) would be fine too. Somebody with the affected hardware would
probably have to send in some DMI logs or whatever else.

(d) and (e) sound possible in theory but I dunno really... seems finicky.

Jason
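
As a rough illustration of what (d) might look like when prototyped from
userspace (this is not something proposed in the thread), the sketch below
times reads from a random source and flags it if the slowest read exceeds
a threshold. The function names, the sample count and the 50 ms threshold
are all hypothetical. /dev/hwrng is the kernel's generic hwrng device and
only reflects the TPM if the TPM is the currently selected rng, so that is
an assumption as well.

```python
import time


def sample_latencies(read_fn, samples=20, nbytes=32, clock=time.monotonic):
    """Time `samples` calls of read_fn(nbytes); return latencies in seconds."""
    latencies = []
    for _ in range(samples):
        start = clock()
        read_fn(nbytes)
        latencies.append(clock() - start)
    return latencies


def looks_slow(latencies, threshold_s=0.05):
    """Flag the source if the worst read took longer than threshold_s.

    The worst case is used instead of the mean because the reported stutter
    is intermittent. The 50 ms default is an arbitrary illustrative value.
    """
    return max(latencies) > threshold_s


def read_hwrng(nbytes, path="/dev/hwrng"):
    """Hypothetical reader: pull nbytes from the kernel hwrng device.

    Assumes the device exists, is readable (usually root only), and that
    the TPM is the rng currently backing it.
    """
    with open(path, "rb", buffering=0) as f:
        return f.read(nbytes)
```

On an affected machine one would call
`looks_slow(sample_latencies(read_hwrng))` and print a warning if it
returns True. Picking a threshold that does not misfire on healthy
hardware is presumably where the "finicky" part comes in.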