On 7/28/2023 4:38 PM, Linus Torvalds wrote:
On Fri, 28 Jul 2023 at 14:01, Limonciello, Mario
<mario.limonciello@xxxxxxx> wrote:
That's exactly why I was asking in the kernel bugzilla if something
similar gets tripped up by RDRAND.
So that would sound very unlikely, but who knows... Microcode can
obviously do pretty much anything at all, but at least the original
fTPM issues _seemed_ to be about BIOS doing truly crazy things like
SPI flash accesses.
I can easily imagine a BIOS fTPM code using some absolutely horrid
global "EFI synchronization" lock or whatever, which could then cause
random problems just based on some entirely unrelated activity.
I would not be surprised, for example, if it wasn't the fTPM hwrnd code
itself that decided to read some random number from SPI, but that it
simply got serialized with something else that the BIOS was involved
with. It's not like BIOS people are famous for their scalable code
that is entirely parallel...
And I'd be _very_ surprised if CPU microcode does anything even
remotely like that. Not impossible - HP famously screwed with the time
stamp counter with SMIs, and I could imagine them - or others - doing
the same with rdrand.
But it does sound pretty damn unlikely, compared to "EFI BIOS uses a
one big lock approach".
So rdrand (and rdseed in particular) can be rather slow, but I think
we're talking hundreds of CPU cycles (maybe low thousands). Nothing
like the stuttering reports we've seen from fTPM.
Linus
Your theory sounds totally plausible, and it would explain why this
system is still tripping similar behavior even though it has the fixes
for the original issue.
Given that RDRAND is on the same SoC, I think there's a pretty good
argument to stop contributing *anything* that's not a dTPM to the
hwrng entropy.