Re: [PATCH v4 05/12] x86/mm: add INVLPGB support code

Tom Lendacky <thomas.lendacky@xxxxxxx> · Tue, 14 Jan 2025 10:30:40 -0600

On 1/14/25 09:47, Rik van Riel wrote:
> On Tue, 2025-01-14 at 09:23 -0600, Tom Lendacky wrote:
>> On 1/14/25 09:05, Dave Hansen wrote:
>>> On 1/14/25 06:29, Tom Lendacky wrote:
>>>>> Given the choice between "a bug in the calling code
>>>>> crashes the kernel" and "a bug in the calling code
>>>>> results in a missed TLB flush", I'm guessing the
>>>>> crash is probably better.
>>>> So instead of the negative number protection, shouldn't this just
>>>> use an
>>>> unsigned int for extra_count and panic() if the value is greater
>>>> than
>>>> invlpgb_count_max? The caller has some sort of logic problem and
>>>> it
>>>> could possibly result in missed TLB flushes. Or if a panic() is
>>>> out of
>>>> the question, maybe a WARN() and a full TLB flush to be safe?
>>>
>>> The current implementation will panic in the #GP handler though. It
>>> should be pretty easy to figure out that INVLPGB is involved with
>>> RIP or
>>> the Code: snippet. From there, you'd need to figure out what caused
>>> the #GP.
>>
>> Hmmm, maybe I'm missing something. IIUC, when a negative number is
>> supplied, the extra_count field will be set to 0 (via the max()
>> function) and allow the INVLPGB to continue. 0 is valid in ECX[15:0]
>> and
>> so the instruction won't #GP.
> 
> I added that at the request of somebody else :)
> 
> Let me remove it again, now that we seem to have a
> consensus that a panic is preferable to a wrong
> TLB flush.

I believe the instruction will #GP if any of the ECX[30:16] reserved
bits are non-zero (although the APM doesn't document that), in addition
to ECX[15:0] being greater than allowed. But what if 0x80000000 is
passed in. That would set ECX[31] with a zero count field, which is
valid for the instruction, but the input is obviously bogus.

I think the safest thing to do is make the extra_count parameter an
unsigned int and check if it is greater than invlpgb_count_max. Not sure
what to actually do at that point, though... panic()? WARN() with full
TLB flush?

Thanks,
Tom

>