Re: [PATCH 0/2] sync_regs() TOCTOU issues

Michal Luczaj <mhal@xxxxxxx> · Tue, 15 Aug 2023 20:38:46 +0200

On 8/15/23 20:15, Sean Christopherson wrote:
> On Tue, Aug 15, 2023, Michal Luczaj wrote:
>> On 8/15/23 17:40, Sean Christopherson wrote:
>>> On Tue, Aug 15, 2023, Michal Luczaj wrote:
>>>>> @@ -115,6 +116,7 @@ static void *race_events_exc(void *arg)
>>>>>  	for (;;) {
>>>>>  		WRITE_ONCE(run->kvm_dirty_regs, KVM_SYNC_X86_EVENTS);
>>>>>  		WRITE_ONCE(events->flags, 0);
>>>>> +		WRITE_ONCE(events->exception.nr, GP_VECTOR);
>>>>>  		WRITE_ONCE(events->exception.pending, 1);
>>>>>  		WRITE_ONCE(events->exception.nr, 255);
>>>>
>>>> Here you're setting events->exception.nr twice. Is it deliberate?
>>>
>>> Heh, yes and no.  It's partly leftover from a brief attempt to gracefully eat the
>>> fault in the guest.
>>>
>>> However, unless there's magic I'm missing, race_events_exc() needs to set a "good"
>>> vector in every iteration, otherwise only the first iteration will be able to hit
>>> the "check good, consume bad" scenario.
>>
>> I think I understand what you mean. I see things slightly different: because
>>
>> 	if (events->flags & KVM_VCPUEVENT_VALID_PAYLOAD) {
>> 		...
>> 	} else {
>> 		events->exception.pending = 0;
>> 		events->exception_has_payload = 0;
>> 	}
>>
>> zeroes exception.pending on every iteration, even though exception.nr may
>> already be > 31, KVM does not necessary return -EINVAL at
>>
>> 	if ((events->exception.injected || events->exception.pending) &&
>> 	    (events->exception.nr > 31 || events->exception.nr == NMI_VECTOR))
>> 		return -EINVAL;
>>
>> It would if the racer set exception.pending before this check, but if it does it
>> after the check, then KVM goes
>>
>> 	vcpu->arch.exception.pending = events->exception.pending;
>> 	vcpu->arch.exception.vector = events->exception.nr;
>>
>> which later triggers the WARN. That said, if I you think setting and re-setting
>> exception.nr is more efficient (as in: racy), I'm all for it.
> 
> My goal isn't to make it easier to hit the *known* TOCTOU, it's to make the test
> more valuable after that known bug has been fixed.

Aha! Yup, turns out I did not understand what you meant after all :) Sorry.

> I.e. I don't want to rely on
> KVM to update kvm_run (which was arguably a bug even if there weren't a TOCTOU
> issue).  It's kinda silly, because realistically this test is likely only ever
> going to find TOCTOU bugs, but so long as the test can consistently the known bug,
> my preference is to make it as "generic" as possible from a coverage perspective.

Sure, that makes sense.