On 2011-02-01 15:37, Jan Kiszka wrote:
> On 2011-02-01 15:21, Jan Kiszka wrote:
>> On 2011-02-01 15:10, Marcelo Tosatti wrote:
>>> On Tue, Feb 01, 2011 at 02:58:02PM +0100, Jan Kiszka wrote:
>>>> On 2011-02-01 14:48, Marcelo Tosatti wrote:
>>>>> On Tue, Feb 01, 2011 at 02:32:38PM +0100, Jan Kiszka wrote:
>>>>>> On 2011-02-01 13:47, Marcelo Tosatti wrote:
>>>>>>> On Thu, Jan 27, 2011 at 02:09:58PM +0100, Jan Kiszka wrote:
>>>>>>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between
>>>>>>>> checking for exit_request on vcpu entry and timer signals arriving
>>>>>>>> before KVM starts to catch them. Plug it by blocking both timer related
>>>>>>>> signals also on !CONFIG_IOTHREAD and process those via signalfd.
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
>>>>>>>> CC: Stefan Hajnoczi <stefanha@xxxxxxxxxxxxxxxxxx>
>>>>>>>> ---
>>>>>>>>  cpus.c |   18 ++++++++++++++++++
>>>>>>>>  1 files changed, 18 insertions(+), 0 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/cpus.c b/cpus.c
>>>>>>>> index fc3f222..29b1070 100644
>>>>>>>> --- a/cpus.c
>>>>>>>> +++ b/cpus.c
>>>>>>>> @@ -254,6 +254,10 @@ static void qemu_kvm_init_cpu_signals(CPUState *env)
>>>>>>>>      pthread_sigmask(SIG_BLOCK, NULL, &set);
>>>>>>>>      sigdelset(&set, SIG_IPI);
>>>>>>>>      sigdelset(&set, SIGBUS);
>>>>>>>> +#ifndef CONFIG_IOTHREAD
>>>>>>>> +    sigdelset(&set, SIGIO);
>>>>>>>> +    sigdelset(&set, SIGALRM);
>>>>>>>> +#endif
>>>>>>>
>>>>>>> I'd prefer separate qemu_kvm_init_cpu_signals in the !IOTHREAD
>>>>>>> section.
>>>>>>
>>>>>> You mean to duplicate qemu_kvm_init_cpu_signals for both configurations?
>>>>>
>>>>> Yes, so to avoid #ifdefs spread.
>>>>
>>>> Would exchange some #ifdefs against ifndef _WIN32. Haven't measured the
>>>> delta though.
>>>>
>>>>>
>>>>>>>> +
>>>>>>>> +#ifndef CONFIG_IOTHREAD
>>>>>>>> +    if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) {
>>>>>>>> +        qemu_notify_event();
>>>>>>>> +    }
>>>>>>>> +#endif
>>>>>>>
>>>>>>> Why is this necessary?
>>>>>>>
>>>>>>> You should break out of cpu_exec_all if there's a pending alarm (see
>>>>>>> qemu_alarm_pending()).
>>>>>>
>>>>>> qemu_alarm_pending() is not true until the signal is actually taken. The
>>>>>> alarm handler sets the required flags.
>>>>>
>>>>> Right. What i mean is you need to execute the signal handler inside
>>>>> cpu_exec_all loop (so that alarm pending is set).
>>>>>
>>>>> So, if there is a SIGALRM pending, qemu_run_timers has highest
>>>>> priority, not vcpu execution.
>>>>
>>>> We leave the vcpu loop (thanks to notify_event), process the signal in
>>>> the event loop and run the timer handler. This pattern is IMO less
>>>> invasive to the existing code, specifically as it is about to die
>>>> long-term anyway.
>>>
>>> You'll probably see poor timer behaviour on smp guests without iothread
>>> enabled.
>>
>> Still checking, but that would mean the notification mechanism is broken
>> anyway: If IO events do not force us to process them quickly, we already
>> suffer from latencies in SMP mode.
>
> Maybe a regression caused by the iothread introduction:

I take it back, the issue is actually much older.

> we need to break
> out of the cpu loop via global exit_request when there is an IO event
> pending. Fixing this will also heal the above issue.
>
> Sigh, we need to get rid of those two implementations and focus all
> reviewing and testing on one. I bet there are still more corner cases
> sleeping somewhere.
>
> Jan

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux