On 2011-02-01 15:10, Marcelo Tosatti wrote: > On Tue, Feb 01, 2011 at 02:58:02PM +0100, Jan Kiszka wrote: >> On 2011-02-01 14:48, Marcelo Tosatti wrote: >>> On Tue, Feb 01, 2011 at 02:32:38PM +0100, Jan Kiszka wrote: >>>> On 2011-02-01 13:47, Marcelo Tosatti wrote: >>>>> On Thu, Jan 27, 2011 at 02:09:58PM +0100, Jan Kiszka wrote: >>>>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between >>>>>> checking for exit_request on vcpu entry and timer signals arriving >>>>>> before KVM starts to catch them. Plug it by blocking both timer related >>>>>> signals also on !CONFIG_IOTHREAD and process those via signalfd. >>>>>> >>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx> >>>>>> CC: Stefan Hajnoczi <stefanha@xxxxxxxxxxxxxxxxxx> >>>>>> --- >>>>>> cpus.c | 18 ++++++++++++++++++ >>>>>> 1 files changed, 18 insertions(+), 0 deletions(-) >>>>>> >>>>>> diff --git a/cpus.c b/cpus.c >>>>>> index fc3f222..29b1070 100644 >>>>>> --- a/cpus.c >>>>>> +++ b/cpus.c >>>>>> @@ -254,6 +254,10 @@ static void qemu_kvm_init_cpu_signals(CPUState *env) >>>>>> pthread_sigmask(SIG_BLOCK, NULL, &set); >>>>>> sigdelset(&set, SIG_IPI); >>>>>> sigdelset(&set, SIGBUS); >>>>>> +#ifndef CONFIG_IOTHREAD >>>>>> + sigdelset(&set, SIGIO); >>>>>> + sigdelset(&set, SIGALRM); >>>>>> +#endif >>>>> >>>>> I'd prefer separate qemu_kvm_init_cpu_signals in the !IOTHREAD >>>>> section. >>>> >>>> You mean to duplicate qemu_kvm_init_cpu_signals for both configurations? >>> >>> Yes, so to avoid #ifdefs spread. >> >> Would exchange some #ifdefs against ifndef _WIN32. Haven't measured the >> delta though. >> >>> >>>>>> + >>>>>> +#ifndef CONFIG_IOTHREAD >>>>>> + if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) { >>>>>> + qemu_notify_event(); >>>>>> + } >>>>>> +#endif >>>>> >>>>> Why is this necessary? >>>>> >>>>> You should break out of cpu_exec_all if there's a pending alarm (see >>>>> qemu_alarm_pending()). >>>> >>>> qemu_alarm_pending() is not true until the signal is actually taken. The >>>> alarm handler sets the required flags. >>> >>> Right. What i mean is you need to execute the signal handler inside >>> cpu_exec_all loop (so that alarm pending is set). >>> >>> So, if there is a SIGALRM pending, qemu_run_timers has highest >>> priority, not vcpu execution. >> >> We leave the vcpu loop (thanks to notify_event), process the signal in >> the event loop and run the timer handler. This pattern is IMO less >> invasive to the existing code, specifically as it is about to die >> long-term anyway. > > You'll probably see poor timer behaviour on smp guests without iothread > enabled. > Still checking, but that would mean the notification mechanism is broken anyway: If IO events do not force us to process them quickly, we already suffer from latencies in SMP mode. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html