On 2011-02-01 15:37, Jan Kiszka wrote:
> On 2011-02-01 15:21, Jan Kiszka wrote:
>> On 2011-02-01 15:10, Marcelo Tosatti wrote:
>>> On Tue, Feb 01, 2011 at 02:58:02PM +0100, Jan Kiszka wrote:
>>>> On 2011-02-01 14:48, Marcelo Tosatti wrote:
>>>>> On Tue, Feb 01, 2011 at 02:32:38PM +0100, Jan Kiszka wrote:
>>>>>> On 2011-02-01 13:47, Marcelo Tosatti wrote:
>>>>>>> On Thu, Jan 27, 2011 at 02:09:58PM +0100, Jan Kiszka wrote:
>>>>>>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between
>>>>>>>> checking for exit_request on vcpu entry and timer signals arriving
>>>>>>>> before KVM starts to catch them. Plug it by blocking both timer related
>>>>>>>> signals also on !CONFIG_IOTHREAD and process those via signalfd.
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx>
>>>>>>>> CC: Stefan Hajnoczi <stefanha@xxxxxxxxxxxxxxxxxx>
>>>>>>>> ---
>>>>>>>>  cpus.c |   18 ++++++++++++++++++
>>>>>>>>  1 files changed, 18 insertions(+), 0 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/cpus.c b/cpus.c
>>>>>>>> index fc3f222..29b1070 100644
>>>>>>>> --- a/cpus.c
>>>>>>>> +++ b/cpus.c
>>>>>>>> @@ -254,6 +254,10 @@ static void qemu_kvm_init_cpu_signals(CPUState *env)
>>>>>>>>      pthread_sigmask(SIG_BLOCK, NULL, &set);
>>>>>>>>      sigdelset(&set, SIG_IPI);
>>>>>>>>      sigdelset(&set, SIGBUS);
>>>>>>>> +#ifndef CONFIG_IOTHREAD
>>>>>>>> +    sigdelset(&set, SIGIO);
>>>>>>>> +    sigdelset(&set, SIGALRM);
>>>>>>>> +#endif
>>>>>>>
>>>>>>> I'd prefer separate qemu_kvm_init_cpu_signals in the !IOTHREAD
>>>>>>> section.
>>>>>>
>>>>>> You mean to duplicate qemu_kvm_init_cpu_signals for both configurations?
>>>>>
>>>>> Yes, so to avoid #ifdefs spread.
>>>>
>>>> Would exchange some #ifdefs against ifndef _WIN32. Haven't measured the
>>>> delta though.
>>>>
>>>>>
>>>>>>>> +
>>>>>>>> +#ifndef CONFIG_IOTHREAD
>>>>>>>> +    if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) {
>>>>>>>> +        qemu_notify_event();
>>>>>>>> +    }
>>>>>>>> +#endif
>>>>>>>
>>>>>>> Why is this necessary?
>>>>>>>
>>>>>>> You should break out of cpu_exec_all if there's a pending alarm (see
>>>>>>> qemu_alarm_pending()).
>>>>>>
>>>>>> qemu_alarm_pending() is not true until the signal is actually taken. The
>>>>>> alarm handler sets the required flags.
>>>>>
>>>>> Right. What i mean is you need to execute the signal handler inside
>>>>> cpu_exec_all loop (so that alarm pending is set).
>>>>>
>>>>> So, if there is a SIGALRM pending, qemu_run_timers has highest
>>>>> priority, not vcpu execution.
>>>>
>>>> We leave the vcpu loop (thanks to notify_event), process the signal in
>>>> the event loop and run the timer handler. This pattern is IMO less
>>>> invasive to the existing code, specifically as it is about to die
>>>> long-term anyway.
>>>
>>> You'll probably see poor timer behaviour on smp guests without iothread
>>> enabled.
>>
>> Still checking, but that would mean the notification mechanism is broken
>> anyway: If IO events do not force us to process them quickly, we already
>> suffer from latencies in SMP mode.
>
> Maybe a regression caused by the iothread introduction:

I take it back, the issue is actually much older.

> we need to break
> out of the cpu loop via global exit_request when there is an IO event
> pending. Fixing this will also heal the above issue.
>
> Sigh, we need to get rid of those two implementations and focus all
> reviewing and testing on one. I bet there are still more corner cases
> sleeping somewhere.
>
> Jan

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux