On 08/06/2012 11:25 PM, Scott Wood wrote:
> On 08/05/2012 04:00 AM, Avi Kivity wrote:
>> On 08/04/2012 01:32 AM, Benjamin Herrenschmidt wrote:
>>> On Fri, 2012-08-03 at 15:05 -0300, Marcelo Tosatti wrote:
>>>
>>>> See kvm_arch_process_async_events() call to qemu_system_reset_request()
>>>> in target-i386/kvm.c.
>>>>
>>>> The whole thing is fragile, though: we rely on the order events
>>>> are processed inside KVM_RUN, in x86:
>>>>
>>>> 1) If there is pending MMIO, process it.
>>>> 2) If not, return with -EINTR (and KVM_EXIT_INTR) in case
>>>> there is a signal pending.
>>>>
>>>> That way, the vcpu will not process the stop event from the main loop
>>>> (ie not exit from the kvm_cpu_exec() loop), until MMIO is finished.
>>>
>>> Right, it is fragile, thankfully we appear to adhere to the same
>>> ordering on powerpc so far :-)
>>>
>>> So we'll need to test but it looks like we might be able to fix our
>>> problem without a kernel or API change, just by changing qemu to
>>> do the same exit_request trick for our reboot hypercall.
>>>
>>> Long run however, I wonder whether we should consider an explicit ioctl
>>> to complete those pending operations instead...
>>
>> It's pointless. We have to support the old method forever.
>
> Not in new architectures (even PPC has yet to start using this) or new
> userspaces -- and forever is a long time. People down the road may very
> well decide that it's time to clean out the deprecated stuff that hasn't
> been used in over a decade. IMHO this shouldn't be a reason to
> not improve the API, as long as compatibility is possible for as long as
> it is deemed worthwhile.

For qemu this is in common code (kvm-all.c); some architectures might
need wiring up, but the code is there.

For the kernel we need to handle signals, and we need to check for
signals in the atomic guest entry path. That leads naturally to the
completion-before-signal order. In fact there's no other way to do it.
If we check for signals before handling completions, there's no way for
userspace to know whether the completion was completely handled, and
whether more completions are necessary (mmio completions are limited to
8 bytes but some x86 instructions generate many more accesses).

>
>> There's no
>> material different between sigqueue() + KVM_RUN and KVM_COMPLETE, or a
>> KVM_RUN with a flag that tells it to exit immediately.
>
> The latter is less fragile and easier to use.

  if (need_exit) queue signal

vs

  run->need_exit = need_exit

--
error compiling committee.c: too many arguments to function
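
For illustration only, here is a minimal sketch of the
completion-before-signal order argued for above, written as a toy model
in plain C. Everything in it is an assumption made up for the example:
struct toy_vcpu, toy_kvm_run() and toy_stop_vcpu() are hypothetical
names, not KVM or qemu symbols, and the real kernel path is considerably
more involved.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy exit codes; hypothetical, not the real KVM_EXIT_* values. */
enum toy_exit { TOY_ENTERED_GUEST = 0, TOY_EXIT_MMIO = 1 };

/* Toy stand-in for a vcpu (hypothetical, not a KVM structure). */
struct toy_vcpu {
    int  pending_mmio;   /* MMIO fragments userspace still has to hand back */
    bool signal_pending; /* a signal was queued before the run call */
    int  guest_entries;  /* times the model actually "entered the guest" */
};

/* One modelled KVM_RUN call: completions first, signals second. */
static int toy_kvm_run(struct toy_vcpu *v)
{
    /* 1) Complete the fragment userspace just handled. */
    if (v->pending_mmio > 0) {
        v->pending_mmio--;
        if (v->pending_mmio > 0)
            return TOY_EXIT_MMIO; /* more fragments: back to userspace */
    }

    /* 2) Only with nothing left to complete do we look at signals. */
    if (v->signal_pending)
        return -EINTR;

    /* 3) Otherwise the guest would run. */
    v->guest_entries++;
    return TOY_ENTERED_GUEST;
}

/*
 * Modelled userspace side: queue the "signal", then keep calling run
 * until it reports -EINTR. Returns how many MMIO exits were seen first.
 */
static int toy_stop_vcpu(struct toy_vcpu *v)
{
    int mmio_exits = 0;
    int ret;

    v->signal_pending = true;
    while ((ret = toy_kvm_run(v)) == TOY_EXIT_MMIO)
        mmio_exits++;

    /* With this ordering, -EINTR implies every completion was drained. */
    assert(ret == -EINTR);
    return mmio_exits;
}
```

In this model, seeing -EINTR tells the caller that no completion is left
outstanding and that the guest was not entered in between. If step 2 ran
before step 1, -EINTR would carry no such guarantee.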