Re: Reset problem vs. MMIO emulation, hypercalls, etc...

Avi Kivity <avi@xxxxxxxxxx> · Tue, 07 Aug 2012 11:44:07 +0300

On 08/06/2012 11:25 PM, Scott Wood wrote:
> On 08/05/2012 04:00 AM, Avi Kivity wrote:
>> On 08/04/2012 01:32 AM, Benjamin Herrenschmidt wrote:
>>> On Fri, 2012-08-03 at 15:05 -0300, Marcelo Tosatti wrote:
>>>
>>>> See kvm_arch_process_async_events() call to qemu_system_reset_request()
>>>> in target-i386/kvm.c.
>>>>
>>>> The whole thing is fragile, though: we rely on the order events
>>>> are processed inside KVM_RUN, in x86:
>>>>
>>>> 1) If there is pending MMIO, process it.
>>>> 2) If not, return with -EINTR (and KVM_EXIT_INTR) in case
>>>> there is a signal pending.
>>>>
>>>> That way, the vcpu will not process the stop event from the main loop
>>>> (ie not exit from the kvm_cpu_exec() loop), until MMIO is finished.
>>>
>>> Right, it is fragile, thankfully we appear to adhere to the same
>>> ordering on powerpc so far :-)
>>>
>>> So we'll need to test but it looks like we might be able to fix our
>>> problem without a kernel or API change, just by changing qemu to
>>> do the same exit_request trick for our reboot hypercall.
>>>
>>> Long run however, I wonder whether we should consider an explicit ioctl
>>> to complete those pending operations instead...
>> 
>> It's pointless.  We have to support the old method forever.
> 
> Not in new architectures (even PPC has yet to start using this) or new
> userspaces -- and forever is a long time.  People down the road may very
> well decide that it's time to clean out the deprecated stuff that hasn't
> been used in over a decade.  IMHO this shouldn't be a reason to
> not improve the API, as long as compatibility is possible for as long as
> it is deemed worthwhile.

For qemu this is in common code (kvm-all.c); some architectures might
need wiring up, but the code is there.

For the kernel we need to handle signals, and we need to check for
signals in the atomic guest entry path.  That leads naturally to the
completion-before-signal order.

In fact there's no other way to do it.  If we checked for signals before
handling completions, userspace could not tell whether a completion had
been fully handled or whether more completions were still needed (MMIO
completions are limited to 8 bytes, but some x86 instructions generate
many more accesses).
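
To make the ordering argument concrete, here is a minimal toy model of it.
This is only a sketch: toy_vcpu, toy_run() and toy_drain() are hypothetical
names invented for illustration, not the real KVM structures or ioctls.  It
assumes one 20-byte access split into 8-byte fragments, with a signal
already pending when the first fragment is reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of the ordering; not the real KVM code. */
enum toy_exit { TOY_EXIT_NONE, TOY_EXIT_MMIO, TOY_EXIT_INTR };

#define TOY_MMIO_FRAGMENT 8u    /* one completion covers at most 8 bytes */

struct toy_vcpu {
    size_t mmio_bytes_left;     /* bytes of the current access not yet completed */
    bool mmio_in_flight;        /* userspace owes a completion for one fragment */
    bool signal_pending;        /* a signal is queued for the vcpu thread */
    enum toy_exit exit_reason;
};

/* One call stands in for one KVM_RUN. */
static int toy_run(struct toy_vcpu *v)
{
    assert(v != NULL);

    /* 1) Completion first: consume the fragment userspace just handled. */
    if (v->mmio_in_flight) {
        size_t n = v->mmio_bytes_left < TOY_MMIO_FRAGMENT
                       ? v->mmio_bytes_left : TOY_MMIO_FRAGMENT;
        v->mmio_bytes_left -= n;
        v->mmio_in_flight = false;
    }

    /* The access needs more fragments: report the next one to userspace. */
    if (v->mmio_bytes_left > 0) {
        v->mmio_in_flight = true;
        v->exit_reason = TOY_EXIT_MMIO;
        return 0;
    }

    /* 2) Only once nothing is left to complete, look at the signal. */
    if (v->signal_pending) {
        v->signal_pending = false;
        v->exit_reason = TOY_EXIT_INTR;
        return -EINTR;
    }

    v->exit_reason = TOY_EXIT_NONE;     /* the guest would be entered here */
    return 0;
}

/* Userspace side: keep calling until the access is done; count MMIO exits. */
static int toy_drain(struct toy_vcpu *v)
{
    int mmio_exits = 0;

    for (;;) {
        int ret = toy_run(v);

        if (ret == -EINTR || v->exit_reason != TOY_EXIT_MMIO)
            return mmio_exits;
        mmio_exits++;                   /* userspace handles the fragment */
    }
}
```

With signal_pending set and mmio_bytes_left = 20, toy_drain() sees three
MMIO exits before toy_run() finally returns -EINTR, so by the time
userspace observes the signal it knows the access is complete.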

> 
>> There's no
>> material difference between sigqueue() + KVM_RUN and KVM_COMPLETE, or a
>> KVM_RUN with a flag that tells it to exit immediately.
> 
> The latter is less fragile and easier to use.

   if (need_exit)
        queue signal

 vs

   run->need_exit = need_exit
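
Spelled out a little more, the two variants could look like the sketch
below.  Everything here is an assumption made for illustration: toy_run,
its need_exit field and toy_kvm_run() are hypothetical stand-ins, SIGUSR1 is
an arbitrary choice of signal, and the toy delivers the signal immediately
via raise() rather than modelling signal masks or a separate vcpu thread.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared run structure carrying the flag from the comparison. */
struct toy_run {
    bool need_exit;
};

static volatile sig_atomic_t toy_signal_seen;

static void toy_sig_handler(int sig)
{
    (void)sig;
    toy_signal_seen = 1;
}

/* Stand-in for KVM_RUN: return -EINTR if either mechanism asked for an exit. */
static int toy_kvm_run(struct toy_run *run)
{
    assert(run != NULL);

    if (toy_signal_seen) {
        toy_signal_seen = 0;
        return -EINTR;
    }
    if (run->need_exit) {
        run->need_exit = false;
        return -EINTR;
    }
    return 0;                   /* the guest would be entered here */
}

/* Variant 1: "if (need_exit) queue signal" */
static int toy_exit_via_signal(struct toy_run *run, bool need_exit)
{
    signal(SIGUSR1, toy_sig_handler);
    if (need_exit)
        raise(SIGUSR1);         /* handler has run by the time raise() returns */
    return toy_kvm_run(run);
}

/* Variant 2: "run->need_exit = need_exit" */
static int toy_exit_via_flag(struct toy_run *run, bool need_exit)
{
    run->need_exit = need_exit;
    return toy_kvm_run(run);
}
```

In this toy both variants return -EINTR when need_exit is true and 0
otherwise; the caller's side differs only in how the request is handed
over.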

-- 
error compiling committee.c: too many arguments to function