Re: [PATCH] kvm tools: Add MMIO coalescing support

On 04.06.2011, at 12:47, Ingo Molnar wrote:

> 
> * Alexander Graf <agraf@xxxxxxx> wrote:
> 
>> 
>> On 04.06.2011, at 12:35, Ingo Molnar wrote:
>> 
>>> 
>>> * Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
>>> 
>>>> On Sat, 2011-06-04 at 12:17 +0200, Ingo Molnar wrote:
>>>>> * Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
>>>>> 
>>>>>> On Sat, 2011-06-04 at 11:38 +0200, Ingo Molnar wrote:
>>>>>>> * Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
>>>>>>> 
>>>>>>>> Coalescing MMIO allows us to avoid an exit every time we have an
>>>>>>>> MMIO write; instead, MMIO writes are coalesced into a ring which
>>>>>>>> can be flushed once an exit is needed for a different reason.
>>>>>>>> An MMIO exit is also triggered once the ring is full.
>>>>>>>> 
>>>>>>>> Coalesce all MMIO regions registered in the MMIO mapper.
>>>>>>>> Add a coalescing handler under kvm_cpu.
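(For reference, a region is opted into coalescing with the KVM_REGISTER_COALESCED_MMIO ioctl on the VM fd; the sketch below assumes only the standard <linux/kvm.h> definitions and is not the actual patch.)

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Sketch: ask KVM to buffer guest writes to [addr, addr + size) in the ring. */
static int coalesce_mmio_region(int vm_fd, __u64 addr, __u32 size)
{
    struct kvm_coalesced_mmio_zone zone = {
        .addr = addr,    /* guest-physical start of the MMIO window */
        .size = size,    /* length of the window in bytes */
    };

    return ioctl(vm_fd, KVM_REGISTER_COALESCED_MMIO, &zone);
}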
>>>>>>> 
>>>>>>> Does this have any effect on latency? I.e. does the guest side 
>>>>>>> guarantee that the pending queue will be flushed after a group of 
>>>>>>> updates have been done?
>>>>>> 
>>>>>> There's nothing that detects groups of MMIO writes, but the ring size
>>>>>> is a bit less than PAGE_SIZE (half of it is overhead, the rest is data)
>>>>>> and we'll exit once the ring is full.
>>>>> 
>>>>> But if the page is only partially filled and the guest submits no
>>>>> further MMIO for an indefinite time (say it runs a lot of user-space
>>>>> code), then the MMIO remains pending in the partial-page buffer?
>>>> 
>>>> We flush the ring on any exit from the guest, not just an MMIO exit.
>>>> But yes, from what I understand of the code: if the buffer is only
>>>> partially full and we don't take an exit, its contents don't get
>>>> flushed back to the host.
>>>> 
>>>> ioeventfds and such are making exits less common, so yes - it's possible
>>>> we won't have an exit in a while.
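(For context, a minimal sketch of what that flush looks like on the userspace side, written against the ring layout in <linux/kvm.h>; emulate_mmio_write() is a hypothetical callback, not the kvm tools handler:)

#include <linux/kvm.h>
#include <unistd.h>

/*
 * Drain the coalesced-MMIO ring that KVM exposes in the vcpu mmap (at the
 * page offset returned by KVM_CHECK_EXTENSION(KVM_CAP_COALESCED_MMIO)).
 * The kernel advances 'last' as the guest writes; userspace advances 'first'.
 */
static void flush_coalesced_mmio(struct kvm_coalesced_mmio_ring *ring,
                                 void (*emulate_mmio_write)(__u64 addr,
                                                            __u8 *data,
                                                            __u32 len))
{
    const __u32 max = (sysconf(_SC_PAGESIZE) - sizeof(*ring)) /
                      sizeof(struct kvm_coalesced_mmio);

    while (ring->first != ring->last) {
        struct kvm_coalesced_mmio *ent = &ring->coalesced_mmio[ring->first];

        /* Replay the buffered write into device emulation. */
        emulate_mmio_write(ent->phys_addr, ent->data, ent->len);

        __sync_synchronize();    /* finish the replay before freeing the slot */
        ring->first = (ring->first + 1) % max;
    }
}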
>>>> 
>>>>> If that's how it works then i *really* don't like this: it looks
>>>>> like a seriously mis-designed batching feature which might have
>>>>> improved a few server benchmarks but which will introduce random,
>>>>> hard-to-debug delays all around the place!
>>> 
>>> The proper way to implement batching is not to do it blindly like 
>>> here, but to do what we do in the TLB coalescing/gather code in the 
>>> kernel:
>>> 
>>> 	gather();
>>> 
>>> 	... submit individual TLB flushes ...
>>> 
>>> 	flush();
>>> 
>>> That's how it should be done here too: each virtio driver that issues 
>> 
>> The world doesn't consist of virtio drivers. It also doesn't 
>> consist of only OSs and drivers that we control 100%.
> 
> So? I only inquired about latencies, asking what the impact on latency
> is. Regardless of the circumstances we do not want to introduce
> unbound latencies.
> 
> If there are no unbound latencies then i'm happy.

Sure, I'm just saying that the mechanism was invented for unmodified guests :).

> 
>>> a group of MMIOs should first start batching, then issue the 
>>> individual MMIOs and then flush them.
>>> 
>>> That can be simplified to leave out the gather() phase, i.e. just 
>>> issue batched MMIOs and flush them before exiting the virtio 
>>> (guest side) driver routines.
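(Sketched concretely, the guest-side pattern described here might look like the following; mmio_batch_begin()/mmio_batch_flush() and the register names are purely hypothetical, no such API exists today:)

/* Hypothetical guest-driver batching, illustrating the gather/flush idea. */
static void example_program_device(void __iomem *regs)
{
    mmio_batch_begin();                      /* gather(): start buffering writes */

    writel(RX_RING_BASE, regs + REG_RX_BASE);
    writel(RX_RING_SIZE, regs + REG_RX_SIZE);
    writel(IRQ_RX | IRQ_TX, regs + REG_IRQ_MASK);

    mmio_batch_flush();                      /* flush(): push the batch out now */
}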
>> 
>> This acceleration is done to speed up the host kernel<->userspace 
>> side.
> 
> Yes.
> 
>> [...] It's completely independent from the guest. [...]
> 
> Well, since user-space gets the MMIOs only once the guest exits it's 
> not independent, is it?

If we don't know when a guest ends an MMIO stream, we can't optimize it. Period. If we blindly coalesce random MMIO requests without caring when they finish, the following would simply break:

enable_interrupts();
writel(KICK_ME_NOW, doorbell);  /* writel(value, addr): ring the doorbell */
while (1)
    ;                           /* spin until the interrupt breaks us out */

void interrupt_handler(void)
{
    break_out_of_loop();
}

And since we don't control the guest, we can't guarantee that this won't happen. In fact, I'd expect this to be a pretty normal boot loader pattern.

> 
>> [...] If you want to have the guest communicate fast, create an 
>> asynchronous ring and process that. And that's what virtio already 
>> does today.
>> 
>>> KVM_CAP_COALESCED_MMIO is a shortcut hack in its current form and
>>> it looks completely unsafe.
>> 
>> I haven't tracked the history of it, but I always assumed it was 
>> used for repz mov instructions where we already know the size of 
>> mmio transactions.
> 
> That's why i asked what the effect on latencies is. If there's no 
> negative effect then i'm a happy camper.

Depends on the trade-off, really. You don't care about the latency of a write that disables an IRQ_ENABLED register, for example. You do, however, care about the one that enables it :).

Since I haven't implemented coalesced mmio on PPC (yet - not sure it's possible or makes sense), I can't really comment too much on it, so I'll leave this to the guys who worked on it.


Alex


