On 04.06.2011, at 12:47, Ingo Molnar wrote:

>
> * Alexander Graf <agraf@xxxxxxx> wrote:
>
>>
>> On 04.06.2011, at 12:35, Ingo Molnar wrote:
>>
>>>
>>> * Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
>>>
>>>> On Sat, 2011-06-04 at 12:17 +0200, Ingo Molnar wrote:
>>>>> * Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
>>>>>
>>>>>> On Sat, 2011-06-04 at 11:38 +0200, Ingo Molnar wrote:
>>>>>>> * Sasha Levin <levinsasha928@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>>> Coalescing MMIO allows us to avoid an exit every time we have an
>>>>>>>> MMIO write; instead, MMIO writes are coalesced into a ring which
>>>>>>>> can be flushed once an exit for a different reason is needed.
>>>>>>>> An MMIO exit is also triggered once the ring is full.
>>>>>>>>
>>>>>>>> Coalesce all MMIO regions registered in the MMIO mapper.
>>>>>>>> Add a coalescing handler under kvm_cpu.
>>>>>>>
>>>>>>> Does this have any effect on latency? I.e. does the guest side
>>>>>>> guarantee that the pending queue will be flushed after a group of
>>>>>>> updates have been done?
>>>>>>
>>>>>> There's nothing that detects groups of MMIO writes, but the ring size is
>>>>>> a bit less than PAGE_SIZE (half of it is overhead - the rest is data) and
>>>>>> we'll exit once the ring is full.
>>>>>
>>>>> But if the page is only filled partially and if mmio is not submitted
>>>>> by the guest indefinitely (say it runs a lot of user-space code) then
>>>>> the mmio remains pending in the partial-page buffer?
>>>>
>>>> We flush the ring on any exit from the guest, not just an MMIO exit.
>>>> But yes, from what I understand from the code - if the buffer is only
>>>> partially full and we don't take an exit, the buffer doesn't get back to
>>>> the host.
>>>>
>>>> ioeventfds and such are making exits less common, so yes - it's possible
>>>> we won't have an exit for a while.
>>>>
>>>>> If that's how it works then i *really* don't like this, this looks
>>>>> like a seriously mis-designed batching feature which might have
>>>>> improved a few server benchmarks but which will introduce random,
>>>>> hard to debug delays all around the place!
>>>
>>> The proper way to implement batching is not to do it blindly like
>>> here, but to do what we do in the TLB coalescing/gather code in the
>>> kernel:
>>>
>>>     gather();
>>>
>>>     ... submit individual TLB flushes ...
>>>
>>>     flush();
>>>
>>> That's how it should be done here too: each virtio driver that issues
>>
>> The world doesn't consist of virtio drivers. It also doesn't
>> consist of only OSs and drivers that we control 100%.
>
> So? I only inquired about latencies, asking what the impact on latencies
> is. Regardless of the circumstances we do not want to introduce
> unbound latencies.
>
> If there are no unbound latencies then i'm happy.

Sure, I'm just saying that the mechanism was invented for unmodified guests :).

>
>>> a group of MMIOs should first start batching, then issue the
>>> individual MMIOs and then flush them.
>>>
>>> That can be simplified to leave out the gather() phase, i.e. just
>>> issue batched MMIOs and flush them before exiting the virtio
>>> (guest side) driver routines.
>>
>> This acceleration is done to speed up the host kernel<->userspace
>> side.
>
> Yes.
>
>> [...] It's completely independent from the guest. [...]
>
> Well, since user-space gets the MMIOs only once the guest exits it's
> not independent, is it?

If we don't know when a guest ends an MMIO stream, we can't optimize it. Period.
If we currently optimize random MMIO requests without caring when they finish, the following would simply break:

    /* guest: kick the device via an MMIO doorbell write, then spin until its interrupt fires */
    enable_interrupts();
    writel(doorbell, KICK_ME_NOW);
    while (1)
        ;

    void interrupt_handler(void)
    {
        break_out_of_loop();
    }

And since we don't control the guest, we can't guarantee this won't happen. In fact, I'd actually expect this to be a pretty normal boot loader pattern.

>
>> [...] If you want to have the guest communicate fast, create an
>> asynchronous ring and process that. And that's what virtio already
>> does today.
>>
>>> KVM_CAP_COALESCED_MMIO is an unsafe shortcut hack in its current
>>> form and it looks completely unsafe.
>>
>> I haven't tracked the history of it, but I always assumed it was
>> used for repz mov instructions where we already know the size of
>> mmio transactions.
>
> That's why i asked what the effect on latencies is. If there's no
> negative effect then i'm a happy camper.

It depends on the trade-off, really. You don't care about the latency of disabling an IRQ_ENABLED register, for example. You do, however, care about enabling it :).

Since I haven't implemented coalesced MMIO on PPC (yet - I'm not sure it's possible or makes sense), I can't really comment too much on it, so I'll leave this to the guys who worked on it.


Alex
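
For readers who want to see concretely what the flush-on-exit path discussed above looks like, here is a minimal userspace sketch built on the kvm_coalesced_mmio_zone and kvm_coalesced_mmio_ring layouts exported by <linux/kvm.h>. It is not taken from the kvm tool patch under discussion: handle_mmio_write() is a placeholder for whatever device emulation the VMM uses, and the surrounding run-loop wiring is assumed.

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/user.h>       /* PAGE_SIZE (glibc/x86), used by KVM_COALESCED_MMIO_MAX */
    #include <linux/kvm.h>

    /* Placeholder: dispatch one write to whatever device is mapped at 'gpa'. */
    extern void handle_mmio_write(uint64_t gpa, uint8_t *data, uint32_t len);

    /* Ask KVM to coalesce writes to [gpa, gpa + size) instead of exiting. */
    static int register_coalesced_zone(int vm_fd, uint64_t gpa, uint32_t size)
    {
            struct kvm_coalesced_mmio_zone zone = {
                    .addr = gpa,
                    .size = size,
            };

            return ioctl(vm_fd, KVM_REGISTER_COALESCED_MMIO, &zone);
    }

    /*
     * Replay everything KVM batched since the last exit.  The ring shares
     * the vcpu mmap area with struct kvm_run; KVM_CHECK_EXTENSION for
     * KVM_CAP_COALESCED_MMIO returns its page offset, so after mmap'ing
     * kvm_run the caller locates it with something like:
     *
     *         ring = (void *)kvm_run + offset * PAGE_SIZE;
     *
     * Calling this after *every* return from KVM_RUN is what "flush the
     * ring on any exit" means: whatever caused the exit, the batched
     * writes are handed to device emulation before the exit is handled.
     */
    static void flush_coalesced_mmio(struct kvm_coalesced_mmio_ring *ring)
    {
            while (ring->first != ring->last) {
                    struct kvm_coalesced_mmio *e =
                            &ring->coalesced_mmio[ring->first];

                    handle_mmio_write(e->phys_addr, e->data, e->len);

                    /* finish reading the entry before letting the kernel reuse the slot */
                    __sync_synchronize();
                    ring->first = (ring->first + 1) % KVM_COALESCED_MMIO_MAX;
            }
    }

In such a setup, register_coalesced_zone() would be called once per MMIO region when it is mapped, and flush_coalesced_mmio() right after ioctl(vcpu_fd, KVM_RUN, 0) returns, before the exit reason is examined.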