On Thu, Oct 25, 2012 at 1:16 AM, Vasilis Liaskovitis
<vasilis.liaskovitis@xxxxxxxxxxxxxxxx> wrote:
> Hi,
>
> On Wed, Oct 24, 2012 at 12:15:17PM +0200, Stefan Hajnoczi wrote:
>> On Wed, Oct 24, 2012 at 10:06 AM, liu ping fan <qemulist@xxxxxxxxx> wrote:
>> > On Tue, Oct 23, 2012 at 8:25 PM, Stefan Hajnoczi <stefanha@xxxxxxxxx> wrote:
>> >> On Fri, Sep 21, 2012 at 01:17:21PM +0200, Vasilis Liaskovitis wrote:
>> >>> +static void dimm_populate(DimmDevice *s)
>> >>> +{
>> >>> +    DeviceState *dev = (DeviceState *)s;
>> >>> +    MemoryRegion *new = NULL;
>> >>> +
>> >>> +    new = g_malloc(sizeof(MemoryRegion));
>> >>> +    memory_region_init_ram(new, dev->id, s->size);
>> >>> +    vmstate_register_ram_global(new);
>> >>> +    memory_region_add_subregion(get_system_memory(), s->start, new);
>> >>> +    s->mr = new;
>> >>> +}
>> >>> +
>> >>> +static void dimm_depopulate(DimmDevice *s)
>> >>> +{
>> >>> +    assert(s);
>> >>> +    vmstate_unregister_ram(s->mr, NULL);
>> >>> +    memory_region_del_subregion(get_system_memory(), s->mr);
>> >>> +    memory_region_destroy(s->mr);
>> >>> +    s->mr = NULL;
>> >>> +}
>> >>
>> >> How is dimm hot unplug protected against callers who currently have RAM
>> >> mapped (from cpu_physical_memory_map())?
>> >>
>> >> Emulated devices call cpu_physical_memory_map() directly or indirectly
>> >> through DMA emulation code.  The RAM pointer may be held for arbitrary
>> >> lengths of time, across main loop iterations, etc.
>> >>
>> >> It's not clear to me that it is safe to unplug a DIMM that has network
>> >> or disk I/O buffers, for example.  We also need to be robust against
>> >> malicious guests who abuse the hotplug lifecycle.  QEMU should never be
>> >> left with dangling pointers.
>> >>
>> > Not sure about the block layer.  But I think those threads are already
>> > outside the big lock, so there should be a MemoryListener to catch the
>> > RAM-unplug event and, if needed, call bdrv_flush.
>
> Do we want bdrv_flush, or some kind of cancel request, e.g. bdrv_aio_cancel?
>
My original meaning was to flush out the dangling pointer.

>>
>> Here is the detailed scenario:
>>
>> 1. Emulated device does cpu_physical_memory_map() and gets a pointer
>>    to guest RAM.
>> 2. Return to vcpu or iothread, continue processing...
>> 3. Hot unplug of RAM causes the guest RAM to disappear.
>> 4. Pending I/O completes and overwrites memory from dangling guest RAM
>>    pointer.
>>
>> Any I/O device that does zero-copy I/O in QEMU faces this problem:
>> * The block layer is affected.
>> * The net layer is unaffected because it doesn't do zero-copy tx/rx
>>   across returns to the main loop (#2 above).
>> * Not sure about other device classes (e.g. USB).
>>
>> How should the MemoryListener callback work?  For block I/O it may not
>> be possible to cancel pending I/O asynchronously - if you try to
>> cancel then your thread may block until the I/O completes.
>
For the current code, I think we block on the listener to wait for the flush
to complete.  But once the mr->ops ref/unref patchset is accepted, we can
release the ref on the RAM device after we are done with it (a very raw idea
that still needs improvement).

> e.g. does paio_cancel do this?
> Is there already an API to asynchronously cancel all in-flight operations
> in a BlockDriverState?  Afaict block_job_cancel refers to streaming jobs
> only and doesn't help here.
>
> Can we make the RAM unplug initiate async I/O cancellations, prevent
> further I/Os, and only free the memory in a callback, after all DMA I/O
> to the associated memory region has been cancelled or completed?
>
> Also, iiuc, the MemoryListener should be registered by users of
> cpu_physical_memory_map, e.g. hw/virtio.c
>
Yes.

> By the way, dimm_depopulate only frees the qemu memory on an ACPI _EJ
> request, which means that a well-behaved guest will have already offlined
> the memory and is not using it anymore.  If the guest still uses the
> memory, e.g.
> for a DMA buffer, the logical memory offlining will fail and the _EJ/qemu
> memory freeing will never happen.
>
Yes.

> But in theory a malicious ACPI guest driver could trigger _EJ requests to
> do step 3 above.
>
> Or perhaps the backing block driver can finish an I/O request for a
> zero-copy block device that the guest doesn't care about anymore?  I'll
> think about this a bit more.
>
The guest is one user of the dimm device, and the block layer is another.

Regards,
pingfan

>> Synchronous cancel behavior is not workable since it can lead to poor
>> latency or hangs in the guest.
>
> ok
>
> thanks,
>
> - Vasilis
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html