Hi, On Wed, Oct 24, 2012 at 12:15:17PM +0200, Stefan Hajnoczi wrote: > On Wed, Oct 24, 2012 at 10:06 AM, liu ping fan <qemulist@xxxxxxxxx> wrote: > > On Tue, Oct 23, 2012 at 8:25 PM, Stefan Hajnoczi <stefanha@xxxxxxxxx> wrote: > >> On Fri, Sep 21, 2012 at 01:17:21PM +0200, Vasilis Liaskovitis wrote: > >>> +static void dimm_populate(DimmDevice *s) > >>> +{ > >>> + DeviceState *dev= (DeviceState*)s; > >>> + MemoryRegion *new = NULL; > >>> + > >>> + new = g_malloc(sizeof(MemoryRegion)); > >>> + memory_region_init_ram(new, dev->id, s->size); > >>> + vmstate_register_ram_global(new); > >>> + memory_region_add_subregion(get_system_memory(), s->start, new); > >>> + s->mr = new; > >>> +} > >>> + > >>> +static void dimm_depopulate(DimmDevice *s) > >>> +{ > >>> + assert(s); > >>> + vmstate_unregister_ram(s->mr, NULL); > >>> + memory_region_del_subregion(get_system_memory(), s->mr); > >>> + memory_region_destroy(s->mr); > >>> + s->mr = NULL; > >>> +} > >> > >> How is dimm hot unplug protected against callers who currently have RAM > >> mapped (from cpu_physical_memory_map())? > >> > >> Emulated devices call cpu_physical_memory_map() directly or indirectly > >> through DMA emulation code. The RAM pointer may be held for arbitrary > >> lengths of time, across main loop iterations, etc. > >> > >> It's not clear to me that it is safe to unplug a DIMM that has network > >> or disk I/O buffers, for example. We also need to be robust against > >> malicious guests who abuse the hotplug lifecycle. QEMU should never be > >> left with dangling pointers. > >> > > Not sure about the block layer. But I think those thread are already > > out of big lock, so there should be a MemoryListener to catch the > > RAM-unplug event, and if needed, bdrv_flush. do we want bdrv_flush, or some kind of cancel request e.g. bdrv_aio_cancel? > > Here is the detailed scenario: > > 1. Emulated device does cpu_physical_memory_map() and gets a pointer > to guest RAM. > 2. Return to vcpu or iothread, continue processing... > 3. Hot unplug of RAM causes the guest RAM to disappear. > 4. Pending I/O completes and overwrites memory from dangling guest RAM pointer. > > Any I/O device that does zero-copy I/O in QEMU faces this problem: > * The block layer is affected. > * The net layer is unaffected because it doesn't do zero-copy tx/rx > across returns to the main loop (#2 above). > * Not sure about other devices classes (e.g. USB). > > How should the MemoryListener callback work? For block I/O it may not > be possible to cancel pending I/O asynchronously - if you try to > cancel then your thread may block until the I/O completes. e.g. paio_cancel does this? is there already an API to asynchronously cancel all in flight operations in a BlockDriverState? Afaict block_job_cancel refers to streaming jobs only and doesn't help here. Can we make the RAM unplug initiate async I/O cancellations, prevent further I/Os, and only free the memory in a callback, after all DMA I/O to the associated memory region has been cancelled or completed? Also iiuc the MemoryListener should be registered from users of cpu_physical_memory_map e.g. hw/virtio.c By the way dimm_depopulate only frees the qemu memory on an ACPI _EJ request, which means that a well-behaved guest will have already offlined the memory and is not using it anymore. If the guest still uses the memory e.g. for a DMA buffer, the logical memory offlining will fail and the _EJ/qemu memory freeing will never happen. But in theory a malicious acpi guest driver could trigger _EJ requests to do step 3 above. Or perhaps the backing block driver can finish an I/O request for a zero-copy block device that the guest doesn't care for anymore? I 'll think about this a bit more. > Synchronous cancel behavior is not workable since it can lead to poor > latency or hangs in the guest. ok thanks, - Vasilis -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html