On Thu, Dec 29, 2011 at 01:24:32PM +0200, Avi Kivity wrote:
> On 12/29/2011 03:26 AM, Isaku Yamahata wrote:
> > This is Linux kernel driver for qemu/kvm postcopy live migration.
> > This is used by qemu/kvm postcopy live migration patch.
> >
> > TODO:
> > - Consider FUSE/CUSE option
> >   So far several mmap patches for FUSE/CUSE are floating around. (their
> >   purpose isn't different from our purpose, though). They haven't merged
> >   into the upstream yet.
> >   The driver specific part in qemu patches is modularized. So I expect it
> >   wouldn't be difficult to switch kernel driver to CUSE based driver.
>
> It would be good to get more input about this, please involve lkml and
> the FUSE/CUSE people.

Okay.

> > ioctl commands:
> >
> > UMEM_DEV_CRATE_UMEM: create umem device for qemu
> > UMEM_DEV_LIST: list created umem devices
> > UMEM_DEV_REATTACH: re-attach the created umem device
> >   UMEM_DEV_LIST and UMEM_DEV_REATTACH are used when
> >   the process that services page fault disappears or gets stuck.
> >   Then, administrator can list the umem devices and unblock
> >   the process which is waiting for page.
>
> Ah, I asked about this in my patch comments.  I think this is done
> better by using SCM_RIGHTS to pass fds along, or asking qemu to launch a
> new process.

Can you please elaborate? I don't think the approaches you are suggesting
solve the issue. Let me clarify the problem.

  process A (typically incoming qemu)
     |
     | mmap("/dev/umem") and access those pages triggering page faults
     | (the file descriptor might be closed after mmap() before page faults)
     |
     V
   /dev/umem
     ^
     |
     |
   daemon X resolving page faults triggered by process A
   (typically this daemon is forked from incoming qemu: process A)

If daemon X disappears accidentally, there is no one left to resolve the
page faults of process A. At that moment process A is blocked in a page
fault, and there is no file descriptor available that corresponds to the VMA.
There is then no way to kill process A short of rebooting the system.

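
For reference, the SCM_RIGHTS mechanism mentioned in the quoted suggestion
passes an open file descriptor to another process over a Unix domain socket.
Below is a minimal sketch of that mechanism only. It assumes a pipe as a
stand-in for the umem descriptor, and the helper names send_fd(), recv_fd()
and demo_roundtrip() are hypothetical; none of this is taken from the umem
patch. Note that it only helps while some process still holds the descriptor.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send open descriptor `fd` over Unix socket `sock`.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char dummy = 'F';   /* one byte of ordinary data carries the control message */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {             /* union guarantees cmsghdr alignment of the buffer */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;   /* payload is an array of descriptors */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from `sock`.
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Pass the write end of a pipe through a socketpair, close the original,
 * and check that the received copy still reaches the same pipe.
 * Returns 0 on success, -1 on any failure. */
int demo_roundtrip(void)
{
    int sv[2], p[2], fd;
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0 || pipe(p) < 0)
        return -1;
    if (send_fd(sv[0], p[1]) < 0)
        return -1;
    fd = recv_fd(sv[1]);
    if (fd < 0)
        return -1;
    close(p[1]);        /* the received duplicate keeps the pipe open */
    if (write(fd, "x", 1) != 1 || read(p[0], &c, 1) != 1)
        return -1;
    close(fd); close(p[0]); close(sv[0]); close(sv[1]);
    return c == 'x' ? 0 : -1;
}
```

In a real setup the sender and receiver would be separate processes connected
by a Unix domain socket; the single-process round trip here only keeps the
sketch short.
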
> Introducing a global namespace has a lot of complications attached.
>
> >
> > UMEM_GET_PAGE_REQUEST: retrieve page fault of qemu process
> > UMEM_MARK_PAGE_CACHED: mark the specified pages pulled from the source
> >                        for daemon
> >
> > UMEM_MAKE_VMA_ANONYMOUS: make the specified vma in the qemu process
> >                          anonymous
> >                          This is _NOT_ implemented yet.
> >                          I'm not sure whether this can be implemented
> >                          or not.
>
> How do we find out?  This is fairly important, stuff like transparent
> hugepages and ksm only works on anonymous memory.

I agree that this is important.
At KVM-forum 2011, Andrea said THP and KSM work with non-anonymous VMAs.
(Or at least that he'll look into it. My memory is vague, though.
Please correct me if I'm wrong.)
-- 
yamahata