On Tue, Dec 31, 2013 at 08:06:51PM +0800, Xiao Guangrong wrote:
>
> On Dec 31, 2013, at 4:23 AM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>
> > On Tue, Dec 17, 2013 at 01:59:04PM +0800, Xiao Guangrong wrote:
> >>
> >> CCed KVM guys.
> >>
> >> On 05/10/2013 01:11 PM, Stefan Hajnoczi wrote:
> >>> On Fri, May 10, 2013 at 4:28 AM, wenchao <wenchaolinux@xxxxxxxxx> wrote:
> >>>> On 2013-5-9 22:13, Mel Gorman wrote:
> >>>>
> >>>>> On Thu, May 09, 2013 at 05:50:05PM +0800, wenchaolinux@xxxxxxxxx wrote:
> >>>>>>
> >>>>>> From: Wenchao Xia <wenchaolinux@xxxxxxxxx>
> >>>>>>
> >>>>>> This series tries to enable the mremap syscall to CoW some private
> >>>>>> memory regions, just like fork() does. As a result, a user space
> >>>>>> application would get a mirror of those regions, which can be used
> >>>>>> as a snapshot for further processing.
> >>>>>>
> >>>>>
> >>>>> Why not just fork()? Even if the application was threaded it should be
> >>>>> manageable to handle fork just for processing the private memory region
> >>>>> in question. I'm having trouble figuring out what sort of application
> >>>>> would require an interface like this.
> >>>>>
> >>>> It has some problems: parent-child communication and, sometimes,
> >>>> page copies.
> >>>> I'd like to snapshot the QEMU guest's RAM. The current solution is:
> >>>> 1) fork()
> >>>> 2) pipe guest RAM data from child to parent.
> >>>> 3) parent writes down the contents.
> >>>>
> >>>> To avoid complex communication for data control and file content
> >>>> protection, the parent rather than the child handles the data, via
> >>>> a pipe, but this brings an additional copy. I think an explicit API
> >>>> for CoW-mapping a memory region inside one process could avoid that;
> >>>> it would be faster, CoW fewer pages, and make user space code nicer.
> >>>
> >>> A new Linux-specific API is not portable and not available on existing
> >>> hosts. Since QEMU supports non-Linux host operating systems the
> >>> fork() approach is preferable.
> >>>
> >>> If you're worried about the memory copy - which should be benchmarked
> >>> - then vmsplice(2) can be used in the child process and splice(2) can
> >>> be used in the parent. It probably doesn't help though since QEMU
> >>> scans RAM pages to find all-zero pages before sending them over the
> >>> socket, and at that point the memory copy might not make much
> >>> difference.
> >>>
> >>> Perhaps other applications can use this new flag better, but for QEMU
> >>> I think fork()'s portability is more important than the convenience of
> >>> accessing the CoW pages in the same process.
> >>
> >> Yup, I agree with you that the new syscall is sometimes not a good
> >> solution.
> >>
> >> Currently, we're working on live-update[1], which will be enabled on QEMU
> >> first. This feature lets the guest run on the new QEMU binary smoothly,
> >> without a restart, which is useful for security updates.
> >>
> >> In this case, we need to move the guest memory from the old QEMU instance
> >> to the new one. fork() cannot help because we need to exec() a new
> >> instance, after which all memory mappings are destroyed.
> >>
> >> We tried to enable SPLICE_F_MOVE[2] for vmsplice() to move the memory
> >> without a memory copy, but the performance isn't as good as we expected
> >> due to some limitations: the page size, locking, the message-size limit
> >> on pipes, etc. Of course, we will continue to improve this, but wenchao's
> >> patch seems like a new direction for us.
> >>
> >> To coordinate with your fork() approach, maybe we can introduce a new flag
> >> for VMAs, something like VM_KEEP_ONEXEC, to tell exec() not to destroy
> >> this VMA. How about this, or do you guys have a new idea? I would really
> >> appreciate your suggestions.
> >>
> >> [1] http://marc.info/?l=qemu-devel&m=138597598700844&w=2
> >> [2] https://lkml.org/lkml/2013/10/25/285
> >
> > Hi,
>
> Hi Marcelo,
>
> >
> > What is the purpose of snapshotting guest RAM here, in the context of
> > local migration?
>
> RAM snapshotting and local migration are different things.
> The reason I asked for your suggestions here is that I thought
> both need to do the same thing: move memory from one process
> to another in an efficient way. Your idea? :)

Another possibility is to use memory that is not anonymous for guest
RAM, such as hugetlbfs or tmpfs.

IIRC ksm and thp have limitations wrt tmpfs.

Still curious about RAM snapshotting.
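
To illustrate the file-backed idea, here is a minimal sketch, not QEMU code. It assumes a Linux-like host; the names SIZE, CHILD, backing_dir and handoff are hypothetical, and a short Python script stands in for the "new binary". The "guest RAM" is a MAP_SHARED mapping of an unlinked file, placed on tmpfs when /dev/shm is available (an assumption; otherwise it falls back to the default temp directory, which may not be tmpfs). The file descriptor is inherited across exec(), so the new process can map the same pages without copying them through a pipe.

```python
import mmap
import os
import subprocess
import sys
import tempfile

SIZE = 4 * mmap.PAGESIZE  # hypothetical stand-in for the guest RAM size

# Program run after exec(): map the inherited fd and read the old contents.
CHILD = (
    "import mmap, sys\n"
    "fd, size = int(sys.argv[1]), int(sys.argv[2])\n"
    "with mmap.mmap(fd, size, mmap.MAP_SHARED, mmap.PROT_READ) as ram:\n"
    "    sys.stdout.write(ram[:9].decode())\n"
)


def backing_dir():
    """Return a tmpfs directory if one is usable, else None (default temp dir)."""
    shm = "/dev/shm"  # assumption: tmpfs is mounted here, as on most Linux hosts
    if os.path.isdir(shm) and os.access(shm, os.W_OK):
        return shm
    return None


def handoff():
    """Write into file-backed 'RAM', then let a freshly exec'ed process read it."""
    with tempfile.TemporaryFile(dir=backing_dir()) as f:
        f.truncate(SIZE)
        fd = f.fileno()
        # Old instance: populate the shared, file-backed mapping.
        with mmap.mmap(fd, SIZE, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE) as ram:
            ram[:9] = b"guest-ram"
        # The mapping is gone, but the pages stay in the file. pass_fds keeps
        # the descriptor open across the fork()+exec() done by subprocess.
        done = subprocess.run(
            [sys.executable, "-c", CHILD, str(fd), str(SIZE)],
            pass_fds=(fd,), stdout=subprocess.PIPE, check=True,
        )
        return done.stdout.decode()


result = handoff()
print(result)
```

Unlike an anonymous private mapping, nothing here depends on the old address space surviving exec(); only the descriptor has to be passed along. The ksm/thp limitations mentioned above still apply to this kind of memory.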