Re: [RFC PATCH V1 0/6] mm: add a new option MREMAP_DUP to mremap syscall

On Dec 31, 2013, at 4:23 AM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:

> On Tue, Dec 17, 2013 at 01:59:04PM +0800, Xiao Guangrong wrote:
>> 
>> CCed KVM guys.
>> 
>> On 05/10/2013 01:11 PM, Stefan Hajnoczi wrote:
>>> On Fri, May 10, 2013 at 4:28 AM, wenchao <wenchaolinux@xxxxxxxxx> wrote:
>>>> On 2013-5-9 22:13, Mel Gorman wrote:
>>>> 
>>>>> On Thu, May 09, 2013 at 05:50:05PM +0800, wenchaolinux@xxxxxxxxx wrote:
>>>>>> 
>>>>>> From: Wenchao Xia <wenchaolinux@xxxxxxxxx>
>>>>>> 
>>>>>>  This series tries to enable the mremap syscall to CoW a private memory
>>>>>> region, just as fork() does. As a result, a user space application gets
>>>>>> a mirror of that region, which can be used as a snapshot for further
>>>>>> processing.
>>>>>> 
>>>>> 
>>>>> Why not just fork()? Even if the application is threaded, it should be
>>>>> manageable to fork just for processing the private memory region
>>>>> in question. I'm having trouble figuring out what sort of application
>>>>> would require an interface like this.
>>>>> 
>>>> It has some problems: parent-child communication, and sometimes
>>>> page copying.
>>>> I'd like to snapshot a qemu guest's RAM; the current solution is:
>>>> 1) fork()
>>>> 2) pipe guest RAM data from child to parent.
>>>> 3) parent writes down the contents.
>>>> 
>>>> To avoid complex communication for data control and file content
>>>> protection, the parent rather than the child handles the data, via
>>>> a pipe, but this brings an additional copy. I think an explicit API
>>>> for CoW-mapping a memory region inside one process could avoid it,
>>>> be faster, CoW fewer pages, and also make user space code nicer.
>>> 
>>> A new Linux-specific API is not portable and not available on existing
>>> hosts.  Since QEMU supports non-Linux host operating systems the
>>> fork() approach is preferable.
>>> 
>>> If you're worried about the memory copy - which should be benchmarked
>>> - then vmsplice(2) can be used in the child process and splice(2) can
>>> be used in the parent.  It probably doesn't help though since QEMU
>>> scans RAM pages to find all-zero pages before sending them over the
>>> socket, and at that point the memory copy might not make much
>>> difference.
>>> 
>>> Perhaps other applications can use this new flag better, but for QEMU
>>> I think fork()'s portability is more important than the convenience of
>>> accessing the CoW pages in the same process.
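
The vmsplice(2)/splice(2) variant suggested above might be sketched as
follows. Again, this is only an illustration under assumptions: the helper
name splice_snapshot_demo, the 'A' fill byte, the 64 KiB chunk size, and
the use of an ordinary file descriptor as the destination are hypothetical.
The parent does not touch the region in this sketch, and no zero-page
scanning is done.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: the child vmsplice()s a private region into a
 * pipe and the parent splice()s the pipe into out_fd (which must not
 * be opened with O_APPEND). Returns the bytes moved, or -1 on failure.
 */
ssize_t splice_snapshot_demo(size_t len, int out_fd)
{
    int fds[2];
    int status = 0;
    ssize_t total = 0;
    pid_t pid;

    /* Assumed stand-in for guest RAM. */
    char *ram = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ram == MAP_FAILED)
        return -1;
    memset(ram, 'A', len);

    if (pipe(fds) < 0) {
        munmap(ram, len);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        munmap(ram, len);
        return -1;
    }

    if (pid == 0) {
        /* Child: map its pages into the pipe instead of write()ing them. */
        struct iovec iov = { .iov_base = ram, .iov_len = len };

        close(fds[0]);
        while (iov.iov_len > 0) {
            ssize_t n = vmsplice(fds[1], &iov, 1, 0);
            if (n < 0)
                _exit(1);
            iov.iov_base = (char *)iov.iov_base + n;
            iov.iov_len -= (size_t)n;
        }
        _exit(0);
    }

    /* Parent: move pipe contents to out_fd without a user space buffer. */
    close(fds[1]);
    for (;;) {
        ssize_t n = splice(fds[0], NULL, out_fd, NULL, 65536, 0);
        if (n < 0) {
            total = -1;
            break;
        }
        if (n == 0)             /* all writers closed: end of data */
            break;
        total += n;
    }
    close(fds[0]);

    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        total = -1;
    munmap(ram, len);
    return total;
}
```

Whether this actually beats the plain write()/read() path would need to be
benchmarked, as noted above.
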
>> 
>> Yup, I agree with you that a new syscall is sometimes not a good solution.
>> 
>> Currently, we're working on live-update[1], which will be enabled on Qemu
>> first. This feature lets the guest move to a new Qemu binary smoothly
>> without a restart, which is good for applying security updates.
>> 
>> In this case, we need to move the guest memory from the old qemu instance
>> to the new one. fork() cannot help because we need to exec() a new
>> instance, after which all memory mappings are destroyed.
>> 
>> We tried to enable SPLICE_F_MOVE[2] for vmsplice() to move the memory without
>> a memory copy, but the performance isn't as good as we expected due to
>> some limitations: the page size, locking, the message-size limit on pipes,
>> etc. Of course, we will continue to improve this, but wenchao's patch seems
>> to be a new direction for us.
>> 
>> To coordinate with your fork() approach, maybe we can introduce a new flag
>> for VMAs, something like VM_KEEP_ONEXEC, to tell exec() not to destroy
>> this VMA. How about this, or do you guys have a new idea? I would really
>> appreciate your suggestions.
>> 
>> [1] http://marc.info/?l=qemu-devel&m=138597598700844&w=2
>> [2] https://lkml.org/lkml/2013/10/25/285
> 
> Hi,
> 

Hi Marcelo,


> What is the purpose of snapshotting guest RAM here, in the context of
> local migration?

RAM snapshotting and local migration are different things.
The reason I asked for your suggestions here is that I thought
both need to do the same thing: move memory from one process
to another in an efficient way. Your idea? :)

