Re: [PATCH 00/10] RFC: userfault (question about remap_anon_pages API)

Andrea,

We'd like to use this code to implement post-copy migration too,
but for containers rather than virtual machines. This will be
done as part of the CRIU [1] project.

From our experiments, almost everything is suitable except for the
remap_anon_pages() system call, so I'd like you to comment on
whether we're misusing your API or not :) For containers,
post-copy migration would look like this.


On the source node we freeze the container's process tree and use
the CRIU tool to read its state, except for the memory contents.
We then copy the state to the remote host and recreate the
processes there, again using the CRIU tool.

At this step (restore) we mark all the memory of the restored
tasks with MADV_USERFAULT, so that any attempt to access it
results in a notification via userfaultfd. There is one
userfaultfd for every process in the container and, in our plans,
each is owned by the CRIU daemon, which will provide the post-copy
memory updates. Then we unfreeze the processes and let them run.

So, when a process tries to access its memory, the CRIU daemon
wakes up, reads the fault address, and pulls the page from the
source node. It should then put this page into the proper
process's address space. And here's where we have problems.
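
To make the daemon's part of this step concrete, here is a rough
sketch in C. It is an illustration only: page_base(), handle_fault(),
fetch_page_fn and fake_fetch() are hypothetical names, not part of
CRIU or of the userfault patches, and the network transfer is
replaced by a stub. Installing the fetched page into the faulting
task is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: pull one page from the source node into buf.
 * Returns 0 on success. */
typedef int (*fetch_page_fn)(uint64_t page_addr, void *buf, size_t page_size);

/* Round a fault address down to the start of the page containing it. */
static uint64_t page_base(uint64_t fault_addr, size_t page_size)
{
	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return fault_addr & ~((uint64_t)page_size - 1);
}

/*
 * Handle one fault notification: work out which page was hit and
 * fetch its contents into buf. On return, *page_addr holds the
 * page-aligned address in the faulting task. Placing buf at that
 * address in the other task is not shown here.
 */
static int handle_fault(uint64_t fault_addr, size_t page_size,
			fetch_page_fn fetch, void *buf, uint64_t *page_addr)
{
	*page_addr = page_base(fault_addr, page_size);
	return fetch(*page_addr, buf, page_size);
}

/* Stand-in for the network transfer (assumption): fills the page
 * with the low byte of its page frame number. */
static int fake_fetch(uint64_t page_addr, void *buf, size_t page_size)
{
	memset(buf, (int)((page_addr / page_size) & 0xff), page_size);
	return 0;
}
```

In a real daemon the fault address would come from reading the
userfaultfd, and fake_fetch() would be replaced by the transfer from
the source node.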

The page with the data is in the CRIU daemon's address space, while
the remap_anon_pages() syscall works on the current process's
address space. So, in order to get the data into the container
process's address space, we have two choices. Either we somehow
make the page available in the other process's address space and
make that process call the remap system call, or we extend the
syscall to accept the pid of the process whose address space we'd
like to work on.


What do you think? Are you OK with tuning remap_anon_pages(), or
should we do things in a completely different way? If the above
explanation is not clear enough, we'd be happy to provide more
details.

Thanks,
Pavel

[1] http://criu.org
