On 16.05.2019 16:42, Adam Borowski wrote:
> On Thu, May 16, 2019 at 04:10:07PM +0300, Kirill Tkhai wrote:
>> On 15.05.2019 22:38, Adam Borowski wrote:
>>> On Wed, May 15, 2019 at 06:11:15PM +0300, Kirill Tkhai wrote:
>>>> This patchset adds a new syscall, which makes possible
>>>> to clone a mapping from a process to another process.
>>>> The syscall supplements the functionality provided
>>>> by process_vm_writev() and process_vm_readv() syscalls,
>>>> and it may be useful in many situation.
>>>>
>>>> For example, it allows to make a zero copy of data,
>>>> when process_vm_writev() was previously used:
>>>
>>> I wonder, why not optimize the existing interfaces to do zero copy if
>>> properly aligned?  No need for a new syscall, and old code would immediately
>>> benefit.
>>
>> Because, this is just not possible. You can't zero copy anonymous pages
>> of a process to pages of a remote process, when they are different pages.
>
> fork() manages that, and so does KSM.  Like KSM, you want to make a page
> shared -- you just skip the comparison step as you want to overwrite the old
> contents.
>
> And there's no need to touch the page, as fork() manages that fine no matter
> if the page is resident, anonymous in swap, or file-backed, all without
> reading from swap.

Yes, and if you dive into the patchset, you will find that the new
syscall manages page table entries in the same way fork() does.

>>>> There are several problems with process_vm_writev() in this example:
>>>>
>>>> 1)it causes pagefault on remote process memory, and it forces
>>>>   allocation of a new page (if was not preallocated);
>>>>
>>>> 2)amount of memory for this example is doubled in a moment --
>>>>   n pages in current and n pages in remote tasks are occupied
>>>>   at the same time;
>>>>
>>>> 3)received data has no a chance to be properly swapped for
>>>>   a long time.
>>>
>>> That'll handle all of your above problems, except for making pages
>>> subject to CoW if written to.  But if making pages writeably shared is
>>> desired, the old functions have a "flags" argument that doesn't yet have a
>>> single bit defined.
>
>
> Meow!
>
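
For anyone following the thread, the copy-on-write behaviour of fork()
mentioned above can be observed from userspace. What follows is only a
minimal sketch and is not taken from the patchset; the function name
child_write_is_private() is made up for illustration. It maps one private
anonymous page, forks, lets the child overwrite the page, and checks that
the parent still sees its own data.

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patchset).
 * Returns 1 if the parent's page is unchanged after the child wrote
 * to its own view of it, 0 if it changed, -1 on error.
 */
int child_write_is_private(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    /* One private anonymous page; fork() gives the child a CoW view of it. */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    strcpy(buf, "parent data");

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }

    if (pid == 0) {
        /* Child: this write goes to the child's own copy of the page. */
        strcpy(buf, "child data");
        _exit(strcmp(buf, "child data") == 0 ? 0 : 1);
    }

    /* Parent: wait for the child, then look at our own view of the page. */
    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        munmap(buf, len);
        return -1;
    }

    int unchanged = (strcmp(buf, "parent data") == 0);
    munmap(buf, len);
    return unchanged;
}
```

Calling child_write_is_private() from main() and checking for a return
value of 1 is enough to see the effect. With MAP_SHARED in place of
MAP_PRIVATE the child's write would be visible to the parent, which is the
"writeably shared" case discussed above.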