Hi,

On Sat, Oct 04, 2014 at 06:13:27AM -0700, Andi Kleen wrote:
> Andrea Arcangeli <aarcange@xxxxxxxxxx> writes:
>
> > This new syscall will move anon pages across vmas, atomically and
> > without touching the vmas.
> >
> > It only works on non shared anonymous pages because those can be
> > relocated without generating non linear anon_vmas in the rmap code.
>
> ...
>
> > It is an alternative to mremap.
>
> Why a new syscall? Couldn't mremap do this transparently?

The difference between remap_anon_pages and mremap is that mremap
fundamentally moves vmas, not pages (the pages move too, but only
because they remain attached to their respective vmas), while
remap_anon_pages moves anonymous pages zerocopy across vmas and never
touches any vma.

mremap, for example, would also nuke the source vma. remap_anon_pages
just moves the pages inside the vmas instead, so it doesn't need to
allocate new vmas in the area that receives the data.

We could certainly change mremap to try to detect when the
page_mapcount of an anonymous page is 1, downgrade the mmap_sem to
down_read, and then behave like remap_anon_pages internally by
updating the page->index, provided all pages in the range can be
updated. However, to provide the same strict checks that
remap_anon_pages does and to leave the source vma intact, mremap would
need new flags that alter the normal mremap semantics, which silently
wipe out the destination range and get rid of the source range. It
would also have to run a remap_anon_pages-detection-routine that isn't
zero cost.

Unless we add even more flags to mremap, we wouldn't have the absolute
guarantee that the vma tree is not altered in case userland is not
doing all things right (like if userland forgot MADV_DONTFORK).

Separating the two looked better: mremap was never meant to be
efficient at moving 1 page at a time (or 1 THP at a time).
Embedding remap_anon_pages inside mremap didn't look worthwhile,
considering that as a result mremap would run slower when it cannot
behave like remap_anon_pages, and it would also run slower than
remap_anon_pages when it could.

Thanks,
Andrea