On Wed, 22 Jul 2015, Kirill A. Shutemov wrote:

> On Tue, Jul 21, 2015 at 03:59:38PM -0400, Eric B Munson wrote:
> > The upcoming mlock(MLOCK_ONFAULT) implementation will need a way to
> > request that all present pages in a range are locked without faulting
> > in pages that are not present.  This logic is very close to what
> > __mm_populate() handles without faulting pages, so the patch pulls out
> > the pieces that can be shared and adds mm_lock_present() to gup.c.
> > The following patch will call it from do_mlock() when MLOCK_ONFAULT
> > is specified.
> >
> > Signed-off-by: Eric B Munson <emunson@xxxxxxxxxx>
> > Cc: Jonathan Corbet <corbet@xxxxxxx>
> > Cc: Vlastimil Babka <vbabka@xxxxxxx>
> > Cc: linux-mm@xxxxxxxxx
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > ---
> >  mm/gup.c | 172 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 157 insertions(+), 15 deletions(-)
>
> I don't like that you've copy-pasted a lot of code. I think it can be
> solved with new foll flags.
>
> The totally untested patch below splits the mlock part of FOLL_POPULATE
> out into a new FOLL_MLOCK flag. FOLL_POPULATE | FOLL_MLOCK will do what
> FOLL_POPULATE currently does. The new MLOCK_ONFAULT can use just
> FOLL_MLOCK; that will not trigger fault-in.

I originally tried to do this by adding a check for VM_LOCKONFAULT in
__get_user_pages(), before the call to faultin_page(), that would goto
next_page when LOCKONFAULT was specified.  With that early out in
__get_user_pages(), all of the tests using lock on fault failed to lock
any pages.  I will try again with a new FOLL flag and see if that works
out.

> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index c3a2b37365f6..c3834cddfcc7 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2002,6 +2002,7 @@ static inline struct page *follow_page(struct vm_area_struct *vma,
>  #define FOLL_NUMA      0x200   /* force NUMA hinting page fault */
>  #define FOLL_MIGRATION 0x400   /* wait for page to replace migration entry */
>  #define FOLL_TRIED     0x800   /* a retry, previous pass started an IO */
> +#define FOLL_MLOCK     0x1000  /* mlock the page if the VMA is VM_LOCKED */
>
>  typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
>                          void *data);
> diff --git a/mm/gup.c b/mm/gup.c
> index a798293fc648..4c7ff23947b9 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -129,7 +129,7 @@ retry:
>                  */
>                 mark_page_accessed(page);
>         }
> -       if ((flags & FOLL_POPULATE) && (vma->vm_flags & VM_LOCKED)) {
> +       if ((flags & FOLL_MLOCK) && (vma->vm_flags & VM_LOCKED)) {
>                 /*
>                  * The preliminary mapping check is mainly to avoid the
>                  * pointless overhead of lock_page on the ZERO_PAGE
> @@ -299,6 +299,9 @@ static int faultin_page(struct task_struct *tsk, struct vm_area_struct *vma,
>         unsigned int fault_flags = 0;
>         int ret;
>
> +       /* mlock present pages, but do not fault in new ones */
> +       if ((*flags & (FOLL_POPULATE | FOLL_MLOCK)) == FOLL_MLOCK)
> +               return -ENOENT;
>         /* For mm_populate(), just skip the stack guard page. */
>         if ((*flags & FOLL_POPULATE) &&
>                         (stack_guard_page_start(vma, address) ||
> @@ -890,7 +893,7 @@ long populate_vma_page_range(struct vm_area_struct *vma,
>         VM_BUG_ON_VMA(end > vma->vm_end, vma);
>         VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_sem), mm);
>
> -       gup_flags = FOLL_TOUCH | FOLL_POPULATE;
> +       gup_flags = FOLL_TOUCH | FOLL_POPULATE | FOLL_MLOCK;
>         /*
>          * We want to touch writable mappings with a write fault in order
>          * to break COW, except for shared mappings because these don't COW
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8f9a334a6c66..9eeb3bd304fc 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1306,7 +1306,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
>                                           pmd, _pmd,  1))
>                         update_mmu_cache_pmd(vma, addr, pmd);
>         }
> -       if ((flags & FOLL_POPULATE) && (vma->vm_flags & VM_LOCKED)) {
> +       if ((flags & FOLL_MLOCK) && (vma->vm_flags & VM_LOCKED)) {
>                 if (page->mapping && trylock_page(page)) {
>                         lru_add_drain();
>                         if (page->mapping)
>
> --
>  Kirill A. Shutemov
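
If I am reading the suggestion right, the MLOCK_ONFAULT side would then
look something like the sketch below (completely untested, and the
helper name mlock_vma_present_pages() is made up here, modeled on
populate_vma_page_range()).  With FOLL_MLOCK set but FOLL_POPULATE
clear, faultin_page() returns -ENOENT for pages that are not present,
and __get_user_pages() takes its goto next_page path instead of
faulting them in:

/*
 * Untested sketch, not part of the patch above: mlock only the pages
 * already present in [start, end).  Passing FOLL_MLOCK without
 * FOLL_POPULATE makes faultin_page() return -ENOENT for absent pages,
 * which __get_user_pages() treats as "skip to the next page".
 */
static long mlock_vma_present_pages(struct vm_area_struct *vma,
                unsigned long start, unsigned long end, int *nonblocking)
{
        struct mm_struct *mm = vma->vm_mm;
        unsigned long nr_pages = (end - start) / PAGE_SIZE;
        int gup_flags;

        VM_BUG_ON_VMA(start < vma->vm_start, vma);
        VM_BUG_ON_VMA(end > vma->vm_end, vma);

        /* Deliberately no FOLL_POPULATE, so absent pages are skipped. */
        gup_flags = FOLL_TOUCH | FOLL_MLOCK;

        return __get_user_pages(current, mm, start, nr_pages, gup_flags,
                                NULL, NULL, nonblocking);
}

That keeps FOLL_POPULATE | FOLL_MLOCK for the existing mlock() behavior
and FOLL_MLOCK alone for the on-fault case.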