On Tue, Oct 05, 2010 at 03:44:22PM -0700, Michel Lespinasse wrote:
> On Tue, Oct 5, 2010 at 10:38 AM, Rik van Riel <riel@xxxxxxxxxx> wrote:
> > Looks like it should be relatively easy to do something
> > similar in do_swap_page also.
>
> Good idea. We don't make use of swap too much, which is probably why
> we didn't have that in our kernel, but it seems like a good idea just
> for uniformity. I'll add this in a follow-on patch.

So here's the patch. Sorry for the delay - it did not take long to
write, but I couldn't test it before today.

Please have a look - I'd like to add this to the series I sent earlier.

----------------------------------- 8< ---------------------------------

Retry page fault when blocking on swap in

This change is the cousin of 'Retry page fault when blocking on disk
transfer'. The idea here is to reduce mmap_sem hold times that are
caused by disk transfers when swapping in pages.

We drop mmap_sem while waiting for the page lock, and return the
VM_FAULT_RETRY flag. do_page_fault will then re-acquire mmap_sem and
retry the page fault. It is expected that upon retry the page will now
be cached, and thus the retry will complete with a short mmap_sem hold
time.

Signed-off-by: Michel Lespinasse <walken@xxxxxxxxxx>

diff --git a/mm/memory.c b/mm/memory.c
index b068c68..0ec70b4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2613,6 +2613,21 @@ int vmtruncate_range(struct inode *inode, loff_t offset, loff_t end)
 	return 0;
 }
 
+static inline int lock_page_or_retry(struct page *page, struct mm_struct *mm,
+				     unsigned int flags)
+{
+	if (trylock_page(page))
+		return 1;
+	if (!(flags & FAULT_FLAG_ALLOW_RETRY)) {
+		__lock_page(page);
+		return 1;
+	}
+
+	up_read(&mm->mmap_sem);
+	wait_on_page_locked(page);
+	return 0;
+}
+
 /*
  * We enter with non-exclusive mmap_sem (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -2626,6 +2641,7 @@ static int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	struct page *page, *swapcache = NULL;
 	swp_entry_t entry;
 	pte_t pte;
+	int locked;
 	struct mem_cgroup *ptr = NULL;
 	int exclusive = 0;
 	int ret = 0;
@@ -2676,8 +2692,12 @@ static int do_swap_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		goto out_release;
 	}
 
-	lock_page(page);
+	locked = lock_page_or_retry(page, mm, flags);
 	delayacct_clear_flag(DELAYACCT_PF_SWAPIN);
+	if (!locked) {
+		ret |= VM_FAULT_RETRY;
+		goto out_release;
+	}
 
 	/*
 	 * Make sure try_to_free_swap or reuse_swap_page or swapoff did not

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.
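
For completeness, here is roughly how the arch fault handler is expected
to drive the retry. This is only a sketch modeled on the x86
do_page_fault() changes from the 'Retry page fault when blocking on disk
transfer' patch in the same series; the variable names and the elided
error paths here are illustrative assumptions, not the exact code:

	/* Sketch of the arch fault handler side (illustrative only). */
	struct mm_struct *mm = current->mm;
	struct vm_area_struct *vma;
	unsigned int flags = FAULT_FLAG_ALLOW_RETRY |
			     (write ? FAULT_FLAG_WRITE : 0);
	int fault;

retry:
	down_read(&mm->mmap_sem);
	vma = find_vma(mm, address);
	/* ... vma checks and error paths elided ... */

	fault = handle_mm_fault(mm, vma, address, flags);

	if (fault & VM_FAULT_RETRY) {
		/*
		 * do_swap_page() already released mmap_sem and waited on
		 * the page lock before returning VM_FAULT_RETRY, so do
		 * not up_read() here. Retry once with
		 * FAULT_FLAG_ALLOW_RETRY cleared, so that
		 * lock_page_or_retry() falls back to the blocking
		 * __lock_page() path on the second attempt.
		 */
		flags &= ~FAULT_FLAG_ALLOW_RETRY;
		goto retry;
	}

	/* ... VM_FAULT_ERROR handling, accounting, etc. elided ... */
	up_read(&mm->mmap_sem);

Clearing FAULT_FLAG_ALLOW_RETRY before the second attempt is what bounds
the fault to a single retry: by then the page is usually in the swap
cache and unlocked, and even when it is not, the blocking path still
makes forward progress, so the fault cannot loop forever.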