On Fri, Dec 03, 2010 at 04:00:21PM +0100, Oleg Nesterov wrote:
> On 11/30, Roland McGrath wrote:
> >
> > Documentation/cachetlb.txt says:
> >
> > 	Any time the kernel writes to a page cache page, _OR_
> > 	the kernel is about to read from a page cache page and
> > 	user space shared/writable mappings of this page potentially
> > 	exist, this routine is called.
> >
> > In your case, the kernel is only reading (write=0 passed to
> > access_process_vm and get_user_pages).  In normal situations,
> > the page in question will have only a private and read-only
> > mapping in user space.  So the call should not be required in
> > these cases--if the code can tell that's so.
> >
> > Perhaps something like the following would be safe.
> > But you really need some VM folks to tell you for sure.
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 02e48aa..2864ee7 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1484,7 +1484,8 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
> >  				pages[i] = page;
> >
> >  				flush_anon_page(vma, page, start);
> > -				flush_dcache_page(page);
> > +				if ((vm_flags & VM_WRITE) || (vma->vm_flags & VM_SHARED))
> > +					flush_dcache_page(page);
>
> First of all, I know absolutely nothing about D-cache aliasing.
> My poor understanding of flush_dcache_page() is: synchronize the
> kernel/user vision of this memory, in the case when either side
> can change it.
>
> If this is true, then this change doesn't look right in general.
>
> Even if (vma->vm_flags & VM_SHARED) == 0, it is possible that
> tsk can write to this memory, this mapping can be writable and
> private.
>
> Even if we ensure that this mapping is readonly/private, another
> user-space process can write to this page via shared/writable
> mapping.
>

I think you're right. It is possible that other processes have
such a mapping.

>
> I'd like to know if my understanding is correct, I am just curious.
>
> Oleg.

How about this?
Maybe this patch would mitigate the overhead.
But I am not sure about this patch. Cc'ed GUP experts.

From 8fb3d84c7bb32c4ba9c4a0063198ce7cfcca6b37 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan.kim@xxxxxxxxx>
Date: Sat, 4 Dec 2010 01:19:43 +0900
Subject: [PATCH] Remove redundant flush_dcache_page in GUP

If we get the page via handle_mm_fault, the fault path has already
handled the page flush, so GUP's flush_dcache_page call is redundant.

Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Nick Piggin <npiggin@xxxxxxxxx>
Signed-off-by: Minchan Kim <minchan.kim@xxxxxxxxx>
---
 mm/memory.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ebfeedf..9166f4b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1430,6 +1430,7 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		do {
 			struct page *page;
 			unsigned int foll_flags = gup_flags;
+			bool dcache_flushed = false;

 			/*
 			 * If we have a pending SIGKILL, don't keep faulting
@@ -1464,6 +1465,7 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 					tsk->maj_flt++;
 				else
 					tsk->min_flt++;
+				dcache_flushed = true;

 				/*
 				 * The VM_FAULT_WRITE bit tells us that
@@ -1489,7 +1491,8 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 				pages[i] = page;

 				flush_anon_page(vma, page, start);
-				flush_dcache_page(page);
+				if (!dcache_flushed)
+					flush_dcache_page(page);
 			}
 			if (vmas)
 				vmas[i] = vma;
--
1.7.0.4

>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
> Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>

--
Kind regards,
Minchan Kim
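For anyone who wants to poke at the control flow of the patch outside the kernel, here is a rough user-space sketch. It is only a model: struct model_page and every model_* function are hypothetical stand-ins invented for illustration, not kernel API. It also bakes in the patch's own unverified assumption that the fault path already flushes the page. If that assumption is wrong, the skipped flush is a bug, which is exactly what the GUP experts would need to confirm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; counts flushes for inspection. */
struct model_page {
	bool present;	/* models follow_page() finding the page */
	int flushes;	/* how many times the D-cache flush ran */
};

/* Stand-in for flush_dcache_page(): only records that a flush happened. */
static void model_flush_dcache_page(struct model_page *page)
{
	page->flushes++;
}

/*
 * Stand-in for handle_mm_fault(). ASSUMPTION (taken from the patch
 * description, not verified): the fault path flushes the page itself.
 */
static void model_handle_mm_fault(struct model_page *page)
{
	page->present = true;
	model_flush_dcache_page(page);
}

/* Models one iteration of the inner loop of __get_user_pages(). */
static int model_gup_one(struct model_page *page)
{
	bool dcache_flushed = false;

	assert(page != NULL);

	/* Fault the page in if it is not there yet. */
	while (!page->present) {
		model_handle_mm_fault(page);
		dcache_flushed = true;
	}

	/* Flush only if the fault path did not already do it. */
	if (!dcache_flushed)
		model_flush_dcache_page(page);

	return page->flushes;
}
```

In this model a page that is already present gets one flush from the GUP path, and a page that had to be faulted in also ends up with one flush rather than two. Note that the already-present case still flushes unconditionally, so the patch would only reduce overhead for pages that go through the fault path.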