On Thu 10-08-17 20:05:54, Andrea Arcangeli wrote: > On Thu, Aug 10, 2017 at 10:16:32AM +0200, Michal Hocko wrote: > > Andrea has proposed and alternative solution [4] which should be > > equivalent functionally similar to {ksm,khugepaged}_exit. I have to > > confess I really don't like that approach but I can live with it if > > that is a preferred way (to be honest I would like to drop the empty > > Well you added two branches, when only one is necessary. It's more or > less like preferring a rwsem when a mutex is enough, because you're > more used to use rwsems. > > > down_write();up_write() from the other two callers as well). In fact I > > have asked Andrea to post his patch [5] but that hasn't happened. I do > > not think we should wait much longer and finally merge some fix. > > It's posted in [4] already below I didn't think it was necessary to > resend it. it was deep in the email thread and I've asked you explicitly to repost which I've done for the same reason. > The only other improvement I can think of is an unlikely > around tsk_is_oom_victim() in exit_mmap, but your patch below would > need it too, and two of them. with diff --git a/mm/mmap.c b/mm/mmap.c index 822e8860b9d2..9d4a5a488f72 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3002,7 +3002,7 @@ void exit_mmap(struct mm_struct *mm) * with tsk->mm != NULL checked on !current tasks which synchronizes * with exit_mm and so we cannot race here. */ - if (tsk_is_oom_victim(current)) { + if (unlikely(tsk_is_oom_victim(current))) { down_write(&mm->mmap_sem); locked = true; } @@ -3020,7 +3020,7 @@ void exit_mmap(struct mm_struct *mm) } mm->mmap = NULL; vm_unacct_memory(nr_accounted); - if (locked) + if (unlikely(locked)) up_write(&mm->mmap_sem); } The generated code is identical. But I do not have any objection of course. > > [1] http://lkml.kernel.org/r/20170724072332.31903-1-mhocko@xxxxxxxxxx > > [2] http://lkml.kernel.org/r/20170725142626.GJ26723@xxxxxxxxxxxxxx > > [3] http://lkml.kernel.org/r/20170725160359.GO26723@xxxxxxxxxxxxxx > > [4] http://lkml.kernel.org/r/20170726162912.GA29716@xxxxxxxxxx > > [5] http://lkml.kernel.org/r/20170728062345.GA2274@xxxxxxxxxxxxxx > > > > + if (tsk_is_oom_victim(current)) { > > + down_write(&mm->mmap_sem); > > + locked = true; > > + } > > free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING); > > tlb_finish_mmu(&tlb, 0, -1); > > > > @@ -3005,7 +3018,10 @@ void exit_mmap(struct mm_struct *mm) > > nr_accounted += vma_pages(vma); > > vma = remove_vma(vma); > > } > > + mm->mmap = NULL; > > vm_unacct_memory(nr_accounted); > > + if (locked) > > + up_write(&mm->mmap_sem); > > I wouldn't normally repost to add an unlikely when I'm not sure if it > gets merged, but if it gets merged I would immediately tell to Andrew > about the microoptimization being missing there so he can fold it > later. > > Before reposting about the unlikely I thought we should agree which > version to merge: [4] or the above double branch (for no good as far > as I tangibly can tell). > > I think down_write;up_write is the correct thing to do here because > holding the lock for any additional instruction has zero benefits, and > if it has zero benefits it only adds up confusion and makes the code > partly senseless, and that ultimately hurts the reader when it tries > to understand why you're holding the lock for so long when it's not > needed. OK, let's agree to disagree. As I've said I like when the critical section is explicit and we _know_ what it protects. In this case it is clear that we have to protect from the page tables tear down and the vma destructions. But as I've said I am not going to argue about this more. It is more important to finally fix this. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>