On Tue, Jul 03, 2018 at 10:27:18AM +0200, Michal Hocko wrote: > On Tue 03-07-18 11:12:05, Kirill A. Shutemov wrote: > > On Mon, Jul 02, 2018 at 02:49:28PM +0200, Michal Hocko wrote: > > > On Mon 02-07-18 15:33:50, Kirill A. Shutemov wrote: > > > [...] > > > > I probably miss the explanation somewhere, but what's wrong with allowing > > > > other thread to re-populate the VMA? > > > > > > We have discussed that earlier and it boils down to how is racy access > > > to munmap supposed to behave. Right now we have either the original > > > content or SEGV. If we allow to simply madvise_dontneed before real > > > unmap we could get a new page as well. There might be (quite broken I > > > would say) user space code that would simply corrupt data silently that > > > way. > > > > Okay, so we add a lot of complexity to accommodate broken userspace that > > may or may not exist. Is it right? :) > > I would really love to do the most simple and obious thing > > diff --git a/mm/mmap.c b/mm/mmap.c > index 336bee8c4e25..86ffb179c3b5 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -2811,6 +2811,8 @@ EXPORT_SYMBOL(vm_munmap); > SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len) > { > profile_munmap(addr); > + if (len > LARGE_NUMBER) > + do_madvise(addr, len, MADV_DONTNEED); > return vm_munmap(addr, len); > } > > but the argument that current semantic of good data or SEGV on > racing threads is no longer preserved sounds valid to me. Remember > optimizations shouldn't eat your data. How do we ensure that we won't > corrupt data silently? +linux-api Frankly, I don't see change in semantics here. Code that has race between munmap() and page fault would get intermittent SIGSEGV before and after the approach with simple MADV_DONTNEED. To be safe, I wouldn't go with the optimization if the process has custom SIGSEGV handler. > Besides that if this was so simple then we do not even need any kernel > code. You could do that from glibc resp. any munmap wrapper. So maybe > the proper answer is, if you do care then just help the system and > DONTNEED your data before you munmap as an optimization for large > mappings. Kernel latency problems have to be handled by kernel. -- Kirill A. Shutemov