On Fri, Aug 21, 2015 at 9:31 PM, Eric B Munson <emunson@xxxxxxxxxx> wrote:
> On Fri, 21 Aug 2015, Michal Hocko wrote:
>
>> On Thu 20-08-15 13:03:09, Eric B Munson wrote:
>> > On Thu, 20 Aug 2015, Michal Hocko wrote:
>> >
>> > > On Wed 19-08-15 17:33:45, Eric B Munson wrote:
>> > > [...]
>> > > > The group which asked for this feature here
>> > > > wants the ability to distinguish between LOCKED and LOCKONFAULT regions
>> > > > and without the VMA flag there isn't a way to do that.
>> > >
>> > > Could you be more specific on why this is needed?
>> >
>> > They want to keep metrics on the amount of memory used in a LOCKONFAULT
>> > region versus the address space of the region.
>>
>> /proc/<pid>/smaps already exports that information AFAICS. It exports
>> VMA flags including VM_LOCKED and if rss < size then this is clearly
>> LOCKONFAULT because the standard mlock semantic is to populate. Would
>> that be sufficient?
>>
>> Now, it is true that LOCKONFAULT wouldn't be distinguishable from
>> MAP_LOCKED which failed to populate but does that really matter? It is
>> LOCKONFAULT in a way as well.
>
> Does that matter to my users? No, they do not use MAP_LOCKED at all so
> any VMA with VM_LOCKED set and rss < size is lock on fault. Will it
> matter to others? I suspect so, but these are likely to be the same
> group of users which will be surprised to learn that MAP_LOCKED does not
> guarantee that the entire range is faulted in on return from mmap.
>
>>
>> > > > Do we know that these last two open flags are needed right now or is
>> > > > this speculation that they will be and that none of the other VMA flags
>> > > > can be reclaimed?
>> > >
>> > > I do not think they are needed by anybody right now but that is not a
>> > > reason why it should be used without a really strong justification.
>> > > If the discoverability is really needed then fair enough but I haven't
>> > > seen any justification for that yet.
>> >
>> > To be completely clear you believe that if the metrics collection is
>> > not a strong enough justification, it is better to expand the mm_struct
>> > by another unsigned long than to use one of these bits right?
>>
>> A simple bool is sufficient for that. And yes I think we should go with
>> per mm_struct flag rather than the additional vma flag if it has only
>> the global (whole address space) scope - which would be the case if the
>> LOCKONFAULT is always an mlock modifier and the persistence is needed
>> only for MCL_FUTURE. Which is imho a sane semantic.
>
> I am in the middle of implementing lock on fault this way, but I cannot
> see how we will handle mremap of a lock on fault region. Say we have
> the following:
>
> addr = mmap(len, MAP_ANONYMOUS, ...);
> mlock(addr, len, MLOCK_ONFAULT);
> ...
> mremap(addr, len, 2 * len, ...)
>
> There is no way for mremap to know that the area being remapped was lock
> on fault so it will be locked and prefaulted by remap. How can we avoid
> this without tracking per vma if it was locked with lock or lock on
> fault?

mremap can count the filled ptes and prefault only areas that are
completely populated.

There might be a problem after a failed populate: mremap will then treat
such areas as lock on fault. In that case we can fill the ptes with
swap-like non-present entries to remember that fact, and count them as
should-be-locked pages.
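
A rough userspace sketch of the counting idea, in case it helps the
discussion. This is not kernel code: the enum, the array standing in for
a page table, and both function names are hypothetical and exist only to
illustrate the decision mremap would make. It assumes a failed populate
leaves a distinguishable non-present marker behind, as suggested above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state standing in for a page table entry. */
enum pte_state {
    PTE_NONE,        /* never faulted in */
    PTE_PRESENT,     /* page is resident */
    PTE_SHOULD_LOCK, /* non-present marker left by a failed populate */
};

/* Count entries that a plain (populating) mlock would have wanted resident. */
static size_t count_should_be_locked(const enum pte_state *ptes, size_t npages)
{
    size_t n = 0;

    assert(ptes != NULL);
    for (size_t i = 0; i < npages; i++)
        if (ptes[i] == PTE_PRESENT || ptes[i] == PTE_SHOULD_LOCK)
            n++;
    return n;
}

/*
 * Decide whether the grown region should be prefaulted: only when every
 * page of the old range is either present or marked should-be-locked.
 */
static bool remap_should_prefault(const enum pte_state *ptes, size_t npages)
{
    return npages > 0 && count_should_be_locked(ptes, npages) == npages;
}
```

One consequence of this heuristic: a lock on fault region whose pages
have all been touched already looks the same as a plain mlock region, so
it would be prefaulted when grown.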