On Wed, 9 Feb 2022, Michal Hocko wrote: > On Wed 09-02-22 08:21:17, Hugh Dickins wrote: > > On Wed, 9 Feb 2022, Michal Hocko wrote: > [...] > > > The only thing that is not entirely clear to me at the moment is why you > > > have chosen to ignore already mapped LOCKONFAULT pages. They will > > > eventually get sorted out during the reclaim/migration but this can > > > backfire if too many pages have been pre-faulted before LOCKONFAULT > > > call. Maybe not an interesting case in the first place but I am still > > > wondering why you have chosen that way. > > > > I'm puzzled: what makes you think I'm ignoring already mapped LOCKONFAULT > > pages? I'd consider that a bug. > > I've had the following path in mind > __mm_populate > populate_vma_page_range > if (vma->vm_flags & VM_LOCKONFAULT) > return nr_page > > which means that __get_user_pages is not called at all. This also means > that follow_page_mask is skipped. The page table walk used to mark > already mapped pages as mlocked so unless I miss some other path it > would stay on its original LRU list and only get real mlock protection > when encountered by the reclaim or migration. That is so until 04/13: but then, whenever a page is faulted into a VM_LOCKED area (except when skipping pte-of-compound) it goes through mlock_page(). So it will then lock-on-fault. Ah, but you're worrying about any previously-mapped pages when VM_LOCKONFAULT is established: those ones get done by mlock_pte_range() in 07/13, same as when VM_LOCKED is established - I checked again, VM_LOCKONFAULT is not set without VM_LOCKED. > > > It is the case, isn't it, that a VM_LOCKONFAULT area always has VM_LOCKED > > set too? If I've got that wrong, yes, I'll need to revisit conditions. > > Yes, I did't really mean we would lose the mlock protection. We would > just do the lazy mlock also on pages which are already mapped. This is > certainly not a correctness issue - althoug stats might be off - but it > could polute existing LRUs with mlocked pages rather easily. Even with the lazy mlock in reclaim, I'd still accuse myself of a bug if I've got this wrong. > > As I've said. Either I am still missing something or this might even not > be a big deal in real life. I was mostly curious about the choice to > exclude the page table walk for LOCKONFAULT. It surprised me too: but looking at how FOLL_POPULATE was handled in faultin_page(), it appeared to me to become redundant, once mlocking was counted at fault time (and in mlock_pte_range() - maybe that's the page table walk you're looking for, not excluded but moved). It is a big deal for me: please don't let me fool you, please do continue to worry at it until you're convinced or you convince me. Thanks, Hugh