On Tue, Feb 18, 2025 at 03:35:19PM +0100, David Hildenbrand wrote: > On 18.02.25 14:05, Lorenzo Stoakes wrote: > > On Tue, Feb 18, 2025 at 01:12:05PM +0100, Vlastimil Babka wrote: > > > On 2/13/25 19:16, Lorenzo Stoakes wrote: > > > > The guard regions feature was initially implemented to support anonymous > > > > mappings only, excluding shmem. > > > > > > > > This was done such as to introduce the feature carefully and incrementally > > > > and to be conservative when considering the various caveats and corner > > > > cases that are applicable to file-backed mappings but not to anonymous > > > > ones. > > > > > > > > Now this feature has landed in 6.13, it is time to revisit this and to > > > > extend this functionality to file-backed and shmem mappings. > > > > > > > > In order to make this maximally useful, and since one may map file-backed > > > > mappings read-only (for instance ELF images), we also remove the > > > > restriction on read-only mappings and permit the establishment of guard > > > > regions in any non-hugetlb, non-mlock()'d mapping. > > > > > > Do you plan to address mlock later too? I guess somebody might find use for > > > those. Is there some fundamental issue or just that we need to define some > > > good semantics for corner cases? (i.e. if pages are already populated in the > > > mlocked area, discarding them by replacing with guard pte's goes against > > > that, so do we allow it or not?). > > > > Yeah that's the fundamental issue with mlock, it does not interact with the > > zapping part of MADV_GUARD_INSTALL, and that is why we disallow it (but not so > > for MADV_GUARD_REMOVE, as if a VMA that contains guard regions is locked > > _afterwards_ there will be no zapping). > > > > We could potentially expose an equivalent, as there are for other flags, a > > _LOCKED variant of the madvise() flag, like MADV_GUARD_INSTALL_LOCKED to make > > this explicit. > > > > That is probably the most sensible option, if there is a need for this! > > mlock is weird, because it assumes that memory will be faulted in in the whole VMA. > > You'd likely have to populate + mlock the page when removing the guard. Right yeah that'd be super weird. And I don't want to add that logic. > Also not sure what happens if one does an mlock()/mlockall() after > already installing PTE markers. The existing logic already handles non-present cases by skipping them, in mlock_pte_range(): for (pte = start_pte; addr != end; pte++, addr += PAGE_SIZE) { ptent = ptep_get(pte); if (!pte_present(ptent)) continue; ... } Which covers off guard regions. Removing the guard regions after this will leave you in a weird situation where these entries will be zapped... maybe we need a patch to make MADV_GUARD_REMOVE check VM_LOCKED and in this case also populate? Actually I think the simpler option is to just disallow MADV_GUARD_REMOVE if you since locked the range? The code currently allows this on the proviso that 'you aren't zapping locked mappings' but leaves the VMA in a state such that some entries would not be locked. It'd be pretty weird to lock guard regions like this. Having said all that, given what you say below, maybe it's not an issue after all?... > > __mm_populate() would skip whole VMAs in case populate_vma_page_range() > fails. And I would assume populate_vma_page_range() fails on the first > guard when it triggers a page fault. > > OTOH, supporting the mlock-on-fault thingy should be easy. That's precisely where > MADV_DONTNEED_LOCKED originates from: > > commit 9457056ac426e5ed0671356509c8dcce69f8dee0 > Author: Johannes Weiner <hannes@xxxxxxxxxxx> > Date: Thu Mar 24 18:14:12 2022 -0700 > > mm: madvise: MADV_DONTNEED_LOCKED > MADV_DONTNEED historically rejects mlocked ranges, but with MLOCK_ONFAULT > and MCL_ONFAULT allowing to mlock without populating, there are valid use > cases for depopulating locked ranges as well. ...Hm this seems to imply the current guard remove stuff isn't quite so bad, so maybe the assumption that VM_LOCKED implies 'everything is populated' isn't quite as stringent then. The restriction is as simple as: if (behavior != MADV_DONTNEED_LOCKED) forbidden |= VM_LOCKED; > > > Adding support for that would be indeed nice. I mean it's sort of maybe understandable why you'd want to MADV_DONTNEED locked ranges, but I really don't understand why you'd want to add guard regions to mlock()'ed regions? Then again we're currently asymmetric as you can add them _before_ mlock()'ing... > > -- > Cheers, > > David / dhildenb >