Re: [PATCH 0/4] mm: permit guard regions for file-backed/shmem mappings

David Hildenbrand <david@xxxxxxxxxx> · Tue, 18 Feb 2025 16:20:18 +0100

Right yeah that'd be super weird. And I don't want to add that logic.

Also not sure what happens if one does an mlock()/mlockall() after
already installing PTE markers.

The existing logic already handles non-present cases by skipping them, in
mlock_pte_range():

	for (pte = start_pte; addr != end; pte++, addr += PAGE_SIZE) {
		ptent = ptep_get(pte);
		if (!pte_present(ptent))
			continue;

		...
	}

I *think* that code only updates already-mapped folios, to properly call 
mlock_folio()/munlock_folio().

It is not the code that populates pages on mlock()/mlockall(). I think 
all that goes via mm_populate()/__mm_populate(), where "ordinary GUP" 
should apply.

See populate_vma_page_range(), especially also the VM_LOCKONFAULT handling.

Which covers off guard regions. Removing the guard regions after this will
leave you in a weird situation where these entries will be zapped... maybe
we need a patch to make MADV_GUARD_REMOVE check VM_LOCKED and in this case
also populate?

Maybe? Or we say that it behaves like MADV_DONTNEED_LOCKED.

Actually I think the simpler option is to just disallow MADV_GUARD_REMOVE
if you since locked the range? The code currently allows this on the
proviso that 'you aren't zapping locked mappings' but leaves the VMA in a
state such that some entries would not be locked.

It'd be pretty weird to lock guard regions like this.

Having said all that, given what you say below, maybe it's not an issue
after all?...

__mm_populate() would skip whole VMAs in case populate_vma_page_range()
fails. And I would assume populate_vma_page_range() fails on the first
guard when it triggers a page fault.

OTOH, supporting the mlock-on-fault thingy should be easy. That's precisely where
MADV_DONTNEED_LOCKED originates from:

commit 9457056ac426e5ed0671356509c8dcce69f8dee0
Author: Johannes Weiner <hannes@xxxxxxxxxxx>
Date:   Thu Mar 24 18:14:12 2022 -0700

     mm: madvise: MADV_DONTNEED_LOCKED
     MADV_DONTNEED historically rejects mlocked ranges, but with MLOCK_ONFAULT
     and MCL_ONFAULT allowing to mlock without populating, there are valid use
     cases for depopulating locked ranges as well.

...Hm this seems to imply the current guard remove stuff isn't quite so
bad, so maybe the assumption that VM_LOCKED implies 'everything is
populated' isn't quite as stringent then.

Right, with MCL_ONFAULT at least. Without MCL_ONFAULT, the assumption is 
that everything is populated (unless, apparently one uses 
MADV_DONTNEED_LOCKED or population failed, maybe).

VM_LOCKONFAULT seems to be the sane case. I wonder why 
MADV_DONTNEED_LOCKED didn't explicitly check for that one ... maybe 
there is a history to that.

The restriction is as simple as:

		if (behavior != MADV_DONTNEED_LOCKED)
			forbidden |= VM_LOCKED;

Adding support for that would be indeed nice.

I mean it's sort of maybe understandable why you'd want to MADV_DONTNEED
locked ranges, but I really don't understand why you'd want to add guard
regions to mlock()'ed regions?

Somme apps use mlockall(), and it might be nice to just be able to use 
guard pages as if "Nothing happened".

E.g., QEMU has the option to use mlockall().

Then again we're currently asymmetric as you can add them _before_
mlock()'ing...

Right.

--
Cheers,

David / dhildenb