Re: [PATCH v21 1/4] mm: add VM_DROPPABLE for designating always lazily freeable mappings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 7 Jul 2024 at 00:42, David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> But I don't immediately see why MAP_WIPEONFORK and MAP_DONTDUMP have to
> be mmap() flags.

I don't think they have to be mmap() flags, but that said, I think
it's technically the better alternative than saying "you have to
madvise things later".

I very much understand the "we don't have a lot of MAP_xyz flags and
we don't want to waste them" argument, but at the same time

 (a) we _do_ have those flags

 (b) picking a worse interface seems bad

 (c) we could actually use the PROT_xyz bits, which we have a ton of

And yes, (c) is ugly, but is it uglier than "use two system calls to
do one thing"? I mean, "flags" and "prot" are just two sides of the
same coin in the end, the split is kind of arbitrary, and "prot" only
has four bits right now, and one of them is historical and useless,
and actually happens to be *exactly* this kind of MAP_xyz bit.

(In case it's not clear, I'm talking about PROT_SEM, which is very
much a behavioral bit for broken architectures that we've actually
never implemented).

We also have PROT_GROSDOWN and PROT_GROWSUP , which is basically a
"match MAP_GROWSxyz and change the mprotect() limits appropriately"

So I actually think we could use the PROT_xyz bits, and anybody who
says "those are for PROT_READ and PROT_WRITE is already very very
wrong.

Again - not pretty, but mappens to match reality.

> Interestingly, when looking into something comparable in the past I
> stumbled over "vrange" [1], which would have had a slightly different
> semantic (signal on reaccess).

We literally talked about exactly this with Jason, except unlike you I
couldn't find the historical archive (I tried in vain to find
something from lore).

  https://lore.kernel.org/lkml/CAHk-=whRpLyY+U9mkKo8O=2_BXNk=7sjYeObzFr3fGi0KLjLJw@xxxxxxxxxxxxxx/

I do think that a "explicit populate and get a signal on access" is a
very valid model, but I think the "zero on access" is a more
immediately real model.

And we actually have had the "get signal on access" before: that's
what VM_DONTCOPY is.

And it was the *less* useful model, which is why we added
VM_WIPEONCOPY, because that's the semantics people actually wanted.

So I think the "signal on thrown out data access" is interesting, but
not necessarily the *more* interesting case.

And I think if we do want that case, I think having MAP_DROPPABLE have
those semantics for MAP_SHARED would be the way to go. IOW, starting
off with the "zero on next access after drop" case doesn't make it any
harder to then later add a "fault on next access after drop" version.

> There needs to be better reasoning why we have to consume three mmap
> bits for something that can likely be achieved without any.

I think it goes the other way: why are MAP_xyz bits so precious to
make this harder to actually use?

Together with that whole "maybe use PROT_xyz bits instead" discussion?

               Linus




[Index of Archives]     [Kernel]     [Gnu Classpath]     [Gnu Crypto]     [DM Crypt]     [Netfilter]     [Bugtraq]
  Powered by Linux