Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote: > * Theo de Raadt <deraadt@xxxxxxxxxxx> [240122 17:35]: > > Jeff Xu <jeffxu@xxxxxxxxxxxx> wrote: > > > > > On Mon, Jan 22, 2024 at 7:49 AM Theo de Raadt <deraadt@xxxxxxxxxxx> wrote: > > > > > > > > Regarding these pieces > > > > > > > > > The PROT_SEAL bit in prot field of mmap(). When present, it marks > > > > > the map sealed since creation. > > > > > > > > OpenBSD won't be doing this. I had PROT_IMMUTABLE as a draft. In my > > > > research I found basically zero circumstances when you userland does > > > > that. The most common circumstance is you create a RW mapping, fill it, > > > > and then change to a more restrictve mapping, and lock it. > > > > > > > > There are a few regions in the addressspace that can be locked while RW. > > > > For instance, the stack. But the kernel does that, not userland. I > > > > found regions where the kernel wants to do this to the address space, > > > > but there is no need to export useless functionality to userland. > > > > > > > I have a feeling that most apps that need to use mmap() in their code > > > are likely using RW mappings. Adding sealing to mmap() could stop > > > those mappings from being executable. Of course, those apps would > > > need to change their code. We can't do it for them. > > > > I don't have a feeling about it. > > > > I spent a year engineering a complete system which exercises the maximum > > amount of memory you can lock. > > > > I saw nothing like what you are describing. I had PROT_IMMUTABLE in my > > drafts, and saw it turning into a dangerous anti-pattern. > > > > > Also, I believe adding this to mmap() has no downsides, only > > > performance gain, as Pedro Falcato pointed out in [1]. > > > > > > [1] https://lore.kernel.org/lkml/CAKbZUD2A+=bp_sd+Q0Yif7NJqMu8p__eb4yguq0agEcmLH8SDQ@xxxxxxxxxxxxxx/ > > > > Are you joking? You don't have any code doing that today. More feelings? > > The 'no downside" is to combining two calls together; mmap() & mseal(), > at least that is how I read the linked discussion. > > The common case (since there are no users today) of just calling > mmap()/munmap() will have the downside. > > There will be a performance impact once you have can_modify_mm() doing > more than just returning true. Certainly, the impact will be larger > in munmap where multiple VMAs may need to be checked (assuming that's > the plan?). > > This will require a new and earlier walk of the vma tree while holding > the mmap_lock. Since you are checking (potentially multiple) VMAs for > something, I don't think there is a way around holding the lock. > > I'm not saying the cost will be large, but it will be a positive > non-zero number. For future glibc changes, I predict you will have zero cases where you can call mmap+immutable or mprotect+immutable, I say so, because I ended up having none. You always have to fill the memory. (At first glance you might think it works for a new DSO's BSS, but RELRO overlaps it, and since RELRO mprotect happens quite late, the permission locking is quite delayed relative to the allocation). I think chrome also won't lock memory at allocation. I suspect the generic allocator is quite seperate from the code using the allocation, which knows which objects can have their permissions locked and which objects can't. In OpenBSD, the only cases where we could set immutable at the same time as creating the mapping was in execve, for a new process's stack regions, and that is kernel code, not the userland exposed system call APIs. This change could skip adding PROT_MSEAL today, and add it later when there are facts the need. It's the same with MAP_MSEALABLE. I don't get it. So now there are 3 memory types: - cannot be sealed, ever - not yet sealed - sealed What purpose does the first type serve? Please explain the use case. Today, processes have control over their entire address space. What is the purpose of "permissions cannot be locked". Please supply an example. If I am wrong, I'd like to know where I went wrong.