On 24/11/2022 02.18, Jason A. Donenfeld wrote:
> Hi Rasmus,
>
> On Wed, Nov 23, 2022 at 09:51:04AM +0100, Rasmus Villemoes wrote:
>> On 21/11/2022 16.29, Jason A. Donenfeld wrote:
>>
>> Cc += linux-api
>>
>>>
>>>       if (!new_block)
>>>               goto out;
>>>       new_cap = grnd_allocator.cap + num;
>>>       new_states = reallocarray(grnd_allocator.states, new_cap, sizeof(*grnd_allocator.states));
>>>       if (!new_states) {
>>>               munmap(new_block, num * size_per_each);
>>
>> Hm. This does leak an implementation detail of vgetrandom_alloc(),
>> namely that it is based on mmap() of that size rounded up to page size.
>> Do we want to commit to this being the proper way of disposing of a
>> successful vgetrandom_alloc(), or should there also be a
>> vgetrandom_free(void *states, long num, long size_per_each)?
>
> Yes, this is intentional, and this is exactly what I wanted to do. There
> are various wrappers of vm_mmap() throughout, mmap being one of them,
> and they typically then resort to munmap to unmap it. This is how
> userspace handles memory - maps, always maps. So I think doing that is
> fine and consistent.

OK. Perhaps for the benefit of future libc implementors drop a comment
somewhere as to how to dealloc the blob.

> However, your point about it relying on it being a rounded up size isn't
> correct. `munmap` will unmap the whole page if the size you pass lies
> within a page. So `num * size_of_each` will always do the right thing,
> without needing to have userspace code round anything up. (From the man
> page: "The address addr must be a multiple of the page size (but length
> need not be).

I know, and I never said userspace needed to round anything up.

> All pages containing a part of the indicated range are
> unmapped.") And as you can see in my example code, nothing is rounded
> up. So I don't know why you made that comment.

I made that comment because it's clear from what this does that you get
something back that is _at least_ num*size_per_each in size, but what is
not clear is that the actual allocation is exactly and will always be
that size rounded up to a page size (and no more), so that
munmap(num*size_per_each), with its well-known and documented semantics,
will DTRT.

> I think adding more control is exactly what this is trying to avoid.
> It's very intentionally *not* a general allocator function, but
> something specific for vDSO getrandom(). However, it does already, in
> this very patchset here, take a (currently unused) flags argument, in
> case we have the need for later extension.

OK. Perhaps you can spend a few more words on why this allocation
_needs_ to be MAP_LOCKED? That seems somewhat of a policy thing imposed
by the kernel, something that would be better left to the libc or distro
or whatnot to request via a flag. I could imagine applications that
currently run at the mlock limit start failing after a libc upgrade -
which could of course be considered a libc problem, and perhaps it's too
unlikely to worry about.

Rasmus
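
For reference, the munmap() semantics discussed above can be checked with
a small Linux-only sketch. It is not taken from the patchset: a plain
anonymous mmap() stands in for vgetrandom_alloc(), and the helper names
round_up_to_page() and pages_left_after_unrounded_munmap() are made up
for illustration. It only demonstrates the documented behaviour that an
unrounded length unmaps every page touching the range, under the
assumption that the allocation is exactly the page-rounded size and no
more, which is the point in question.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to a whole number of pages (hypothetical helper). */
static size_t round_up_to_page(size_t len, size_t page)
{
	return (len + page - 1) / page * page;
}

/*
 * Map len rounded up to page size, munmap() only len bytes, then count
 * how many of the originally mapped pages are still mapped.
 * Returns that count, or -1 on error. Hypothetical helper.
 */
static long pages_left_after_unrounded_munmap(size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t mapped = round_up_to_page(len, page);
	unsigned char vec;
	long left = 0;
	char *p;

	assert(len > 0);

	/* Stand-in for vgetrandom_alloc(): anonymous private mapping. */
	p = mmap(NULL, mapped, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Deliberately pass the unrounded length. */
	if (munmap(p, len) != 0) {
		munmap(p, mapped);
		return -1;
	}

	/* mincore() fails with ENOMEM on pages that are no longer mapped. */
	for (size_t off = 0; off < mapped; off += page) {
		if (mincore(p + off, page, &vec) == 0)
			left++;
		else if (errno != ENOMEM)
			return -1;
	}
	return left;
}
```

Calling pages_left_after_unrounded_munmap(num * size_per_each) with any
nonzero product should return 0 on Linux, i.e. nothing of the mapping is
left behind. It says nothing about what happens if the kernel ever hands
back more than the page-rounded size.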