On 09.03.21 10:33, Bruce Merry wrote:
> Hi
> I've run into a problem with using mmap(..., MAP_ANONYMOUS |
> MAP_POPULATE | MAP_HUGETLB). If there are no huge pages available due to
> vm.nr_hugepages (or hugetlb.2MB.rsvd.limit_in_bytes cgroup setting) then
> the mmap call fails and I can gracefully fall back to 4KB pages.
> However, if neither of the above apply but hugetlb.2MB.limit_in_bytes
> prevents pages being mapped, then it appears that MAP_POPULATE is
> silently ignored (according to mincore), and rather than being able to
> gracefully fall back, attempting to use the memory results in SIGBUS.
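
For illustration, a minimal sketch of the check described above: map with
MAP_POPULATE | MAP_HUGETLB, then ask mincore() whether the pages are actually
resident, and fall back to normal pages if they are not. The helper names
(fully_resident, map_populated) are hypothetical, MAP_PRIVATE is an assumption
(the original call does not say private or shared), and len is assumed to be a
multiple of the huge page size. This is not a guaranteed-reliable detection
method, only a way to observe the behaviour being reported.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: 1 if every page in [addr, addr+len) is resident
 * according to mincore(), 0 if at least one is not, -1 on error. */
static int fully_resident(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t n = (len + page - 1) / page;   /* one vec byte per base page */
    unsigned char *vec = malloc(n);
    int ok = 1;

    if (!vec)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < n; i++) {
        if (!(vec[i] & 1)) {              /* bit 0: page is resident */
            ok = 0;
            break;
        }
    }
    free(vec);
    return ok;
}

/* Hypothetical helper: try huge pages first, fall back to normal pages.
 * Sets *used_huge to 1 or 0. Returns NULL if both attempts fail. */
static void *map_populated(size_t len, int *used_huge)
{
    int base = MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   base | MAP_HUGETLB, -1, 0);

    if (p != MAP_FAILED) {
        if (fully_resident(p, len) == 1) {
            *used_huge = 1;
            return p;
        }
        /* mmap succeeded but population did not: touching this mapping
         * could raise SIGBUS, so drop it and fall back. */
        munmap(p, len);
    }

    *used_huge = 0;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, base, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would do something like `void *buf = map_populated(len, &huge);`
and treat a NULL return as an allocation failure.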
I would have imagined that the hugepage reservation would fail. But it
looks like the pages might get reserved, with the actual population
being restricted by cgroups only later.
Huge page reservation is actually pretty weird in some special cases
(including NUMA bindings).
> Is that expected behaviour? I don't see anything in the mmap(2) man page
> about it being best-effort (in contrast to MAP_LOCKED, which explicitly
> says the call won't fail if it can't lock the memory).
I think it has been best-effort forever, just like MAP_LOCKED.
You could use memfd_create() to create an anonymous file backed by huge
pages, then try allocating the backing storage using fallocate(), which
fails in a safe way. You just have to make sure to map it MAP_SHARED
later to avoid nasty side effects with private mappings + fallocate().
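
A minimal sketch of that approach, under a few assumptions: the helper names
(alloc_huge_or_null, alloc_buffer) and the memfd name "hugebuf" are made up
for illustration, len is assumed to be a multiple of the default huge page
size, and the fallback to ordinary 4KB pages is just one possible policy.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U   /* value from <linux/memfd.h>, for old libcs */
#endif

/* Hypothetical helper: returns a huge-page-backed mapping of len bytes,
 * or NULL if the huge pages cannot be allocated. */
static void *alloc_huge_or_null(size_t len)
{
    void *p;
    int fd = memfd_create("hugebuf", MFD_CLOEXEC | MFD_HUGETLB);

    if (fd < 0)
        return NULL;

    /* Allocate the backing huge pages up front. If this is not possible,
     * fallocate() returns an error here instead of the process getting
     * SIGBUS on first access later. */
    if (fallocate(fd, 0, 0, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED so the mapping uses the pages fallocate() just allocated. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                /* the mapping keeps the file alive */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: huge pages if available, else normal pages. */
static void *alloc_buffer(size_t len)
{
    void *p = alloc_huge_or_null(len);

    if (p)
        return p;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Note that fallocate() only allocates the pages in the file; it does not
populate the page tables of the later mapping, so the first access to each
huge page still takes a (minor) fault.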
> This is on Linux 5.8 on Ubuntu 20.04. I can provide sample code if it's
> of interest, or test on a newer kernel if it'll help.
Note that I'm working on a reliable populate mechanism that can also
work on only parts of a mapping, which is especially relevant in
combination with MAP_NORESERVE. Not sure if that applies to your use
case; it sounds like memfd_create() + fallocate() could be good enough,
unless you also really want to have all page tables properly populated
already or really need MAP_PRIVATE.
https://lkml.kernel.org/r/20210308164520.18323-1-david@xxxxxxxxxx
--
Thanks,
David / dhildenb