On 01/02/2021 19:55, Vlastimil Babka wrote:
> On 2/1/21 7:00 PM, Milan Broz wrote:
>> On 01/02/2021 14:08, Vlastimil Babka wrote:
>>> On 1/8/21 3:39 PM, Milan Broz wrote:
>>>> On 08/01/2021 14:41, Michal Hocko wrote:
>>>>> On Wed 06-01-21 16:20:15, Milan Broz wrote:
>>>>>> Hi,
>>>>>>
>>>>>> we use mlockall(MCL_CURRENT | MCL_FUTURE) / munlockall() in cryptsetup code
>>>>>> and someone tried to use it with hardened memory allocator library.
>>>>>>
>>>>>> Execution time was increased to extreme (minutes) and as we found, the problem
>>>>>> is in munlockall().
>>>>>>
>>>>>> Here is a plain reproducer for the core without any external code - it takes
>>>>>> unlocking on Fedora rawhide kernel more than 30 seconds!
>>>>>> I can reproduce it on 5.10 kernels and Linus' git.
>>>>>>
>>>>>> The reproducer below tries to mmap large amount memory with PROT_NONE (later never used).
>>>>>> The real code of course does something more useful but the problem is the same.
>>>>>>
>>>>>> #include <stdio.h>
>>>>>> #include <stdlib.h>
>>>>>> #include <fcntl.h>
>>>>>> #include <sys/mman.h>
>>>>>>
>>>>>> int main (int argc, char *argv[])
>>>>>> {
>>>>>>         void *p = mmap(NULL, 1UL << 41, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
>
> So, this is 2TB memory area, but PROT_NONE means it's never actually populated,
> although mlockall(MCL_CURRENT) should do that. Once you put PROT_READ |
> PROT_WRITE there, the mlockall() starts taking ages.
>
> So does that reflect your use case? munlockall() with large PROT_NONE areas? If
> so, munlock_vma_pages_range() is indeed not optimized for that, but I would
> expect such scenario to be uncommon, so better clarify first.

It is just a simple reproducer of the underlying problem, as suggested here:
https://gitlab.com/cryptsetup/cryptsetup/-/issues/617#note_478342301

We use mlockall() in cryptsetup, and with hardened malloc the unlock slows
down significantly.
(For the real-world case, please read the whole issue report above.)

m.
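
The reproducer quoted above is cut off after the mmap() call. The following
is not the original code, only a minimal sketch of the measurement described
in the thread: map a large PROT_NONE area, call mlockall(), then time
munlockall(). The helper names time_munlockall() and elapsed() are made up
for illustration, and the error handling is an assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

/* Seconds elapsed between two CLOCK_MONOTONIC samples. */
static double elapsed(const struct timespec *a, const struct timespec *b)
{
	return (double)(b->tv_sec - a->tv_sec) +
	       (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/*
 * Hypothetical helper (not from the original mail): map `len` bytes of
 * PROT_NONE anonymous memory, lock all mappings, then time munlockall().
 *
 * Returns 0 and stores the munlockall() duration in *secs on success,
 * -1 if mmap() fails, -2 if mlockall() fails (for example because of
 * RLIMIT_MEMLOCK), -3 if munlockall() fails.
 */
int time_munlockall(size_t len, double *secs)
{
	struct timespec t0, t1;
	int ret = 0;
	void *p;

	assert(secs != NULL);

	/* Same flags as the quoted reproducer; the area is never touched. */
	p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) {
		fprintf(stderr, "mmap: %s\n", strerror(errno));
		return -1;
	}

	if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
		fprintf(stderr, "mlockall: %s\n", strerror(errno));
		ret = -2;
	} else {
		clock_gettime(CLOCK_MONOTONIC, &t0);
		if (munlockall() != 0) {
			fprintf(stderr, "munlockall: %s\n", strerror(errno));
			ret = -3;
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);
		*secs = elapsed(&t0, &t1);
	}

	munmap(p, len);
	return ret;
}
```

Calling time_munlockall(1UL << 41, &secs) corresponds to the 2TB mapping in
the quoted code. Running it unprivileged will most likely need CAP_IPC_LOCK
or a raised RLIMIT_MEMLOCK, otherwise mlockall() is expected to fail and the
helper returns -2 without measuring anything.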