* Yang Shi:

> On Tue, Jan 30, 2024 at 11:53 PM Florian Weimer <fweimer@xxxxxxxxxx> wrote:
>>
>> * Yang Shi:
>>
>> > From: Yang Shi <yang@xxxxxxxxxxxxxxxxxxxxxx>
>> >
>> > The commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
>> > boundaries") incurred regression for stress-ng pthread benchmark [1].
>> > It is because THP get allocated to pthread's stack area much more possible
>> > than before.  Pthread's stack area is allocated by mmap without VM_GROWSDOWN
>> > or VM_GROWSUP flag, so kernel can't tell whether it is a stack area or not.
>> >
>> > The MAP_STACK flag is used to mark the stack area, but it is a no-op on
>> > Linux.  Mapping MAP_STACK to VM_NOHUGEPAGE to prevent from allocating
>> > THP for such stack area.
>>
>> Doesn't this introduce a regression in the other direction, where
>> workloads expect to use a hugepage TLB entry for the stack?
>
> Maybe, it is theoretically possible.  But AFAICT, the real life
> workloads performance usually gets hurt if THP is used for stack.
> Willy has an example:
>
> https://lore.kernel.org/linux-mm/ZYPDwCcAjX+r+g6s@xxxxxxxxxxxxxxxxxxxx/#t
>
> And avoiding THP on stack is not new, VM_GROWSDOWN | VM_GROWSUP areas
> have been applied before, this patch just extends this to MAP_STACK.

If it's *always* beneficial then we should help it along in glibc as
well.  We've started to offer a tunable in response to this observation
(also papered over in OpenJDK):

  Make thread stacks not use huge pages
  <https://bugs.openjdk.org/browse/JDK-8303215>

But this is specifically about RSS usage, and not directly about
reducing TLB misses etc.

Thanks,
Florian
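
For illustration only, here is a minimal C sketch of what the userspace
side of this discussion can look like, assuming Linux with glibc.  It maps
a thread stack with MAP_STACK and also calls madvise(MADV_NOHUGEPAGE),
which asks for the same opt-out explicitly on kernels where MAP_STACK is
still a no-op.  The helper names (alloc_stack_no_thp, run_on_custom_stack,
worker) are hypothetical, and this is not the code that glibc or OpenJDK
actually use.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map an anonymous region intended as a thread stack
 * and ask the kernel not to back it with transparent huge pages. */
static void *alloc_stack_no_thp(size_t size)
{
    void *stack = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (stack == MAP_FAILED)
        return NULL;

    /* Explicit opt-out.  This can fail (for example with EINVAL on a
     * kernel built without THP support); the mapping is still usable,
     * so the result is deliberately ignored in this sketch. */
    (void)madvise(stack, size, MADV_NOHUGEPAGE);
    return stack;
}

/* Trivial thread body: records that it ran on the custom stack. */
static void *worker(void *arg)
{
    int *flag = arg;
    *flag = 1;
    return NULL;
}

/* Hypothetical helper: run worker() on a caller-sized custom stack.
 * Returns 0 on success, -1 if the mapping failed, or a pthread error code. */
static int run_on_custom_stack(size_t size, int *flag)
{
    assert(flag != NULL);

    void *stack = alloc_stack_no_thp(size);
    if (stack == NULL)
        return -1;

    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);
    if (rc == 0) {
        rc = pthread_attr_setstack(&attr, stack, size);
        if (rc == 0)
            rc = pthread_create(&tid, &attr, worker, flag);
        if (rc == 0)
            rc = pthread_join(tid, NULL);
        pthread_attr_destroy(&attr);
    }

    /* The thread has been joined (or was never created), so the stack
     * can be released. */
    munmap(stack, size);
    return rc;
}
```

Calling run_on_custom_stack(8u << 20, &flag) would run the worker on an
8 MiB stack.  With the patch quoted above, the MAP_STACK flag alone would
be enough and the madvise() call becomes redundant but harmless.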