Re: [PATCH] [RFC] mm: mmap: Allow mmap(MAP_STACK) to map growable stack

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/12/24 03:08, Yang Shi wrote:
On Wed, Sep 11, 2024 at 5:50 PM Helge Deller <deller@xxxxxx> wrote:

On 9/12/24 01:05, Liam R. Howlett wrote:
* Yang Shi <shy828301@xxxxxxxxx> [240911 18:16]:
On Wed, Sep 11, 2024 at 12:49 PM Liam R. Howlett
<Liam.Howlett@xxxxxxxxxx> wrote:

* Helge Deller <deller@xxxxxxxxxx> [240911 15:20]:
This is a RFC to change the behaviour of mmap(MAP_STACK) to be
sufficient to map memory for usage as stack on all architectures.
Currently MAP_STACK is a no-op on Linux, and instead MAP_GROWSDOWN
has to be used.
To clarify, here is the relevant info from the mmap() man page:

MAP_GROWSDOWN
     This flag is used for stacks. It indicates to the kernel virtual
     memory system that the mapping should extend downward in memory.  The
     return address is one page lower than the memory area that is
     actually created in the process's virtual address space.  Touching an
     address in the "guard" page below the mapping will cause the mapping
     to grow by a page. This growth can be repeated until the mapping
     grows to within a page of the high end of the next lower mapping,
     at which point touching the "guard" page will result in a SIGSEGV
     signal.

MAP_STACK (since Linux 2.6.27)
     Allocate the mapping at an address suitable for a process or thread
     stack.

     This flag is currently a no-op on Linux. However, by employing this
     flag, applications can ensure that they transparently obtain support
     if the flag is implemented in the future. Thus, it is used in the
     glibc threading implementation to allow for the fact that
     some architectures may (later) require special treatment for
     stack allocations. A further reason to employ this flag is
     portability: MAP_STACK exists (and has an effect) on some
     other systems (e.g., some of the BSDs).

The reason to suggest this change is, that on the parisc architecture the
stack grows upwards. As such, using solely the MAP_GROWSDOWN flag will not
work. Note that there exists no MAP_GROWSUP flag.
By changing the behaviour of MAP_STACK to mark the memory area with the
VM_STACK bit (which is VM_GROWSUP or VM_GROWSDOWN depending on the
architecture) the MAP_STACK flag does exactly what people would expect on
all platforms.

This change should have no negative side-effect, as all code which
used mmap(MAP_GROWSDOWN | MAP_STACK) still work as before.

Signed-off-by: Helge Deller <deller@xxxxxx>

diff --git a/include/linux/mman.h b/include/linux/mman.h
index bcb201ab7a41..66bc72a0cb19 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -156,6 +156,7 @@ calc_vm_flag_bits(unsigned long flags)
        return _calc_vm_trans(flags, MAP_GROWSDOWN,  VM_GROWSDOWN ) |
               _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    ) |
               _calc_vm_trans(flags, MAP_SYNC,       VM_SYNC      ) |
+            _calc_vm_trans(flags, MAP_STACK,      VM_STACK     ) |

Right now MAP_STACK can be used to set VM_NOHUGEPAGE, but this will
change the user interface to create a vma that will grow.  I'm not
entirely sure this is okay?

AFAICT, I don't see this is a problem. Currently huge page also skips
the VMAs with VM_GROWS* flags set. See vma_is_temporary_stack().
__thp_vma_allowable_orders() returns 0 if the vma is a temporary
stack.

If someone is using MAP_STACK to avoid having a huge page, they will
also get a mapping that grows - which is different than what happens
today.

I'm not saying that's right, but someone could be abusing the existing
flag and this will change the behaviour.

Wouldn't a plain mmap() followed by madvise(MADV_NOHUGEPAGE) do exactly that?
Why abusing MAP_STACK for that?

Different sources and reports showed having huge pages for stack
mapping hurts performance. A lot of applications, for example, pthread
lib, allocate stack with MAP_STACK and they don't call MADV_NOHUGEPAGE
on stack mapping.

That's true, and my patch does not change the behaviour wrt huge pages.
Using MAP_STACK still tags the area VM_NOHUGEPAGE. See below...


That is mmap(MAP_STACK) would set VM_NOHUGEPAGE right now, with this
change you'd get VM_NOHUGEPAGE | VM_GROWS<something>

               _calc_vm_trans(flags, MAP_STACK,      VM_NOHUGEPAGE) |
               arch_calc_vm_flag_bits(flags);
   }







[Index of Archives]     [Linux SoC]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux