From: John Hubbard <jhubbard@xxxxxxxxxx> -- Expand the documentation to discuss the hazards in enough detail to allow avoiding them. -- Mention the upcoming MAP_FIXED_SAFE flag. -- Enhance the alignment requirement slightly. CC: Michael Ellerman <mpe@xxxxxxxxxxxxxx> CC: Jann Horn <jannh@xxxxxxxxxx> CC: Matthew Wilcox <willy@xxxxxxxxxxxxx> CC: Michal Hocko <mhocko@xxxxxxxxxx> CC: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx> CC: Cyril Hrubis <chrubis@xxxxxxx> CC: Michal Hocko <mhocko@xxxxxxxx> CC: Pavel Machek <pavel@xxxxxx> Signed-off-by: John Hubbard <jhubbard@xxxxxxxxxx> --- Changes since v4: -- v2 ("mmap.2: MAP_FIXED is no longer discouraged") was applied already, so v5 is a merge, including rewording of the paragraph transitions. -- We seem to have consensus about what to say about alignment now, and this includes that new wording. Changes since v3: -- Removed the "how to use this safely" part, and the SHMLBA part, both as a result of Michal Hocko's review. -- A few tiny wording fixes, at the not-quite-typo level. Changes since v2: -- Fixed up the "how to use safely" example, in response to Mike Rapoport's review. -- Changed the alignment requirement from system page size, to SHMLBA. This was inspired by (but not yet recommended by) Cyril Hrubis' review. -- Formatting: underlined /proc/<pid>/maps Changes since v1: -- Covered topics recommended by Matthew Wilcox and Jann Horn, in their recent review: the hazards of overwriting pre-exising mappings, and some notes about how to use MAP_FIXED safely. -- Rewrote the commit description accordingly. man2/mmap.2 | 32 ++++++++++++++++++++++++++++++-- 1 file changed, 30 insertions(+), 2 deletions(-) diff --git a/man2/mmap.2 b/man2/mmap.2 index a5a8eb47a..400cfda2d 100644 --- a/man2/mmap.2 +++ b/man2/mmap.2 @@ -212,8 +212,9 @@ Don't interpret .I addr as a hint: place the mapping at exactly that address. .I addr -must be a multiple of the page size. -If the memory region specified by +must be suitably aligned: for most architectures a multiple of page +size is sufficient; however, some architectures may impose additional +restrictions. If the memory region specified by .I addr and .I len @@ -226,6 +227,33 @@ Software that aspires to be portable should use this option with care, keeping in mind that the exact layout of a process' memory map is allowed to change significantly between kernel versions, C library versions, and operating system releases. +.IP +Furthermore, this option is extremely hazardous (when used on its own), because +it forcibly removes pre-existing mappings, making it easy for a multi-threaded +process to corrupt its own address space. +.IP +For example, thread A looks through +.I /proc/<pid>/maps +and locates an available +address range, while thread B simultaneously acquires part or all of that same +address range. Thread A then calls mmap(MAP_FIXED), effectively overwriting +the mapping that thread B created. +.IP +Thread B need not create a mapping directly; simply making a library call +that, internally, uses +.I dlopen(3) +to load some other shared library, will +suffice. The dlopen(3) call will map the library into the process's address +space. Furthermore, almost any library call may be implemented using this +technique. +Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries +(http://www.linux-pam.org). +.IP +Newer kernels +(Linux 4.16 and later) have a +.B MAP_FIXED_SAFE +option that avoids the corruption problem; if available, MAP_FIXED_SAFE +should be preferred over MAP_FIXED. .TP .B MAP_GROWSDOWN This flag is used for stacks. -- 2.15.1 -- To unsubscribe from this list: send the line "unsubscribe linux-api" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html