On Mon, Apr 16, 2018 at 9:18 PM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> On Mon 16-04-18 15:55:36, Jann Horn wrote:
>> On Mon, Apr 16, 2018 at 12:07 PM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>> > On Fri 13-04-18 18:17:36, Jann Horn wrote:
>> >> On Fri, Apr 13, 2018 at 6:05 PM, Jann Horn <jannh@xxxxxxxxxx> wrote:
>> >> > On Fri, Apr 13, 2018 at 6:04 PM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>> >> >> On Fri 13-04-18 17:04:09, Jann Horn wrote:
>> >> >>> On Fri, Apr 13, 2018 at 8:49 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>> >> >>> > On Fri 13-04-18 08:43:27, Michael Kerrisk wrote:
>> >> >>> > [...]
>> >> >>> >> So, you mean remove this entire paragraph:
>> >> >>> >>
>> >> >>> >>     For cases in which the specified memory region has not been
>> >> >>> >>     reserved using an existing mapping, newer kernels (Linux
>> >> >>> >>     4.17 and later) provide an option MAP_FIXED_NOREPLACE that
>> >> >>> >>     should be used instead; older kernels require the caller to
>> >> >>> >>     use addr as a hint (without MAP_FIXED) and take appropriate
>> >> >>> >>     action if the kernel places the new mapping at a different
>> >> >>> >>     address.
>> >> >>> >>
>> >> >>> >> It seems like some version of the first half of the paragraph is worth
>> >> >>> >> keeping, though, so as to point the reader in the direction of a remedy.
>> >> >>> >> How about replacing that text with the following:
>> >> >>> >>
>> >> >>> >>     Since Linux 4.17, the MAP_FIXED_NOREPLACE flag can be used
>> >> >>> >>     in a multithreaded program to avoid the hazard described
>> >> >>> >>     above.
>> >> >>> >
>> >> >>> > Yes, that sounds reasonable to me.
>> >> >>>
>> >> >>> But that kind of sounds as if you can't avoid it before Linux 4.17,
>> >> >>> when actually, you just have to call mmap() with the address as hint,
>> >> >>> and if mmap() returns a different address, munmap() it and go on your
>> >> >>> normal error path.
>> >> >>
>> >> >> This is still racy in multithreaded application which is the main point
>> >> >> of the whole section, no?
>> >> >
>> >> > No, it isn't.
>> >
>> > I could have been more specific, sorry.
>> >
>> >> mmap() with a hint (without MAP_FIXED) will always non-racily allocate
>> >> a memory region for you or return an error code. If it does allocate a
>> >> memory region, it belongs to you until you deallocate it. It might be
>> >> at a different address than you requested -
>> >
>> > Yes, this all is true. Except the atomicity is guaranteed only for the
>> > syscall. Once you return to the userspace any error handling is error
>> > prone and racy because your mapping might change under you feet. So...
>>
>> Can you please elaborate on why you think anything could change the
>> mapping returned by mmap() under the caller's feet?
>
> Because as soon as the mmap_sem is dropped then any other thread can
> modify the shared address space.
>
>> When mmap() returns a memory area to the caller, that memory area
>> belongs to the caller. No unrelated code will touch it, unless that
>> code is buggy.
>
> Yes, reasonably well written application will not have this problem.
> That, however, requires an external synchronization and that's why
> called it error prone and racy. I guess that was the main motivation for
> that part of the man page.

What requires external synchronization? I still don't understand at
all what you're talking about.

The following code:

void *try_to_alloc_addr(void *hint, size_t len) {
  char *x = mmap(hint, len, ...);
  if (x == MAP_FAILED) return NULL;
  if (x == hint) return x;
  munmap(x, len);
  return NULL;
}

has no need for any form of external synchronization. You can call it in
library code, you can call it in a multithreaded process, you can call
it wherever and it should be safe. mmap() atomically reserves
previously unallocated memory, and nothing else should be touching that
memory until it is released again using munmap().
(Just like malloc(): When you call malloc(), you get a chunk of memory
that is reserved just for you, and nobody else will scribble over it
until you call free().)
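
For concreteness, here is a minimal sketch of the hint-based approach with
the elided mmap() arguments filled in. The protection bits and flags
(PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, fd -1, offset 0) and
the len > 0 precondition are assumptions chosen for illustration; they are
not taken from the thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Try to get `len` bytes of anonymous memory exactly at `hint`.
 * Returns `hint` on success and NULL otherwise.
 * ASSUMPTION: protection and flags are illustrative; the snippet
 * in the message above leaves them elided.
 */
static void *try_to_alloc_addr(void *hint, size_t len)
{
    assert(len > 0); /* mmap() rejects a zero length with EINVAL */

    /* No MAP_FIXED: `hint` is only a suggestion, so an existing
     * mapping at that address is never replaced. */
    void *x = mmap(hint, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (x == MAP_FAILED)
        return NULL;
    if (x == hint)
        return x;

    /* The kernel placed the mapping elsewhere. That region was
     * reserved by our own mmap() call, so releasing it here does
     * not touch memory belonging to anyone else. */
    munmap(x, len);
    return NULL;
}
```

If hint is already occupied by another mapping, the call should return NULL
and leave that mapping untouched, which is the property under discussion.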