On Tue, Aug 16, 2011 at 4:54 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 15 Aug 2011 15:57:35 -0500
> Will Drewry <wad@xxxxxxxxxxxx> wrote:
>
>> This patch proposes a sysctl knob that allows a privileged user to
>> disable ~VM_MAYEXEC tainting when mapping in a vma from a MNT_NOEXEC
>> mountpoint.  It does not alter the normal behavior resulting from
>> attempting to directly mmap(PROT_EXEC) a vma (-EPERM) nor the behavior
>> of any other subsystems checking MNT_NOEXEC.
>>
>> It is motivated by a common /dev/shm, /tmp usecase.  There are few
>> facilities for creating a shared memory segment that can be remapped in
>> the same process address space with different permissions.  Often, a
>> file in /tmp provides this functionality.  However, on distributions
>> that are more restrictive/paranoid, world-writeable directories are
>> often mounted "noexec".  The only workaround to support software that
>> needs this behavior is to either not use that software or remount /tmp
>> exec.
>
> Remounting /tmp would appear to have the same effect as altering this
> sysctl, so why not just remount /tmp?

The main difference is that you still achieve the primary goals of
noexec without the secondary:
1. exec still fails
2. mmap(PROT_EXEC) still fails

This means that with a common gnu-ish userspace, it's not possible to
execute an arbitrary binary in /tmp or use it as a preload or dlopen()
source.  It's like half-noexec.

>> (E.g., https://bugs.gentoo.org/350336?id=350336)  Given that
>> the only recourse is using SysV IPC, the application programmer loses
>> many of the useful ABI features that they get using a mmap'd file (and
>> as such are often hesitant to explore that more painful path).
>>
>> With this patch, it would be possible to change the sysctl variable
>> such that mprotect(PROT_EXEC) would succeed.
>> In cases like the example
>> above, an additional userspace mmap-wrapper would be needed, but in
>> other cases, like how code.google.com/p/nativeclient mmap()s then
>> mprotect()s, the behavior would be unaffected.
>>
>> The tradeoff is a loss of defense in depth, but it seems reasonable when
>> the alternative is to disable the defense entirely.
>>
>> ...
>>
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -89,6 +89,9 @@
>>  /* External variables not in a header file. */
>>  extern int sysctl_overcommit_memory;
>>  extern int sysctl_overcommit_ratio;
>> +#ifdef CONFIG_MMU
>
> The ifdef isn't needed in the header and we generally omit it to avoid
> clutter.

Thanks - I'll remove it!

> afaict this feature could be made available on NOMMU systems?

When I poked around I didn't see VM_MAYEXEC being used in NOMMU
systems, but I may have just been misreading!  I'll relook.

>> +extern int sysctl_mmap_noexec_taint;
>
> The term "taint" has a specific meaning in the kernel (see
> add_taint()).  It's regrettable that this patch attaches a second
> meaning to that term.  Can we think of a better word to use?
>
> A better word would communicate the sense of the sysctl operation.  If
> a "taint" flag is set to true, I don't know whether that means that
> noexec is enabled or disabled.  Something like
> sysctl_mmap_noexec_override or sysctl_mmap_noexec_disable, perhaps.

Thanks for the good points and suggestions.  Maybe something like
sysctl_mprotect_ignores_noexec would reflect this more closely, though
still not quite as accurately as your examples.  (hrm, maybe
sysctl_mmap_noexec_propagates)

> This patch forgot to document the new feature and its sysctl.
> Documentation/sysctl/vm.txt might be the right place.

I will add that along with the changes from your other comments.

Thanks!
will

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
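For anyone who wants to poke at the mmap()-then-mprotect() pattern
discussed in this thread, here is a minimal userspace sketch.  It is not
part of the patch; the helper name try_mprotect_exec() and the scratch
file name are made up for illustration.  It maps a file from a given
directory read/write and then asks mprotect() for PROT_EXEC, which is
the step that depends on whether VM_MAYEXEC was kept for the vma.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): create a scratch file in
 * `dir`, map it read/write, then try to upgrade the mapping to
 * PROT_EXEC with mprotect().
 *
 * Returns 0 if mprotect() succeeded, the errno value from mprotect()
 * if it failed, or -1 if the setup (file creation, sizing, mmap)
 * failed before mprotect() was reached.
 */
int try_mprotect_exec(const char *dir)
{
    char path[4096];
    long page = sysconf(_SC_PAGESIZE);
    int fd, n, ret = -1;
    void *p;

    if (page <= 0)
        return -1;

    n = snprintf(path, sizeof(path), "%s/noexec-demo-XXXXXX", dir);
    if (n < 0 || n >= (int)sizeof(path))
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);   /* the open descriptor keeps the file alive */

    if (ftruncate(fd, page) == 0) {
        /* Plain read/write mapping: allowed even on a noexec mount. */
        p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            /* The interesting step: request PROT_EXEC after the fact. */
            if (mprotect(p, (size_t)page, PROT_READ | PROT_EXEC) == 0)
                ret = 0;
            else
                ret = errno;
            munmap(p, (size_t)page);
        }
    }
    close(fd);
    return ret;
}
```

Calling try_mprotect_exec("/tmp") on a system where /tmp is mounted
noexec would be expected to return a nonzero errno (likely EACCES) with
current behavior, and 0 when /tmp is mounted exec.  With the proposed
sysctl changed, the noexec case should presumably return 0 as well,
while exec and a direct mmap(PROT_EXEC) keep failing.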