Forgot to mention: I realize my motivation is very specific to Chrome
OS. However, the nolinks option also seemed useful as a generic
mitigation against privilege escalation symlink attacks, in cases
where disabling symlinks/hardlinks is acceptable.

On Fri, Oct 14, 2016 at 5:50 PM, Mattias Nissler <mnissler@xxxxxxxxxxxx> wrote:
> On Fri, Oct 14, 2016 at 5:00 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>>
>> On Fri, Oct 14, 2016 at 03:55:15PM +0100, Al Viro wrote:
>> > > Setting the "nolinks" mount option helps prevent privileged writers
>> > > from modifying files unintentionally in case there is an unexpected
>> > > link along the accessed path. The "nolinks" option is thus useful as a
>> > > defensive measure against persistent exploits (i.e. a system getting
>> > > re-exploited after a reboot) for systems that employ a read-only or
>> > > dm-verity-protected rootfs. These systems prevent non-legit binaries
>> > > from running after reboot. However, legit code typically still reads
>> > > from and writes to a writable file system previously under full
>> > > control of the attacker, who can place symlinks to trick file writes
>> > > after reboot to target a file of their choice. "nolinks" fundamentally
>> > > prevents this.
>> >
>> > Which parts of the tree would be on that "protected" rootfs and which would
>> > you mount with that option? Description above is rather vague and I'm
>> > not convinced that it actually buys you anything. Details, please...
>
> Apologies for the vague description, I'm happy to explain in detail.
>
> In case of Chrome OS, we have all binaries on a dm-verity rootfs, so
> an attacker can't modify any binaries. After reboot, everything except
> the rootfs is mounted noexec, so there's no way to re-gain code
> execution after reboot by modifying existing binaries or dropping new
> ones.
>
> We've seen multiple exploits now where the attacker worked around
> these limitations in two steps:
>
> 1. Before reboot, the attacker sets up symlinks on the writeable file
> system (called "stateful" file system), which are later accessed by
> legit boot code (such as init scripts) after reboot. For example, an
> init script that copies file A to B can be abused by an attacker by
> symlinking or hardlinking B to a location C of their choice, and
> placing desired data to be written to C in A. That gives the attacker
> a primitive to write data of their choice to a path of their choice
> after reboot. Note that this primitive may target locations _outside_
> the stateful file system the attacker previously had control of.
> Particularly of interest are targets on /sys, but also tmpfs on /run
> etc.
>
> 2. The second step for a successful attack is finding some legit code
> invoked in the boot flow that has a vulnerability exploitable by
> feeding it unexpected data. As an example, there are Linux userspace
> utilities that read config from /run which may contain shell commands
> the utility executes, through which the attacker can gain code
> execution again.
>
> The purpose of the proposed patch is to raise the bar for the first
> step of the attack: Writing arbitrary files after reboot. I'm
> intending to mount the stateful file system with the nolinks option
> (or otherwise prevent symlink traversal). This will help make sure
> that any legit writes taking place during boot in init scripts etc. go
> to the files intended by the developer, and can't be redirected by an
> attacker.
>
> Does this make more sense to you?
>
>>
>>
>> PS: what the hell do restrictions on _following_ symlinks have to do with
>> _creating_ hardlinks? I'm trying to imagine a threat model where both would
>> apply or anything else beyond the word "link" they would have in common...
>
> The restriction is not on _creating_ hard links, but _opening_
> hardlinks. The commonality is in the confusion between the file you're
> meaning to write vs. the file you actually end up writing to, which
> stems from the fact that as things stand a file can be accessible on
> other paths than its canonical one. For Chrome OS, I'd like to get to
> a point where most privileged code can only access a file via its
> canonical name (bind mounts are an OK exception as they're not
> persistent, so out of reach for manipulation).
>
>>
>> The one you've described above might have something to do with the first
>> one (modulo missing description of the setup you have in mind), but it
>> clearly has nothing to do with the second - attackers could've created
>> whatever they wanted while the fs had been under their control, after all.
>> Doesn't make sense...
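
For illustration only, here is a rough userspace sketch of the write
primitive from step 1 and of a per-open check that refuses it. This is
not the proposed patch and says nothing about how "nolinks" is
implemented: the helper name open_for_write_no_links, the demo()
function and the temporary paths standing in for B and C are all
hypothetical. It assumes a Linux system where the temporary directory
supports symlinks and hard links. Note that O_NOFOLLOW only guards the
final path component, so symlinks in parent directories are still
followed, which is one reason a restriction applied to the whole mount
looked attractive.

```python
import errno
import os
import tempfile


def open_for_write_no_links(path):
    """Open an existing file for writing, refusing symlinks and extra hard links.

    Hypothetical helper for illustration. O_NOFOLLOW makes open() fail if
    the final path component is a symlink; the st_nlink check rejects files
    that are reachable under more than one name.
    """
    fd = os.open(path, os.O_WRONLY | os.O_NOFOLLOW)
    try:
        if os.fstat(fd).st_nlink > 1:
            raise OSError(errno.EMLINK, "refusing file with multiple hard links", path)
    except OSError:
        os.close(fd)
        raise
    return fd


def demo():
    results = {}
    with tempfile.TemporaryDirectory() as root:
        # "stateful" stands in for the attacker-controlled writable file system.
        stateful = os.path.join(root, "stateful")
        os.mkdir(stateful)

        # C: a file outside "stateful" that the attacker wants overwritten.
        target_c = os.path.join(root, "C")
        with open(target_c, "w") as f:
            f.write("original")

        # B: the path a boot script writes to, replaced by a symlink to C.
        path_b = os.path.join(stateful, "B")
        os.symlink(target_c, path_b)

        # A naive write to B follows the symlink and clobbers C.
        with open(path_b, "w") as f:
            f.write("attacker data")
        with open(target_c) as f:
            results["naive_overwrote_target"] = f.read() == "attacker data"

        # The guarded open refuses the symlink (ELOOP on Linux).
        try:
            os.close(open_for_write_no_links(path_b))
            results["symlink_refused"] = False
        except OSError as e:
            results["symlink_refused"] = e.errno in (errno.ELOOP, errno.EMLINK)

        # Same trick with a hard link: B and C are now the same inode.
        os.unlink(path_b)
        os.link(target_c, path_b)
        try:
            os.close(open_for_write_no_links(path_b))
            results["hardlink_refused"] = False
        except OSError as e:
            results["hardlink_refused"] = e.errno == errno.EMLINK
    return results


if __name__ == "__main__":
    print(demo())
```

Every privileged writer would have to opt in to a check like this, and
it does nothing for intermediate path components. A mount-level
restriction would not depend on each init script getting it right.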