Re: [PATCH v12 10/12] namei: aggressively check for nd->root escape on ".." resolution

Aleksa Sarai <cyphar@xxxxxxxxxx> · Thu, 5 Sep 2019 07:48:56 +1000

On 2019-09-04, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
On Wed, Sep 4, 2019 at 1:23 PM Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
This patch allows for LOOKUP_BENEATH and LOOKUP_IN_ROOT to safely permit
".." resolution (in the case of LOOKUP_BENEATH the resolution will still
fail if ".." resolution would resolve a path outside of the root --
while LOOKUP_IN_ROOT will chroot(2)-style scope it). Magic-link jumps
are still disallowed entirely because now they could result in
inconsistent behaviour if resolution encounters a subsequent ".."[*].

This is the only patch in the series that makes me go "umm".

Why is it ok to re-initialize m_seq, which is used by other things
too? I think it's because we're out of RCU lookup, but there's no
comment about it, and it looks iffy to me. I'd rather have a separate
sequence count that doesn't have two users with different lifetime
rules.

Yeah, the reasoning was that it's because we're out of RCU lookup and if
we didn't re-grab ->m_seq we'd hit path_is_under() on every subsequent
".." (even though we've checked that it's safe). But yes, I should've
used a different field to avoid confusion (and stop it looking
unnecessarily dodgy). I will fix that.

But even apart from that, I think from a "patch continuity" standpoint
it would be better to introduce the sequence counts as just an error
condition first - iow, not have the "path_is_under()" check, but just
return -EXDEV if the sequence number doesn't match.

Ack, will do.

So you'd have three stages:

 1) ".." always returns -EXDEV

 2) ".." returns -EXDEV if there was a concurrent rename/mount

 3) ".." returns -EXDEV if there was a concurrent rename/mount and we
reset the sequence numbers and check if you escaped.

becasue the sequence number reset really does make me go "hmm", plus I
get this nagging little feeling in the back of my head that you can
cause nasty O(n^2) lookup cost behavior with deep paths, lots of "..",
and repeated path_is_under() calls.

The reason for doing the concurrent-{rename,mount} checks was to try to
avoid the O(n^2) in most cases, but you're right that if you have an
attacker that is spamming renames (or you're on a box with a lot of
renames and/or mounts going on *anywhere*) you will hit an O(n^2) here
(more pedantically, O(m*n) but who's counting?).

Unfortunately, I'm not sure what the best solution would be for this
one. If -EAGAIN retries are on the table, we could limit how many times
we're willing to do path_is_under() and then just return -EAGAIN.

So (1) sounds safe. (2) sounds simple. And (3) is where I think subtle
things start happening.

Also, I'm not 100% convinced that (3) is needed at all. I think the
retry could be done in user space instead, which needs to have a
fallback anyway. Yes? No?

Hinting to userspace to do a retry (with -EAGAIN as you mention in your
other mail) wouldn't be a bad thing at all, though you'd almost
certainly get quite a few spurious -EAGAINs -- &{mount,rename}_lock are
global for the entire machine, after all.

But if the only significant roadblock is that (3) seems a bit too hairy,
I would be quite happy with landing (2) as a first step (with -EAGAIN).

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
Attachment:
signature.asc

Description: PGP signature