Re: [PATCH] CAPABILITIES: remove undefined caps from all processes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Quoting Kees Cook (keescook@xxxxxxxxxxxx):
> On Wed, Jul 23, 2014 at 1:49 PM, Eric Paris <eparis@xxxxxxxxxx> wrote:
> > On Wed, 2014-07-23 at 13:46 -0700, Andy Lutomirski wrote:
> >> On 07/23/2014 12:36 PM, Eric Paris wrote:
> >> > This is effectively a revert of 7b9a7ec565505699f503b4fcf61500dceb36e744
> >> > plus fixing it a different way...
> >>
> >> You sent something like this a couple days ago.  What changed?
> >
> > right when I sent it I knew I forgot to do the -v2 type stuff.
> >
> > The new portions are fixes 3 and 4 below.  Which consists of masking
> > unknown caps from sys_setcap() and executing files with unknown
> > filecaps.
> 
> It might be good to add some tools/testing/selftests/ items for this?
> Then we can capture the corner cases with in-tree examples.

Hm, there *was* a testsuite in the ltp.  Might be worth dusting that off
and moving it in-tree.

> Regardless:
> 
> Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>
> 
> >
> > -Eric
> >
> >> --Andy
> >>
> >> >
> >> > We found, when trying to run an application from an application which
> >> > had dropped privs that the kernel does security checks on undefined
> >> > capability bits.  This was ESPECIALLY difficult to debug as those
> >> > undefined bits are hidden from /proc/$PID/status.
> >> >
> >> > Consider a root application which drops all capabilities from ALL 4
> >> > capability sets.  We assume, since the application is going to set
> >> > eff/perm/inh from an array that it will clear not only the defined caps
> >> > less than CAP_LAST_CAP, but also the higher 28ish bits which are
> >> > undefined future capabilities.
> >> >
> >> > The BSET gets cleared differently.  Instead it is cleared one bit at a
> >> > time.  The problem here is that in security/commoncap.c::cap_task_prctl()
> >> > we actually check the validity of a capability being read.  So any task
> >> > which attempts to 'read all things set in bset' followed by 'unset all
> >> > things set in bset' will not even attempt to unset the undefined bits
> >> > higher than CAP_LAST_CAP.
> >> >
> >> > So the 'parent' will look something like:
> >> > CapInh:     0000000000000000
> >> > CapPrm:     0000000000000000
> >> > CapEff:     0000000000000000
> >> > CapBnd:     ffffffc000000000
> >> >
> >> > All of this 'should' be fine.  Given that these are undefined bits that
> >> > aren't supposed to have anything to do with permissions.  But they do...
> >> >
> >> > So lets now consider a task which cleared the eff/perm/inh completely
> >> > and cleared all of the valid caps in the bset (but not the invalid caps
> >> > it couldn't read out of the kernel).  We know that this is exactly what
> >> > the libcap-ng library does and what the go capabilities library does.
> >> > They both leave you in that above situation if you try to clear all of
> >> > you capapabilities from all 4 sets.  If that root task calls execve()
> >> > the child task will pick up all caps not blocked by the bset.  The bset
> >> > however does not block bits higher than CAP_LAST_CAP.  So now the child
> >> > task has bits in eff which are not in the parent.  These are
> >> > 'meaningless' undefined bits, but still bits which the parent doesn't
> >> > have.
> >> >
> >> > The problem is now in cred_cap_issubset() (or any operation which does a
> >> > subset test) as the child, while a subset for valid cap bits, is not a
> >> > subset for invalid cap bits!  So now we set durring commit creds that
> >> > the child is not dumpable.  Given it is 'more priv' than its parent.  It
> >> > also means the parent cannot ptrace the child and other stupidity.
> >> >
> >> > The solution here:
> >> > 1) stop hiding capability bits in status
> >> >     This makes debugging easier!
> >> >
> >> > 2) stop giving any task undefined capability bits.  it's simple, it you
> 
> typo: if/if
> 
> >> > don't put those invalid bits in CAP_FULL_SET you won't get them in init
> >> > and you won't get them in any other task either.
> >> >     This fixes the cap_issubset() tests and resulting fallout (which
> >> >     made the init task in a docker container untraceable among other
> >> >     things)
> >> >
> >> > 3) mask out undefined bits when sys_capset() is called as it might use
> >> > ~0, ~0 to denote 'all capabilities' for backward/forward compatibility.
> >> >     This lets 'capsh --caps="all=eip" -- -c /bin/bash' run.
> >> >
> >> > 4) mask out undefined bit when we read a file capability off of disk as
> >> > again likely all bits are set in the xattr for forward/backward
> >> > compatibility.
> >> >     This lets 'setcap all+pe /bin/bash; /bin/bash' run
> >> >
> >> > Signed-off-by: Eric Paris <eparis@xxxxxxxxxx>
> >> > Cc: Andrew Vagin <avagin@xxxxxxxxxx>
> >> > Cc: Andrew G. Morgan <morgan@xxxxxxxxxx>
> >> > Cc: Serge E. Hallyn <serge.hallyn@xxxxxxxxxxxxx>
> >> > Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> >> > Cc: Steve Grubb <sgrubb@xxxxxxxxxx>
> >> > Cc: Dan Walsh <dwalsh@xxxxxxxxxx>
> >> > Cc: stable@xxxxxxxxxxxxxxx
> >> > ---
> >> >  fs/proc/array.c            | 11 +----------
> >> >  include/linux/capability.h |  5 ++++-
> >> >  kernel/audit.c             |  2 +-
> >> >  kernel/capability.c        |  4 ++++
> >> >  security/commoncap.c       |  3 +++
> >> >  5 files changed, 13 insertions(+), 12 deletions(-)
> >> >
> >> > diff --git a/fs/proc/array.c b/fs/proc/array.c
> >> > index 64db2bc..3e1290b 100644
> >> > --- a/fs/proc/array.c
> >> > +++ b/fs/proc/array.c
> >> > @@ -297,15 +297,11 @@ static void render_cap_t(struct seq_file *m, const char *header,
> >> >     seq_puts(m, header);
> >> >     CAP_FOR_EACH_U32(__capi) {
> >> >             seq_printf(m, "%08x",
> >> > -                      a->cap[(_KERNEL_CAPABILITY_U32S-1) - __capi]);
> >> > +                      a->cap[CAP_LAST_U32 - __capi]);
> >> >     }
> >> >     seq_putc(m, '\n');
> >> >  }
> >> >
> >> > -/* Remove non-existent capabilities */
> >> > -#define NORM_CAPS(v) (v.cap[CAP_TO_INDEX(CAP_LAST_CAP)] &= \
> >> > -                           CAP_TO_MASK(CAP_LAST_CAP + 1) - 1)
> >> > -
> >> >  static inline void task_cap(struct seq_file *m, struct task_struct *p)
> >> >  {
> >> >     const struct cred *cred;
> >> > @@ -319,11 +315,6 @@ static inline void task_cap(struct seq_file *m, struct task_struct *p)
> >> >     cap_bset        = cred->cap_bset;
> >> >     rcu_read_unlock();
> >> >
> >> > -   NORM_CAPS(cap_inheritable);
> >> > -   NORM_CAPS(cap_permitted);
> >> > -   NORM_CAPS(cap_effective);
> >> > -   NORM_CAPS(cap_bset);
> >> > -
> >> >     render_cap_t(m, "CapInh:\t", &cap_inheritable);
> >> >     render_cap_t(m, "CapPrm:\t", &cap_permitted);
> >> >     render_cap_t(m, "CapEff:\t", &cap_effective);
> >> > diff --git a/include/linux/capability.h b/include/linux/capability.h
> >> > index 84b13ad..aa93e5e 100644
> >> > --- a/include/linux/capability.h
> >> > +++ b/include/linux/capability.h
> >> > @@ -78,8 +78,11 @@ extern const kernel_cap_t __cap_init_eff_set;
> >> >  # error Fix up hand-coded capability macro initializers
> >> >  #else /* HAND-CODED capability initializers */
> >> >
> >> > +#define CAP_LAST_U32                       ((_KERNEL_CAPABILITY_U32S) - 1)
> >> > +#define CAP_LAST_U32_VALID_MASK            (CAP_TO_MASK(CAP_LAST_CAP + 1) -1)
> >> > +
> >> >  # define CAP_EMPTY_SET    ((kernel_cap_t){{ 0, 0 }})
> >> > -# define CAP_FULL_SET     ((kernel_cap_t){{ ~0, ~0 }})
> >> > +# define CAP_FULL_SET     ((kernel_cap_t){{ ~0, CAP_LAST_U32_VALID_MASK }})
> >> >  # define CAP_FS_SET       ((kernel_cap_t){{ CAP_FS_MASK_B0 \
> >> >                                 | CAP_TO_MASK(CAP_LINUX_IMMUTABLE), \
> >> >                                 CAP_FS_MASK_B1 } })
> >> > diff --git a/kernel/audit.c b/kernel/audit.c
> >> > index 3ef2e0e..ba2ff5a 100644
> >> > --- a/kernel/audit.c
> >> > +++ b/kernel/audit.c
> >> > @@ -1677,7 +1677,7 @@ void audit_log_cap(struct audit_buffer *ab, char *prefix, kernel_cap_t *cap)
> >> >     audit_log_format(ab, " %s=", prefix);
> >> >     CAP_FOR_EACH_U32(i) {
> >> >             audit_log_format(ab, "%08x",
> >> > -                            cap->cap[(_KERNEL_CAPABILITY_U32S-1) - i]);
> >> > +                            cap->cap[CAP_LAST_U32 - i]);
> >> >     }
> >> >  }
> >> >
> >> > diff --git a/kernel/capability.c b/kernel/capability.c
> >> > index a5cf13c..989f5bf 100644
> >> > --- a/kernel/capability.c
> >> > +++ b/kernel/capability.c
> >> > @@ -258,6 +258,10 @@ SYSCALL_DEFINE2(capset, cap_user_header_t, header, const cap_user_data_t, data)
> >> >             i++;
> >> >     }
> >> >
> >> > +   effective.cap[CAP_LAST_U32] &= CAP_LAST_U32_VALID_MASK;
> >> > +   permitted.cap[CAP_LAST_U32] &= CAP_LAST_U32_VALID_MASK;
> >> > +   inheritable.cap[CAP_LAST_U32] &= CAP_LAST_U32_VALID_MASK;
> >> > +
> >> >     new = prepare_creds();
> >> >     if (!new)
> >> >             return -ENOMEM;
> >> > diff --git a/security/commoncap.c b/security/commoncap.c
> >> > index b9d613e..963dc59 100644
> >> > --- a/security/commoncap.c
> >> > +++ b/security/commoncap.c
> >> > @@ -421,6 +421,9 @@ int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data
> >> >             cpu_caps->inheritable.cap[i] = le32_to_cpu(caps.data[i].inheritable);
> >> >     }
> >> >
> >> > +   cpu_caps->permitted.cap[CAP_LAST_U32] &= CAP_LAST_U32_VALID_MASK;
> >> > +   cpu_caps->inheritable.cap[CAP_LAST_U32] &= CAP_LAST_U32_VALID_MASK;
> >> > +
> >> >     return 0;
> >> >  }
> >> >
> >> >
> >>
> >
> >
> 
> Thanks for fixing this!
> 
> -Kees
> 
> -- 
> Kees Cook
> Chrome OS Security
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]