On Mon, Aug 16, 2021 at 8:51 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> On Thu, Aug 5, 2021 at 10:48 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> > On Wed, Jul 28, 2021 at 10:03 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> [...]
> > > diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> > > index 0a5ce001609b..c8db8a3432e4 100644
> > > --- a/security/selinux/ss/services.c
> > > +++ b/security/selinux/ss/services.c
> > > @@ -2369,6 +2369,43 @@ err_policy:
> > >          return rc;
> > >  }
> > >
> > > +/**
> > > + * ocontext_to_sid - Helper to safely get sid for an ocontext
> > > + * @sidtab: SID table
> > > + * @c: ocontext structure
> > > + * @index: index of the context entry (0 or 1)
> > > + * @out_sid: pointer to the resulting SID value
> > > + *
> > > + * For all ocontexts except OCON_ISID the SID fields are populated
> > > + * on-demand when needed. Since updating the SID value is an SMP-sensitive
> > > + * operation, this helper must be used to do that safely.
> > > + *
> > > + * WARNING: This function may return -ESTALE, indicating that the caller
> > > + * must retry the operation after re-acquiring the policy pointer!
> > > + */
> > > +static int ocontext_to_sid(struct sidtab *sidtab, struct ocontext *c,
> > > +                          size_t index, u32 *out_sid)
> > > +{
> > > +        int rc;
> > > +        u32 sid;
> > > +
> > > +        /* Ensure the associated sidtab entry is visible to this thread. */
> > > +        sid = smp_load_acquire(&c->sid[index]);
> >
> > Is there a reason why you decided to use smp_{load,store} to guard
> > ocontext.sid[] as opposed to RCU/spinlock? RCU would allow us to
> > avoid the memory barrier in smp_load_acquire() on every call to
> > ocontext_to_sid(), and it looks like all of the non-exclusive
> > callers are already in an RCU-protected section, so the additional
> > impact on an already cached value should be next to nothing. The
> > spinlock would make things slightly more complicated (take the lock,
> > recheck ocontext.sid, set/bail, unlock, etc.) but we are already in
> > the slow path at that point.
>
> I don't see any sane way to use RCU here - the implicit data
> dependency that the memory barrier is guarding here is between the
> sid field(s) and the memory regions in sidtab that hold the struct
> context corresponding to the SID stored in the field. You'd have to
> put both the SID value and the sidtab pointer behind some dynamically
> allocated structure, and that would just be horrible...

I'm jumping between the io_uring stuff, stacking/audit issues, and now
this today, and I fear that the context switches have taken their toll
already; I haven't resolved this in my own mind yet. However, since
this patch has been languishing in inboxes with a week or so between
replies, I wanted to at least impart this comment today ...

You need to start thinking more about how this code is going to be
maintained long term. To clarify, when I say "long term" I'm talking
about +10 years from now, when most everyone involved has forgotten
the subtle data dependencies you're talking about now. While memory
barriers are sometimes necessary due to the unavoidable complexity in
the kernel, they shouldn't be an excuse for not properly crafting, and
synchronizing access to, data structures.

My initial take two weeks ago was that this patch has some nice ideas,
e.g. the consolidation into ocontext_to_sid(), but the use of memory
barriers as a way of synchronizing the cached SID and the underlying
context was just not a good first option (especially considering the
minimal comments explaining the need for the barriers). Maybe we come
back to this solution eventually, but I'm really hoping there is a
better way, and your commit description made no mention of what other
options you may have explored. Please try to come up with a better
solution that leverages proper locking primitives, and if that isn't
possible please explain (in detail) why.

> I assume that by using spinlock you mean something like:
>
>   sid = READ_ONCE(c->sid);
>   if (!sid) {
>     spin_lock(...);
>     sidtab_context_to_sid(..., &sid);
>     WRITE_ONCE(c->sid, sid);
>     spin_unlock(...);
>   }
>
> ...because taking the spinlock every time would obviously be less
> efficient than this patch :)

I didn't think I needed to clarify that, but yes, that is what I
meant.
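For the archives, here is what that looks like with the recheck I
mentioned above folded in - a completely untested sketch, where
"ocon_lock" is just a made-up placeholder for whatever lock we would
actually end up using:

  static DEFINE_SPINLOCK(ocon_lock);

  static int ocontext_to_sid(struct sidtab *sidtab, struct ocontext *c,
                             size_t index, u32 *out_sid)
  {
          int rc = 0;
          u32 sid;

          /* Fast path: the SID has already been computed and cached. */
          sid = READ_ONCE(c->sid[index]);
          if (!sid) {
                  spin_lock(&ocon_lock);
                  /* Recheck under the lock in case we lost the race. */
                  sid = c->sid[index];
                  if (!sid) {
                          rc = sidtab_context_to_sid(sidtab,
                                                     &c->context[index],
                                                     &sid);
                          if (!rc)
                                  WRITE_ONCE(c->sid[index], sid);
                  }
                  spin_unlock(&ocon_lock);
          }

          if (!rc)
                  *out_sid = sid;
          return rc;
  }

Whether the lock-free read on the fast path can really be a plain
READ_ONCE(), or still needs acquire semantics to pair with the store
that publishes the sidtab entry, is of course the data dependency
question you raised above.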
--
paul moore
www.paul-moore.com