On Mon, Apr 5, 2021 at 5:11 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote: > > Since commit 1b8b31a2e612 ("selinux: convert policy read-write lock to > RCU"), there is a small window during policy load where the new policy > pointer has already been installed, but some threads may still be > holding the old policy pointer in their read-side RCU critical sections. > This means that there may be conflicting attempts to add a new SID entry > to both tables via sidtab_context_to_sid(). > > See also (and the rest of the thread): > https://lore.kernel.org/selinux/CAFqZXNvfux46_f8gnvVvRYMKoes24nwm2n3sPbMjrB8vKTW00g@xxxxxxxxxxxxxx/ > > Fix this by installing the new policy pointer under the old sidtab's > spinlock along with marking the old sidtab as "frozen". Then, if an > attempt to add new entry to a "frozen" sidtab is detected, make > sidtab_context_to_sid() return -ESTALE to indicate that a new policy > has been installed and that the caller will have to abort the policy > transaction and try again after re-taking the policy pointer (which is > guaranteed to be a newer policy). This requires adding a retry-on-ESTALE > logic to all callers of sidtab_context_to_sid(), but fortunately these > are easy to determine and aren't that many. > > This seems to be the simplest solution for this problem, even if it > looks somewhat ugly. Note that other places in the kernel (e.g. > do_mknodat() in fs/namei.c) use similar stale-retry patterns, so I think > it's reasonable. > > Fixes: 1b8b31a2e612 ("selinux: convert policy read-write lock to RCU") > Signed-off-by: Ondrej Mosnacek <omosnace@xxxxxxxxxx> > --- > security/selinux/ss/services.c | 145 ++++++++++++++++++++++++++------- > security/selinux/ss/sidtab.c | 21 +++++ > security/selinux/ss/sidtab.h | 4 + > 3 files changed, 139 insertions(+), 31 deletions(-) I'm glad to see the retry approach here, that looks like a good solution to me. I did have a few comments below. > diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c > index 08ac1d01c743..50abcfcdf242 100644 > --- a/security/selinux/ss/services.c > +++ b/security/selinux/ss/services.c > @@ -1736,6 +1746,7 @@ static int security_compute_sid(struct selinux_state *state, > goto out; > } > > +retry: > context_init(&newcontext); > > rcu_read_lock(); > @@ -1880,6 +1891,11 @@ static int security_compute_sid(struct selinux_state *state, > } > /* Obtain the sid for the context. */ > rc = sidtab_context_to_sid(sidtab, &newcontext, out_sid); > + if (rc == -ESTALE) { > + rcu_read_unlock(); > + context_destroy(&newcontext); Do we also need to reset 'cladatum' to NULL here as well? > + goto retry; > + } > out_unlock: > rcu_read_unlock(); > context_destroy(&newcontext); ... > @@ -2510,7 +2551,7 @@ int security_netif_sid(struct selinux_state *state, > struct selinux_policy *policy; > struct policydb *policydb; > struct sidtab *sidtab; > - int rc = 0; > + int rc; > struct ocontext *c; > > if (!selinux_initialized(state)) { > @@ -2518,6 +2559,8 @@ int security_netif_sid(struct selinux_state *state, > return 0; > } > > +retry: > + rc = 0; > rcu_read_lock(); > policy = rcu_dereference(state->policy); > policydb = &policy->policydb; > @@ -2534,10 +2577,18 @@ int security_netif_sid(struct selinux_state *state, > if (!c->sid[0] || !c->sid[1]) { > rc = sidtab_context_to_sid(sidtab, &c->context[0], > &c->sid[0]); > + if (rc == -ESTALE) { > + rcu_read_unlock(); > + goto retry; > + } > if (rc) > goto out; > rc = sidtab_context_to_sid(sidtab, &c->context[1], > &c->sid[1]); > + if (rc == -ESTALE) { > + rcu_read_unlock(); > + goto retry; > + } If the first sidtab_context_to_sid() ran successfully we are not going to see -ESTALE here, correct? Assuming yes, perhaps we can drop this -ESTALE check with a comment about how a frozen sidtab will be caught by the call above. > if (rc) > goto out; > } ... > @@ -2676,18 +2732,20 @@ int security_get_user_sids(struct selinux_state *state, > struct sidtab *sidtab; > struct context *fromcon, usercon; > u32 *mysids = NULL, *mysids2, sid; > - u32 mynel = 0, maxnel = SIDS_NEL; > + u32 i, j, mynel, maxnel = SIDS_NEL; > struct user_datum *user; > struct role_datum *role; > struct ebitmap_node *rnode, *tnode; > - int rc = 0, i, j; > + int rc; > > *sids = NULL; > *nel = 0; > > if (!selinux_initialized(state)) > - goto out; > + return 0; > > +retry: > + mynel = 0; > rcu_read_lock(); > policy = rcu_dereference(state->policy); > policydb = &policy->policydb; > @@ -2723,6 +2781,10 @@ int security_get_user_sids(struct selinux_state *state, > continue; > > rc = sidtab_context_to_sid(sidtab, &usercon, &sid); > + if (rc == -ESTALE) { > + rcu_read_unlock(); Don't we need to free and reset 'mysids' here? 'mynel' and 'maxnel' are probably less critical since the -ESTALE condition should happen before they are ever modified, but a constant assignment is pretty cheap so it may make sense to reset those as well. > + goto retry; > + } > if (rc) > goto out_unlock; > if (mynel < maxnel) { > @@ -2745,14 +2807,14 @@ out_unlock: > rcu_read_unlock(); > if (rc || !mynel) { > kfree(mysids); > - goto out; > + return rc; > } > > rc = -ENOMEM; > mysids2 = kcalloc(mynel, sizeof(*mysids2), GFP_KERNEL); > if (!mysids2) { > kfree(mysids); > - goto out; > + return rc; > } > for (i = 0, j = 0; i < mynel; i++) { > struct av_decision dummy_avd; > @@ -2765,12 +2827,10 @@ out_unlock: > mysids2[j++] = mysids[i]; > cond_resched(); > } > - rc = 0; > kfree(mysids); > *sids = mysids2; > *nel = j; > -out: > - return rc; > + return 0; > } > > /** -- paul moore www.paul-moore.com