Re: [RFC] IMA LSM based rule race condition issue on 4.19 LTS

On Thu, Dec 15, 2022 at 9:30 AM Mimi Zohar <zohar@xxxxxxxxxxxxx> wrote:
>
> On Thu, 2022-12-15 at 21:15 +0800, Guozihua (Scott) wrote:
> > On 2022/12/15 18:49, Mimi Zohar wrote:
> > > On Thu, 2022-12-15 at 16:51 +0800, Guozihua (Scott) wrote:
> > >> On 2022/12/14 20:19, Mimi Zohar wrote:
> > >>> On Wed, 2022-12-14 at 09:33 +0800, Guozihua (Scott) wrote:
> > >>>> On 2022/12/13 23:30, Mimi Zohar wrote:
> > >>>>> On Fri, 2022-12-09 at 15:00 +0800, Guozihua (Scott) wrote:
> > >>>>>> Hi community.
> > >>>>>>
> > >>>>>> Previously our team reported a race condition in IMA related to LSM
> > >>>>>> based rules which would cause IMA to match files that should be filtered
> > >>>>>> out under normal conditions. The issue was originally analyzed and fixed
> > >>>>>> in mainline. The patch and the discussion can be found here:
> > >>>>>> https://lore.kernel.org/all/20220921125804.59490-1-guozihua@xxxxxxxxxx/
> > >>>>>>
> > >>>>>> After that, we did a regression test on 4.19 LTS and the same issue
> > >>>>>> arises. Further analysis revealed that the issue has a completely
> > >>>>>> different cause.
> > >>>>>>
> > >>>>>> The cause is that selinux_audit_rule_init() sets the rule (which is
> > >>>>>> a pointer to a pointer) to NULL as soon as it is called. The relevant
> > >>>>>> code is shown below:
> > >>>>>>
> > >>>>>> security/selinux/ss/services.c:
> > >>>>>>> int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> > >>>>>>> {
> > >>>>>>>         struct selinux_state *state = &selinux_state;
> > >>>>>>>         struct policydb *policydb = &state->ss->policydb;
> > >>>>>>>         struct selinux_audit_rule *tmprule;
> > >>>>>>>         struct role_datum *roledatum;
> > >>>>>>>         struct type_datum *typedatum;
> > >>>>>>>         struct user_datum *userdatum;
> > >>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> > >>>>>>>         int rc = 0;
> > >>>>>>>
> > >>>>>>>         *rule = NULL;
> > >>>>>> *rule is set to NULL here, which means the rule on IMA side is also NULL.
> > >>>>>>>
> > >>>>>>>         if (!state->initialized)
> > >>>>>>>                 return -EOPNOTSUPP;
> > >>>>>> ...
> > >>>>>>> out:
> > >>>>>>>         read_unlock(&state->ss->policy_rwlock);
> > >>>>>>>
> > >>>>>>>         if (rc) {
> > >>>>>>>                 selinux_audit_rule_free(tmprule);
> > >>>>>>>                 tmprule = NULL;
> > >>>>>>>         }
> > >>>>>>>
> > >>>>>>>         *rule = tmprule;
> > >>>>>> rule is updated at the end of the function.
> > >>>>>>>
> > >>>>>>>         return rc;
> > >>>>>>> }
> > >>>>>>
> > >>>>>> security/integrity/ima/ima_policy.c:
> > >>>>>>> static bool ima_match_rules(struct ima_rule_entry *rule, struct inode *inode,
> > >>>>>>>                             const struct cred *cred, u32 secid,
> > >>>>>>>                             enum ima_hooks func, int mask)
> > >>>>>>> {...
> > >>>>>>> for (i = 0; i < MAX_LSM_RULES; i++) {
> > >>>>>>>                 int rc = 0;
> > >>>>>>>                 u32 osid;
> > >>>>>>>                 int retried = 0;
> > >>>>>>>
> > >>>>>>>                 if (!rule->lsm[i].rule)
> > >>>>>>>                         continue;
> > >>>>>> Setting rule to NULL would lead to LSM based rule matching being skipped.
> > >>>>>>> retry:
> > >>>>>>>                 switch (i) {
> > >>>>>>
> > >>>>>> To solve this issue, there are multiple approaches we might take and I
> > >>>>>> would like some input from the community.
> > >>>>>>
> > >>>>>> The first proposed solution would be to change
> > >>>>>> selinux_audit_rule_init(). Remove the set to NULL bit and update the
> > >>>>>> rule pointer with cmpxchg.
> > >>>>>>
> > >>>>>>> diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
> > >>>>>>> index a9f2bc8443bd..aa74b04ccaf7 100644
> > >>>>>>> --- a/security/selinux/ss/services.c
> > >>>>>>> +++ b/security/selinux/ss/services.c
> > >>>>>>> @@ -3297,10 +3297,9 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> > >>>>>>>         struct type_datum *typedatum;
> > >>>>>>>         struct user_datum *userdatum;
> > >>>>>>>         struct selinux_audit_rule **rule = (struct selinux_audit_rule **)vrule;
> > >>>>>>> +       struct selinux_audit_rule *orig = *rule;
> > >>>>>>>         int rc = 0;
> > >>>>>>>
> > >>>>>>> -       *rule = NULL;
> > >>>>>>> -
> > >>>>>>>         if (!state->initialized)
> > >>>>>>>                 return -EOPNOTSUPP;
> > >>>>>>>
> > >>>>>>> @@ -3382,7 +3381,8 @@ int selinux_audit_rule_init(u32 field, u32 op, char *rulestr, void **vrule)
> > >>>>>>>                 tmprule = NULL;
> > >>>>>>>         }
> > >>>>>>>
> > >>>>>>> -       *rule = tmprule;
> > >>>>>>> +       if (cmpxchg(rule, orig, tmprule) != orig)
> > >>>>>>> +               selinux_audit_rule_free(tmprule);
> > >>>>>>>
> > >>>>>>>         return rc;
> > >>>>>>>  }
> > >>>>>>
> > >>>>>> This solution would be an easy fix, but might influence other modules
> > >>>>>> calling selinux_audit_rule_init() directly or indirectly (on 4.19 LTS,
> > >>>>>> only auditfilter and IMA it seems). And it might be worth returning an
> > >>>>>> error code such as -EAGAIN.
> > >>>>>>
> > >>>>>> Or, we can access rules via RCU, similar to what we do on 5.10. This
> > >>>>>> would mean more code changes and testing.
> > >>>>>
> > >>>>> In the 4.19 kernel, IMA is doing a lazy LSM based policy rule update as
> > >>>>> needed.  IMA waits for selinux_audit_rule_init() to complete and
> > >>>>> shouldn't see NULL, unless there is an SELinux failure.  Before
> > >>>>> "fixing" the problem, what exactly is the problem?
> > >>>>
> > >>>> IMA runs on multiple cores. On the 4.19 kernel, IMA does a lazy update
> > >>>> of ALL LSM based rules in one go without using RCU, which still allows
> > >>>> other cores to access the rule being updated. And that's the issue.
> > >>>>
> > >>>> An example scenario would be:
> > >>>> CPU1                       | CPU2
> > >>>> opened a file and starts   |
> > >>>> updating LSM based rules.  |
> > >>>>                            | opened a file and starts
> > >>>>                            | matching rules.
> > >>>>                            |
> > >>>> set a LSM based rule to    | access the same LSM based rule and
> > >>>> NULL.                      | see that it's NULL.
> > >>>>
> > >>>> In this situation, CPU 2 would recognize this rule as not LSM based and
> > >>>> ignore the LSM part of the rule while matching.
> > >>>
> > >>> Would picking up just ima_lsm_update_rule(), without changing to the
> > >>> lsm policy update notifier, from upstream and calling it from
> > >>> ima_lsm_update_rules() resolve the RCU locking issue?  Or are there
> > >>> other issues?
> > >>
> > >> Hi Mimi,
> > >>
> > >> That should resolve the issue just fine. However, that'd mean having a
> > >> customized ima_lsm_update_rules as well as a customized
> > >> ima_lsm_update_rule function on 4.19.y, potentially decreasing
> > >> maintainability. The customization of the two functions mentioned above
> > >> would be to rip out the RCU part so that we can leave the other rule-list
> > >> access untouched.
> > >
> > > Hi Scott,
> > >
> > > Neither do we want to backport new functionality.  Perhaps it is only a
> > > subset of commit b16942455193 ("ima: use the lsm policy update
> > > notifier").
> > Yes, we are able to backport part of this commit to get mainline-like
> > behavior, which would solve the issue at hand as well. However, from a
> > maintenance standpoint it might be better to form a different commit and
> > not re-use the commit message from the mainline commit.
>
> I assume that is fine, but cherry-pick the original commit with the -x
> option, so there is a correlation to the upstream commit.  The patch
> description would mention that the patch is a partial backport.

FWIW, if the changes in the backport are significant I tend to use the
following approach as it captures both the original commit as well as
the details on what changes were made and why.

>>>
ima: use the lsm policy update notifier

Really good explanation of what changes were necessary from the
original patch, including why they were necessary in the first place.

   commit b169424551930a9325f700f502802f4d515194e5
   Author: Janne Karhunen <janne.karhunen@xxxxxxxxx>
   Date:   Fri Jun 14 15:20:15 2019 +0300

   ima: use the lsm policy update notifier

   Don't do lazy policy updates while running the rule matching,
   run the updates as they happen.

   Depends on commit f242064c5df3 ("LSM: switch to blocking policy update
                                   notifiers")

   Signed-off-by: Janne Karhunen <janne.karhunen@xxxxxxxxx>
   Signed-off-by: Mimi Zohar <zohar@xxxxxxxxxxxxx>
>>>

> > > Although your suggested solution of using "cmpxchg" isn't needed in
> > > recent kernels,  it wouldn't hurt either, right?  Assuming that Paul
> > > would be willing to accept it, that could be another option.
> > The cmpxchg part is merely to avoid multiple updates on the same rule
> > item. However I am not so sure why SELinux would set the rule to NULL.
> > My guess is that SELinux is trying to stop others from using an
> > invalidated rule?
> >
> > Would Paul be able to provide some insight, as well as some comments on
> > the proposed solution?

I'm not comfortable with what might happen with a cmpxchg assignment
when multiple threads are in a related RCU critical section; I'm
assuming they would see the new value immediately (it is atomic,
right?), which I imagine could cause some consistency problems.
However, if someone who understands the intersection of cmpxchg/RCU
better than I do can assure me this isn't a problem we can consider
it.
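
To make the NULL window and the proposed swap concrete, here is a
minimal userspace sketch using C11 atomics. It is not kernel code:
struct demo_rule, demo_slot, refresh_clear_then_set(),
refresh_cmpxchg() and demo_check() are hypothetical names invented for
illustration, atomic_compare_exchange_strong() only stands in for the
kernel's cmpxchg(), and nothing here models RCU grace periods, so it
does not settle the consistency question above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct selinux_audit_rule. */
struct demo_rule {
	int generation;
};

/* Hypothetical stand-in for the rule pointer IMA keeps per LSM field. */
typedef _Atomic(struct demo_rule *) demo_slot;

/*
 * 4.19-style refresh: clear the slot first, store the rebuilt rule last.
 * *midway records what a reader running between the two stores would load.
 */
static void refresh_clear_then_set(demo_slot *slot, struct demo_rule *fresh,
				   struct demo_rule **midway)
{
	assert(slot != NULL && midway != NULL);
	atomic_store(slot, NULL);
	*midway = atomic_load(slot);
	atomic_store(slot, fresh);
}

/*
 * Patch-style refresh: install fresh only if the slot still holds expected.
 * Returns false when another updater got there first; the caller then still
 * owns fresh and is responsible for freeing it.
 */
static bool refresh_cmpxchg(demo_slot *slot, struct demo_rule *expected,
			    struct demo_rule *fresh)
{
	assert(slot != NULL);
	return atomic_compare_exchange_strong(slot, &expected, fresh);
}

/* Returns 0 when every observation matches the behavior described above. */
int demo_check(void)
{
	struct demo_rule old = { 1 }, a = { 2 }, b = { 3 };
	struct demo_rule *midway = &old;
	demo_slot slot;

	atomic_init(&slot, &old);

	refresh_clear_then_set(&slot, &a, &midway);
	if (midway != NULL)
		return 1;	/* the NULL window should be observable */
	if (atomic_load(&slot) != &a)
		return 2;

	atomic_store(&slot, &old);
	if (!refresh_cmpxchg(&slot, &old, &a))
		return 3;	/* first updater should win */
	if (refresh_cmpxchg(&slot, &old, &b))
		return 4;	/* updater holding a stale pointer should lose */
	if (atomic_load(&slot) != &a)
		return 5;	/* slot never passed through NULL */

	return 0;
}
```

Calling demo_check() from a small main() and checking for a return
value of 0 exercises both patterns. Note that a reader loading the slot
right after a successful swap gets the new pointer immediately, which
is exactly the behavior the RCU question is about.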

How bad is the backport really?  Perhaps it is worth doing it to see
what it looks like?

-- 
paul-moore.com


