On Wed, Aug 17, 2022 at 3:26 PM Mimi Zohar <zohar@xxxxxxxxxxxxx> wrote: > On Wed, 2022-08-17 at 15:20 +0800, Guozihua (Scott) wrote: > > On 2022/8/17 15:17, Guozihua (Scott) wrote: > > > On 2022/8/16 6:23, Paul Moore wrote: > > >> On Sun, Aug 14, 2022 at 2:30 PM Mimi Zohar <zohar@xxxxxxxxxxxxx> wrote: > > >>> > > >>> Hi Scott, Paul, > > >>> > > >>> On Tue, 2022-08-09 at 12:24 -0400, Paul Moore wrote: > > >>>> On Sun, Aug 7, 2022 at 11:19 PM Guozihua (Scott) > > >>>> <guozihua@xxxxxxxxxx> wrote: > > >>>>> > > >>>>> On 2022/8/8 11:02, Guozihua (Scott) wrote: > > >>>>>> Hi Community, > > >>>>>> > > >>>>>> Recently we discovered a race condition while updating SELinux policy > > >>>>>> with IMA lsm rule enabled. Which would lead to extra files being > > >>>>>> measured. > > >>>>>> > > >>>>>> While SELinux policy is updated, the IDs for object types and such > > >>>>>> would > > >>>>>> be changed, and ima_lsm_update_rules would be called. > > >>>>>> > > >>>>>> There are no lock applied in ima_lsm_update_rules. If user accesses a > > >>>>>> file during this time, ima_match_rules will be matching rules > > >>>>>> based on > > >>>>>> old SELinux au_seqno resulting in selinux_audit_rule_match returning > > >>>>>> -ESTALE. > > >>>>>> > > >>>>>> However, in ima_match_rules, this error number is not handled, > > >>>>>> causing > > >>>>>> IMA to think the LSM rule is also a match, leading to measuring extra > > >>>>>> files. > > >>>> > > >>>> ... > > >>>> > > >>>>>> Is this the intended behavior? Or is it a good idea to add a lock for > > >>>>>> LSM rules during update? > > >>>> > > >>>> I'm not the IMA expert here, but a lot of effort has been into the > > >>>> SELinux code to enable lockless/RCU SELinux policy access and I > > >>>> *really* don't want to have to backtrack on that. > > >>> > > >>> IMA initially updated it's reference to the SELinux label ids lazily. > > >>> More recently IMA refreshes the LSM label ids based on > > >>> register_blocking_lsm_notifier(). As a result of commit 9ad6e9cb39c6 > > >>> ("selinux: fix race between old and new sidtab"), -ESTALE is now being > > >>> returned. > > >> > > >> To be clear, are you seeing this only started happening after commit > > >> 9ad6e9cb39c6? If that is the case, I would suggest a retry loop > > >> around ima_filter_rule_match() when -ESTALE is returned. I believe > > >> that should resolve the problem, if not please let us know. > > > > > > Hi Mimi and Paul > > > > > > It seems that selinux_audit_rule_match has been returning -ESTALE for a > > > very long time. It dates back to 376bd9cb357ec. > > > > > > IMA used to have a retry mechanism, but it was removed by b16942455193 > > > ("ima: use the lsm policy update notifier"). Maybe we should consider > > > bring it back or just add a lock in ima_lsm_update_rules(). > > > > > > FYI, once ima received the notification, it starts updating all it's lsm > > > rules one-by-one. During this time, calling ima_match_rules on any rule > > > that is not yet updated would return -ESTALE. > > > > I mean a retry might still be needed in ima_match_rules(), but not the > > ima_lsm_update_rules(). > > Ok. So eventually the LSM label ids are properly updated. Did adding > a retry loop around ima_filter_rule_match(), as Paul suggested, resolve > the problem? A good long-term solution to this would likely be to add a small wrapper function for SELinux's security_audit_rule_match() hook (e.g. loop on selinux_audit_rule_match() when ESTALE is returned) so that callers wouldn't need to worry about this, but I first want to make sure that is the problem. If that *is* the problem, I can draft up a SELinux patch pretty quick. -- paul-moore.com