On Tue, Mar 14, 2023 at 4:43 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote: > > On Tue, Mar 14, 2023 at 9:03 AM Stephen Smalley > <stephen.smalley.work@xxxxxxxxx> wrote: > > > > I think Linus suggested this a long long time ago but I never got around > > to trying it. Better late than never. Compute the access vector decision > > when the inode security blob is initialized and cache it in the blob. > > Update it on file opens. In selinux_inode_permission and inode_has_perm, > > use this cached decision unless invalidated. > > > > Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > > Signed-off-by: Stephen Smalley <stephen.smalley.work@xxxxxxxxx> > > --- > > There is an obvious race here; we need some form of synchronization > > to ensure consistency of isec->task_sid, isec->avdsid, and > > isec->avd since otherwise we could end up using the wrong access > > vector decision if e.g. two tasks with different SIDs are accessing > > the same file. Doing so safely without ending up with worse > > overhead for selinux_inode_permission and inode_has_perm requires > > care; open to suggestions here. > > Unfortunately, I think we're stuck having to take the > inode_security_struct's spinlock to synchronize updates; or rather, > nothing clever is coming to mind that doesn't introduce a bunch of > unwanted complexity/overhead. > > With that in mind, I'm thinking more about where we might do an update > when we already have the spinlock held. The obvious first candidate > is inode_doinit_with_dentry(), but I believe you previously mentioned > that inode_doinit_with_dentry() alone didn't have much of an impact. > The other place that comes to mind is in __inode_security_revalidate() > as this is already called from a fair number of places that I think we > would care about, and it has the advantage of a parameter which > signals if it is safe to sleep. We already call > inode_doinit_with_dentry() there when it is safe and the inode needs > to be revalidated, why not update the cached AVC related fields if it > is safe and the inode is already valid? I'm not sure if this will > result in too much cache thrashing in the inode_security_struct, or if > the spinlock overhead will be unwanted in all the > __inode_security_revalidate() callers, but I figured it was worth > mentioning in case you hadn't already played with it. We are already taking the isec->lock (or otherwise have exclusive access to a newly initialized isec) in all the places where we are updating the isec->avd and isec->avdsid. The issue is not the updates but rather the reads in inode_has_perm() and selinux_inode_permission(). Would prefer some lockless synchronization there as otherwise I fear we are making things worse rather than better. Need to ensure that the (isec->task_sid, isec->avdsid, isec->avd) triple is consistent (i.e. that the isec->avd represents the access vector decisions computed based on the isec->task_sid and isec->avdsid) and that the (sid, isec->sid, avc_policy_seqno()) triple matches the (isec->task_sid, isec->avdsid, isec->avd.seqno) triple in order to use that avd. > > diff --git a/security/selinux/avc.c b/security/selinux/avc.c > > index c162e51fb43c..c74bdd76b38a 100644 > > --- a/security/selinux/avc.c > > +++ b/security/selinux/avc.c > > @@ -357,6 +357,8 @@ static int avc_xperms_populate(struct avc_node *node, > > struct avc_xperms_decision_node *dest_xpd; > > struct avc_xperms_decision_node *src_xpd; > > > > + if (!src) > > + return 0; > > This feels like something that we might want to do anyway, although I > would probably combine it with the check immediately below: > > if (!src || src->xp.len == 0) > return 0; Yes, can do that in the next revision. > > @@ -1121,6 +1125,35 @@ static noinline int avc_perm_nonode(u32 ssid, u32 tsid, u16 tclass, > > return 0; > > } > > > > +/** > > + * avc_get_avd - Get access vector decisions > > + * @ssid: source security identifier > > + * @tsid: target security identifier > > + * @tclass: target security class > > + * @avd: access vector decisions > > + * > > + * Get access vector decisions for the specified (@ssid, @tsid, @tclass) > > + * triple, fetching them from the access vector cache if present or > > + * calling the security server to compute them on a miss. Unlike > > + * avc_has_perm_noaudit(), this function does not check any > > + * requested permission; it just returns the entire decision vector. > > + */ > > +void avc_get_avd(u32 ssid, u32 tsid, u16 tclass, struct av_decision *avd) > > +{ > > + struct avc_node *node; > > + > > + rcu_read_lock(); > > + node = avc_lookup(ssid, tsid, tclass); > > + if (unlikely(!node)) { > > + rcu_read_unlock(); > > + avc_compute_av(ssid, tsid, tclass, avd, NULL); > > + return; > > Out of curiosity, did you do any measurements without the > avc_compute_av() call, relying solely on a previously computed entry > in the AVC? The approach here is to call avc_get_avd() on isec initialization/update and upon file open to cache the access vector decisions at that time for later use upon inode_has_perm and selinux_inode_permission. So avc_get_avd() is just avc_has_perm_noaudit() w/o actually checking any specific requested permissions. Like avc_has_perm_noaudit(), it first tries to look up the entry in the AVC and only falls back to avc_compute_av() on a cache miss. avc_get_avd() is not getting called during selinux_inode_permission itself. If we don't call avc_compute_av() in the cache miss case, then we'll have to do so at selinux_inode_permission and inode_has_perm() time, in which case we'll pay a higher cost there and that's the more performance-critical path. > > diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c > > index f14d1ffe54c5..7353c027c389 100644 > > --- a/security/selinux/ss/services.c > > +++ b/security/selinux/ss/services.c > > @@ -1107,7 +1107,8 @@ void security_compute_av(u32 ssid, > > rcu_read_lock(); > > policy = rcu_dereference(selinux_state.policy); > > avd_init(policy, avd); > > - xperms->len = 0; > > + if (xperms) > > + xperms->len = 0; > > Similar to the comment for avc_xperms_populate(), I wonder if this is > something we want to check regardless. Agree, just wasn't needed until this change.