Looking at profile data once again - avc lookup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



So I happened to do my occasional "let's see what a profile for an
empty kernel build is", which usually highlights some silly thing we
do in pathname lookup (because most of the cost is just 'make' doing
various string handling in user space, and doing variations of
'lstat()' to look at file timestamps).

And, as always, avc_has_perm() and avc_has_perm_noaudit() are pretty
high up there, together with selinux_inode_permission(). It just gets
called a *lot*.

Now, avc_has_perm_noaudit() should be inlined, but it turns out that
it isn't because of some bad behavior of the unlikely path. That part
looks fairly easy to massage away. I'm attaching a largely untested
patch that makes the generated code look a bit better, in case anybody
wants to look at it.

But the real reason for this email is to query about 'selinux_state'.

We pass that around as a pointer quite a bit, to the point where
several function calls have to use stack space for arguments just
because they have more than six arguments. And from what I can tell,
it is *always* just a pointer to 'selinux_state', and never anything
else.

Some cases are obvious:

    git grep -w avc_has_perm |
        grep -v 'avc_has_perm(&selinux_state,'

ie every _single_ callers of 'avc_has_perm()' just passes in a pointer
to that single selinux_state thing.

Some cases are a bit less obvious, like security_get_user_sids(),
which has sel_write_user() do

        struct selinux_state *state = fsi->state;
        ..
                rc = security_sid_to_context(state, sids[i], &newcon, &len);

and then in turn it passes that to avc_has_perm_noaudit(), so it looks
like we might actually have multiple states.

The emphasis here being on "looks like", because it turns out that the
only thing that assigns 'fsi->state' is selinux_fs_info_create(),
which does

        fsi->state = &selinux_state;

oh, look, it's that same &selinux_state again!

In general, I really think there is just one single

        struct selinux_state selinux_state;

declared in security/selinux/hooks.c, and everybody just pointlessly
passes in a pointer to that single thing.

This was all done five years ago by commit aa8e712cee93 ("selinux:
wrap global selinux state") and I don't see the point. It never seems
to have gone anywhere, there's no other state that I can find, and all
it does is add overhead and complexity.

So I'd like to undo that, and go back to the good old days when we
didn't waste time and effort on passing a pointer that always has the
same value as an argument.

Comments? Is there some case I've missed?

               Linus
From 50d32ecb066a820146f5dc441185c0e03c11b3e5 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Sat, 28 Jan 2023 12:53:26 -0800
Subject: [PATCH] selinux: uninline unlikely parts of avc_has_perm_noaudit()

avc_has_perm_noaudit()is one of those hot functions that end up being
used by almost all filesystem operations (through "avc_has_perm()") and
it's intended to be cheap enough to inline.

However, it turns out that the unlikely parts of it (where it doesn't
find an existing avc node) need a fair amount of stack space for the
automatic replacement node, so if it were to be inlined (at least clang
does not) it would just use stack space unnecessarily.

So split the unlikely part out of it, and mark that part noinline.  That
improves the actual likely part.

Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
 security/selinux/avc.c | 52 +++++++++++++++++++++++++++++++++---------
 1 file changed, 41 insertions(+), 11 deletions(-)

diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 9a43af0ebd7d..85e5af7f856e 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -1112,6 +1112,38 @@ int avc_has_extended_perms(struct selinux_state *state,
 	return rc;
 }
 
+/*
+ * This is the "we have no node" part of avc_has_perm_noaudit(),
+ * which is unlikely and needs extra stack space for the new
+ * node that we generate. So don't inline it.
+ *
+ * NOTE! Called with RCU lock held for reading, and needs to
+ * drop it. And since the RCU lock was for the node lifetime
+ * (which we don't have), we could just drop it early. Except
+ * avc_compute_av() knows it's called under the rcu lock, and
+ * drops and re-takes it. What a crock.
+ *
+ * So we drop it after calling avc_compute_av() (and *inside*
+ * avc_compute_av()).
+ */
+static noinline int avc_perm_nonode(struct selinux_state *state,
+				u32 ssid, u32 tsid,
+				u16 tclass, u32 requested,
+				unsigned int flags,
+				struct av_decision *avd)
+{
+		u32 denied;
+		struct avc_xperms_node xp_node;
+
+		avc_compute_av(state, ssid, tsid, tclass, avd, &xp_node);
+		rcu_read_unlock();
+		denied = requested & ~(avd->allowed);
+		if (unlikely(denied))
+			return avc_denied(state, ssid, tsid, tclass, requested, 0, 0,
+					flags, avd);
+		return 0;
+}
+
 /**
  * avc_has_perm_noaudit - Check permissions but perform no auditing.
  * @state: SELinux state
@@ -1139,10 +1171,8 @@ inline int avc_has_perm_noaudit(struct selinux_state *state,
 				unsigned int flags,
 				struct av_decision *avd)
 {
-	struct avc_node *node;
-	struct avc_xperms_node xp_node;
-	int rc = 0;
 	u32 denied;
+	struct avc_node *node;
 
 	if (WARN_ON(!requested))
 		return -EACCES;
@@ -1151,17 +1181,17 @@ inline int avc_has_perm_noaudit(struct selinux_state *state,
 
 	node = avc_lookup(state->avc, ssid, tsid, tclass);
 	if (unlikely(!node))
-		avc_compute_av(state, ssid, tsid, tclass, avd, &xp_node);
-	else
-		memcpy(avd, &node->ae.avd, sizeof(*avd));
+		return avc_perm_nonode(state, ssid, tsid, tclass, requested, flags, avd);
+
+	denied = requested & ~node->ae.avd.allowed;
+	memcpy(avd, &node->ae.avd, sizeof(*avd));
+	rcu_read_unlock();
 
-	denied = requested & ~(avd->allowed);
 	if (unlikely(denied))
-		rc = avc_denied(state, ssid, tsid, tclass, requested, 0, 0,
-				flags, avd);
+		return avc_denied(state, ssid, tsid, tclass, requested, 0, 0,
+					flags, avd);
 
-	rcu_read_unlock();
-	return rc;
+	return 0;
 }
 
 /**
-- 
2.39.0.rc2.4.g1435931e43


[Index of Archives]     [Selinux Refpolicy]     [Linux SGX]     [Fedora Users]     [Fedora Desktop]     [Yosemite Photos]     [Yosemite Camping]     [Yosemite Campsites]     [KDE Users]     [Gnome Users]

  Powered by Linux