So I happened to do my occasional "let's see what a profile for an empty kernel build is", which usually highlights some silly thing we do in pathname lookup (because most of the cost is just 'make' doing various string handling in user space, and doing variations of 'lstat()' to look at file timestamps). And, as always, avc_has_perm() and avc_has_perm_noaudit() are pretty high up there, together with selinux_inode_permission(). It just gets called a *lot*. Now, avc_has_perm_noaudit() should be inlined, but it turns out that it isn't because of some bad behavior of the unlikely path. That part looks fairly easy to massage away. I'm attaching a largely untested patch that makes the generated code look a bit better, in case anybody wants to look at it. But the real reason for this email is to query about 'selinux_state'. We pass that around as a pointer quite a bit, to the point where several function calls have to use stack space for arguments just because they have more than six arguments. And from what I can tell, it is *always* just a pointer to 'selinux_state', and never anything else. Some cases are obvious: git grep -w avc_has_perm | grep -v 'avc_has_perm(&selinux_state,' ie every _single_ callers of 'avc_has_perm()' just passes in a pointer to that single selinux_state thing. Some cases are a bit less obvious, like security_get_user_sids(), which has sel_write_user() do struct selinux_state *state = fsi->state; .. rc = security_sid_to_context(state, sids[i], &newcon, &len); and then in turn it passes that to avc_has_perm_noaudit(), so it looks like we might actually have multiple states. The emphasis here being on "looks like", because it turns out that the only thing that assigns 'fsi->state' is selinux_fs_info_create(), which does fsi->state = &selinux_state; oh, look, it's that same &selinux_state again! In general, I really think there is just one single struct selinux_state selinux_state; declared in security/selinux/hooks.c, and everybody just pointlessly passes in a pointer to that single thing. This was all done five years ago by commit aa8e712cee93 ("selinux: wrap global selinux state") and I don't see the point. It never seems to have gone anywhere, there's no other state that I can find, and all it does is add overhead and complexity. So I'd like to undo that, and go back to the good old days when we didn't waste time and effort on passing a pointer that always has the same value as an argument. Comments? Is there some case I've missed? Linus
From 50d32ecb066a820146f5dc441185c0e03c11b3e5 Mon Sep 17 00:00:00 2001 From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> Date: Sat, 28 Jan 2023 12:53:26 -0800 Subject: [PATCH] selinux: uninline unlikely parts of avc_has_perm_noaudit() avc_has_perm_noaudit()is one of those hot functions that end up being used by almost all filesystem operations (through "avc_has_perm()") and it's intended to be cheap enough to inline. However, it turns out that the unlikely parts of it (where it doesn't find an existing avc node) need a fair amount of stack space for the automatic replacement node, so if it were to be inlined (at least clang does not) it would just use stack space unnecessarily. So split the unlikely part out of it, and mark that part noinline. That improves the actual likely part. Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> --- security/selinux/avc.c | 52 +++++++++++++++++++++++++++++++++--------- 1 file changed, 41 insertions(+), 11 deletions(-) diff --git a/security/selinux/avc.c b/security/selinux/avc.c index 9a43af0ebd7d..85e5af7f856e 100644 --- a/security/selinux/avc.c +++ b/security/selinux/avc.c @@ -1112,6 +1112,38 @@ int avc_has_extended_perms(struct selinux_state *state, return rc; } +/* + * This is the "we have no node" part of avc_has_perm_noaudit(), + * which is unlikely and needs extra stack space for the new + * node that we generate. So don't inline it. + * + * NOTE! Called with RCU lock held for reading, and needs to + * drop it. And since the RCU lock was for the node lifetime + * (which we don't have), we could just drop it early. Except + * avc_compute_av() knows it's called under the rcu lock, and + * drops and re-takes it. What a crock. + * + * So we drop it after calling avc_compute_av() (and *inside* + * avc_compute_av()). + */ +static noinline int avc_perm_nonode(struct selinux_state *state, + u32 ssid, u32 tsid, + u16 tclass, u32 requested, + unsigned int flags, + struct av_decision *avd) +{ + u32 denied; + struct avc_xperms_node xp_node; + + avc_compute_av(state, ssid, tsid, tclass, avd, &xp_node); + rcu_read_unlock(); + denied = requested & ~(avd->allowed); + if (unlikely(denied)) + return avc_denied(state, ssid, tsid, tclass, requested, 0, 0, + flags, avd); + return 0; +} + /** * avc_has_perm_noaudit - Check permissions but perform no auditing. * @state: SELinux state @@ -1139,10 +1171,8 @@ inline int avc_has_perm_noaudit(struct selinux_state *state, unsigned int flags, struct av_decision *avd) { - struct avc_node *node; - struct avc_xperms_node xp_node; - int rc = 0; u32 denied; + struct avc_node *node; if (WARN_ON(!requested)) return -EACCES; @@ -1151,17 +1181,17 @@ inline int avc_has_perm_noaudit(struct selinux_state *state, node = avc_lookup(state->avc, ssid, tsid, tclass); if (unlikely(!node)) - avc_compute_av(state, ssid, tsid, tclass, avd, &xp_node); - else - memcpy(avd, &node->ae.avd, sizeof(*avd)); + return avc_perm_nonode(state, ssid, tsid, tclass, requested, flags, avd); + + denied = requested & ~node->ae.avd.allowed; + memcpy(avd, &node->ae.avd, sizeof(*avd)); + rcu_read_unlock(); - denied = requested & ~(avd->allowed); if (unlikely(denied)) - rc = avc_denied(state, ssid, tsid, tclass, requested, 0, 0, - flags, avd); + return avc_denied(state, ssid, tsid, tclass, requested, 0, 0, + flags, avd); - rcu_read_unlock(); - return rc; + return 0; } /** -- 2.39.0.rc2.4.g1435931e43