On 11/21/19 6:09 AM, Jeff Vander Stoep wrote:
This replaces the reverse table lookup and reverse cache with a
hashtable which improves cache-miss reverse-lookup times from
O(n) to O(1)* and maintains the same performance as a reverse
cache hit.
This reduces the time needed to add a new sidtab entry from ~500us
to 5us on a Pixel 3 when there are ~10,000 sidtab entries.
The implementation uses the kernel's generic hashtable API,
It uses the context's string represtation as the hash source,
and the kernels generic string hashing algorithm full_name_hash()
to reduce the string to a 32 bit value.
Sidtab entries are never modified or removed, so the context struct
stored in the sid_to_context tree can also be used for the
context_to_sid hashtable to reduce memory usage.
This bug was reported by:
- On the selinux bug tracker.
BUG: kernel softlockup due to too many SIDs/contexts #37
https://github.com/SELinuxProject/selinux-kernel/issues/37
- Jovana Knezevic on Android's bugtracker.
Bug: 140252993
"During multi-user performance testing, we create and remove users
many times. selinux_android_restorecon_pkgdir goes from 1ms to over
20ms after about 200 user creations and removals. Accumulated over
~280 packages, that adds a significant time to user creation,
making perf benchmarks unreliable."
* Hashtable lookup is only O(1) when n < the number of buckets.
Changes in V2:
-The hashtable uses sidtab_entry_leaf objects directly so these
objects are shared between the sid_to_context lookup tree and the
context_to_sid hashtable. This simplifies memory allocation and
was suggested by Ondrej Mosnacek.
-The new sidtab hash stats file in selinuxfs has been moved out of
the avc dir and into a new "ss" dir.
V3:
-Add lock nesting notation.
V4/V5:
-Moved to *_rcu variants of the various hashtable functions
as suggested by Will Deacon.
-Naming/spelling fixups.
V6
-Remove nested locking. Use lock of active sidtab to gate
access to the new sidtab.
-Remove use of rcu_head/kfree_rcu(), they're unnecessary because
hashtable objects are never removed when read/add operations are
occurring. Why is this safe? Quoting Ondrej Mosnacek from the
selinux mailing list:
"It is not visible in this patch, but the sidtab (along with other
policy-lifetime structures) is protected by a big fat read-write lock.
The only places where sidtab_destroy() is called are (a) error paths
when initializing a new sidtab (here the sidtab isn't shared yet, so
no race) and (b) when freeing the old sidtab during policy reload - in
this case it is happening after a policy write-locked critical
section, which had removed the old sidtab pointer from the shared
structures, so at that point all sidtab readers will already be
accessing the new sidtab and the old one is visible only by the thread
doing the destruction."
V7
-Change format of /sys/fs/selinux/ss/sidtab_hash_stats to match
/sys/fs/selinux/avc/hash_stats.
-Add __rcu annotation to rcu pointers.
-Test with CONFIG_SPARSE_RCU_POINTER and CONFIG_PROVE_RCU.
-Add rcu@xxxxxxxxxxxxxxx and Paul McKenney to Cc for review of the
RCU logic.
V8
-Removed the __rcu annotation added in V7. The annotation is
intended to be applied to pointers to an object, however the
objects referenced in the rcu hashtable are allocated and
stored in an array.
-Fixed bug where multiple SIDs were receiving the same hash
due to security_get_user_sids() reusing the same context
struct without calling context_init() on it. This bug was
discovered and root-caused by Stephen Smalley.
-Fixed two more places where the context hash needs to be
added in convert_context().
-Moved two more function calls from sidtab_context_to_sid()
to context_struct_to_sid().
-Verified on Android and Fedora 31.
Signed-off-by: Jeff Vander Stoep <jeffv@xxxxxxxxxx>
Reported-by: Jovana Knezevic <jovanak@xxxxxxxxxx>
---
security/selinux/Kconfig | 12 ++
security/selinux/include/security.h | 1 +
security/selinux/selinuxfs.c | 65 +++++++
security/selinux/ss/context.h | 11 +-
security/selinux/ss/policydb.c | 5 +
security/selinux/ss/services.c | 98 ++++++++---
security/selinux/ss/services.h | 4 +-
security/selinux/ss/sidtab.c | 263 ++++++++++++++--------------
security/selinux/ss/sidtab.h | 16 +-
9 files changed, 308 insertions(+), 167 deletions(-)
diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c
index 3a29e7c24ba9..5ccae6ac8cab 100644
--- a/security/selinux/ss/services.c
+++ b/security/selinux/ss/services.c
@@ -1384,6 +1395,8 @@ static int string_to_context_struct(struct policydb *pol,
int rc = 0;
context_init(ctx);
+ /* hash the string before it gets mutated */
+ ctx->hash = context_compute_hash(scontext);
/* Parse the security context. */
I still don't think this is correct/safe. The problem is that
string_to_context_struct() can be called with a security context string
in a non-canonical form, i.e. not the same string that would be produced
by context_struct_to_string() on the resulting context. This can occur
if a userspace program passes in a context that uses an alias name or
that uses a non-compact form of the category set (e.g. c0,c1,c2 versus
c0.c2). I think we can only set the hashes from the strings produced by
context_struct_to_string() if we want to ensure consistency. If we
leave the hash unset here, it will still be set later by
context_struct_to_sid() or convert_context() so it should be fine.