On Fri, Nov 22, 2019 at 4:33 AM Jeff Vander Stoep <jeffv@xxxxxxxxxx> wrote: > This replaces the reverse table lookup and reverse cache with a > hashtable which improves cache-miss reverse-lookup times from > O(n) to O(1)* and maintains the same performance as a reverse > cache hit. > > This reduces the time needed to add a new sidtab entry from ~500us > to 5us on a Pixel 3 when there are ~10,000 sidtab entries. > > The implementation uses the kernel's generic hashtable API, > It uses the context's string represtation as the hash source, > and the kernels generic string hashing algorithm full_name_hash() > to reduce the string to a 32 bit value. > > This change also maintains the improvement introduced in > commit ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve > performance") which removed the need to keep the current sidtab > locked during policy reload. It does however introduce periodic > locking of the target sidtab while converting the hashtable. Sidtab > entries are never modified or removed, so the context struct stored > in the sid_to_context tree can also be used for the context_to_sid > hashtable to reduce memory usage. > > This bug was reported by: > - On the selinux bug tracker. > BUG: kernel softlockup due to too many SIDs/contexts #37 > https://github.com/SELinuxProject/selinux-kernel/issues/37 > - Jovana Knezevic on Android's bugtracker. > Bug: 140252993 > "During multi-user performance testing, we create and remove users > many times. selinux_android_restorecon_pkgdir goes from 1ms to over > 20ms after about 200 user creations and removals. Accumulated over > ~280 packages, that adds a significant time to user creation, > making perf benchmarks unreliable." > > * Hashtable lookup is only O(1) when n < the number of buckets. > > Changes in V2: > -The hashtable uses sidtab_entry_leaf objects directly so these > objects are shared between the sid_to_context lookup tree and the > context_to_sid hashtable. This simplifies memory allocation and > was suggested by Ondrej Mosnacek. > -The new sidtab hash stats file in selinuxfs has been moved out of > the avc dir and into a new "ss" dir. > > V3: > -Add lock nesting notation. > > V4/V5: > -Moved to *_rcu variants of the various hashtable functions > as suggested by Will Deacon. > -Naming/spelling fixups. > > V6 > -Remove nested locking. Use lock of active sidtab to gate > access to the new sidtab. > -Remove use of rcu_head/kfree_rcu(), they're unnecessary because > hashtable objects are never removed when read/add operations are > occurring. Why is this safe? Quoting Ondrej Mosnacek from the > selinux mailing list: > "It is not visible in this patch, but the sidtab (along with other > policy-lifetime structures) is protected by a big fat read-write lock. > The only places where sidtab_destroy() is called are (a) error paths > when initializing a new sidtab (here the sidtab isn't shared yet, so > no race) and (b) when freeing the old sidtab during policy reload - in > this case it is happening after a policy write-locked critical > section, which had removed the old sidtab pointer from the shared > structures, so at that point all sidtab readers will already be > accessing the new sidtab and the old one is visible only by the thread > doing the destruction." > > V7 > -Change format of /sys/fs/selinux/ss/sidtab_hash_stats to match > /sys/fs/selinux/avc/hash_stats. > -Add __rcu annotation to rcu pointers. > -Test with CONFIG_SPARSE_RCU_POINTER and CONFIG_PROVE_RCU. > -Add rcu@xxxxxxxxxxxxxxx and Paul McKenney to Cc for review of the > RCU logic. > > V8 > -Removed the __rcu annotation used in V7. The annotation is > intended to be applied to pointers to an object, however the > objects referenced in the rcu hashtable are allocated in an > array. > -Fixed bug where multiple SIDs were receiving the same hash > due to security_get_user_sids() reusing the same context > struct without calling context_init() on it. This bug was > discovered and root-caused by Stephen Smalley. > > V9 > -Do not compute the hash in string_to_context_struct > because this string representation is non-canonical. > > Signed-off-by: Jeff Vander Stoep <jeffv@xxxxxxxxxx> > Reported-by: Stephen Smalley <sds@xxxxxxxxxxxxx> > Reported-by: Jovana Knezevic <jovanak@xxxxxxxxxx> > --- > security/selinux/Kconfig | 12 ++ > security/selinux/include/security.h | 1 + > security/selinux/selinuxfs.c | 65 +++++++ > security/selinux/ss/context.h | 11 +- > security/selinux/ss/policydb.c | 5 + > security/selinux/ss/services.c | 96 +++++++--- > security/selinux/ss/services.h | 4 +- > security/selinux/ss/sidtab.c | 263 ++++++++++++++-------------- > security/selinux/ss/sidtab.h | 16 +- > 9 files changed, 306 insertions(+), 167 deletions(-) Thanks Jeff, as well as everyone else who contributed reviews and feedback. I've pulled this into a working branch and I'll be merging it with the other sidtab patches before posting it to a "next-queue" branch for review later this week. When done, I'll send a note to the list, as well as the relevant patch authors; your help in reviewing the merge would be greatly appreciated. -- paul moore www.paul-moore.com