On Tue, Dec 3, 2019 at 1:33 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote: > On Fri, Nov 22, 2019 at 4:33 AM Jeff Vander Stoep <jeffv@xxxxxxxxxx> wrote: > > This replaces the reverse table lookup and reverse cache with a > > hashtable which improves cache-miss reverse-lookup times from > > O(n) to O(1)* and maintains the same performance as a reverse > > cache hit. > > > > This reduces the time needed to add a new sidtab entry from ~500us > > to 5us on a Pixel 3 when there are ~10,000 sidtab entries. > > > > The implementation uses the kernel's generic hashtable API, > > It uses the context's string represtation as the hash source, > > and the kernels generic string hashing algorithm full_name_hash() > > to reduce the string to a 32 bit value. > > > > This change also maintains the improvement introduced in > > commit ee1a84fdfeed ("selinux: overhaul sidtab to fix bug and improve > > performance") which removed the need to keep the current sidtab > > locked during policy reload. It does however introduce periodic > > locking of the target sidtab while converting the hashtable. Sidtab > > entries are never modified or removed, so the context struct stored > > in the sid_to_context tree can also be used for the context_to_sid > > hashtable to reduce memory usage. > > > > This bug was reported by: > > - On the selinux bug tracker. > > BUG: kernel softlockup due to too many SIDs/contexts #37 > > https://github.com/SELinuxProject/selinux-kernel/issues/37 > > - Jovana Knezevic on Android's bugtracker. > > Bug: 140252993 > > "During multi-user performance testing, we create and remove users > > many times. selinux_android_restorecon_pkgdir goes from 1ms to over > > 20ms after about 200 user creations and removals. Accumulated over > > ~280 packages, that adds a significant time to user creation, > > making perf benchmarks unreliable." > > > > * Hashtable lookup is only O(1) when n < the number of buckets. > > > > Changes in V2: > > -The hashtable uses sidtab_entry_leaf objects directly so these > > objects are shared between the sid_to_context lookup tree and the > > context_to_sid hashtable. This simplifies memory allocation and > > was suggested by Ondrej Mosnacek. > > -The new sidtab hash stats file in selinuxfs has been moved out of > > the avc dir and into a new "ss" dir. > > > > V3: > > -Add lock nesting notation. > > > > V4/V5: > > -Moved to *_rcu variants of the various hashtable functions > > as suggested by Will Deacon. > > -Naming/spelling fixups. > > > > V6 > > -Remove nested locking. Use lock of active sidtab to gate > > access to the new sidtab. > > -Remove use of rcu_head/kfree_rcu(), they're unnecessary because > > hashtable objects are never removed when read/add operations are > > occurring. Why is this safe? Quoting Ondrej Mosnacek from the > > selinux mailing list: > > "It is not visible in this patch, but the sidtab (along with other > > policy-lifetime structures) is protected by a big fat read-write lock. > > The only places where sidtab_destroy() is called are (a) error paths > > when initializing a new sidtab (here the sidtab isn't shared yet, so > > no race) and (b) when freeing the old sidtab during policy reload - in > > this case it is happening after a policy write-locked critical > > section, which had removed the old sidtab pointer from the shared > > structures, so at that point all sidtab readers will already be > > accessing the new sidtab and the old one is visible only by the thread > > doing the destruction." > > > > V7 > > -Change format of /sys/fs/selinux/ss/sidtab_hash_stats to match > > /sys/fs/selinux/avc/hash_stats. > > -Add __rcu annotation to rcu pointers. > > -Test with CONFIG_SPARSE_RCU_POINTER and CONFIG_PROVE_RCU. > > -Add rcu@xxxxxxxxxxxxxxx and Paul McKenney to Cc for review of the > > RCU logic. > > > > V8 > > -Removed the __rcu annotation used in V7. The annotation is > > intended to be applied to pointers to an object, however the > > objects referenced in the rcu hashtable are allocated in an > > array. > > -Fixed bug where multiple SIDs were receiving the same hash > > due to security_get_user_sids() reusing the same context > > struct without calling context_init() on it. This bug was > > discovered and root-caused by Stephen Smalley. > > > > V9 > > -Do not compute the hash in string_to_context_struct > > because this string representation is non-canonical. > > > > Signed-off-by: Jeff Vander Stoep <jeffv@xxxxxxxxxx> > > Reported-by: Stephen Smalley <sds@xxxxxxxxxxxxx> > > Reported-by: Jovana Knezevic <jovanak@xxxxxxxxxx> > > --- > > security/selinux/Kconfig | 12 ++ > > security/selinux/include/security.h | 1 + > > security/selinux/selinuxfs.c | 65 +++++++ > > security/selinux/ss/context.h | 11 +- > > security/selinux/ss/policydb.c | 5 + > > security/selinux/ss/services.c | 96 +++++++--- > > security/selinux/ss/services.h | 4 +- > > security/selinux/ss/sidtab.c | 263 ++++++++++++++-------------- > > security/selinux/ss/sidtab.h | 16 +- > > 9 files changed, 306 insertions(+), 167 deletions(-) > > Thanks Jeff, as well as everyone else who contributed reviews and feedback. > > I've pulled this into a working branch and I'll be merging it with the > other sidtab patches before posting it to a "next-queue" branch for > review later this week. When done, I'll send a note to the list, as > well as the relevant patch authors; your help in reviewing the merge > would be greatly appreciated. I tried doing the merge on my own here [1], you can use it as a sanity check if we came to the same/similar result. I based it off your existing next-queue, which contains only Jeff's patch at the time of writing. I only build-tested it so far. Note that there are two whitespace cleanups included in the string cache commit that I intuitively did while resolving the merge conflicts. You might want to move those to the first commit or just ignore them. [1] https://gitlab.com/omos/linux-public/compare/selinux-next...rebase-selinux-sidtab-string-cache -- Ondrej Mosnacek <omosnace at redhat dot com> Software Engineer, Security Technologies Red Hat, Inc.