Re: [PATCH v2] selinux: sidtab: reverse lookup hash table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 5, 2019 at 2:28 PM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
>
> On Mon, Nov 4, 2019 at 8:15 PM Jeff Vander Stoep <jeffv@xxxxxxxxxx> wrote:
> > This replaces the reverse table lookup and reverse cache with a
> > hashtable which improves cache-miss reverese-lookup times from
> > O(n) to O(1)* and maintains the same performance as a reverse
> > cache hit.
> >
> > This reduces the time needed to add a new sidtab entry from ~500us
> > to 5us on a Pixel 3 when there are ~10,000 sidtab entries.
> >
> > The implementation uses the kernel's generic hashtable API,
> > It uses the context's string represtation as the hash source,
> > and the kernels generic string hashing algorithm full_name_hash()
> > to reduce the string to a 32 bit value.
> >
> > This change also maintains the improvement introduced in commit
> > ee1a84fd which removed the need to keep the current sidtab locked
> > during policy reload. It does however introduce periodic locking of
> > the target sidtab while converting the hashtable. Sidtab entries
> > are never modified or removed, so the context struct stored in the
> > sid_to_context tree can also be used for the context_to_sid
> > hashtable to reduce memory usage.
> >
> > This bug was reported by:
> > - Stephen Smally on the selinux bug tracker.
> >   BUG: kernel softlockup due to too many SIDs/contexts #37
> >   https://github.com/SELinuxProject/selinux-kernel/issues/37
> > - Jovana Knezevic on Android's bugtracker.
> >   Bug: 140252993
> >   "During multi-user performance testing, we create and remove users
> >   many times. selinux_android_restorecon_pkgdir goes from 1ms to over
> >   20ms after about 200 user creations and removals. Accumulated over
> >   ~280 packages, that adds a significant time to user creation,
> >   making perf benchmarks unreliable."
> >
> > * Hashtable lookup is only O(1) when n < the number of buckets.
> >
> > Changes in V2:
> > -The hashtable uses sidtab_entry_leaf objects directly so these
> > objects are shared between the sid_to_context lookup tree and the
> > context_to_sid hashtable.
> > -Move the hash_add() statement to after the smp_store_release()
> > to ensure that the contents have been written before the object
> > becomes available in the hashtable.
> > -The new sidtab hash stats file in selinuxfs has been moved out of
> > the avc dir and into a new "ss" dir.
> >
> > Signed-off-by: Jeff Vander Stoep <jeffv@xxxxxxxxxx>
> > Reported-by: Stephen Smalley <sds@xxxxxxxxxxxxx>
> > Reported-by: Jovana Knezevic <jovanak@xxxxxxxxxx>
> > ---
> >  security/selinux/Kconfig            |  12 ++
> >  security/selinux/include/security.h |   1 +
> >  security/selinux/selinuxfs.c        |  64 +++++++
> >  security/selinux/ss/context.h       |  11 +-
> >  security/selinux/ss/policydb.c      |   5 +
> >  security/selinux/ss/services.c      |  87 +++++++---
> >  security/selinux/ss/services.h      |   4 +-
> >  security/selinux/ss/sidtab.c        | 249 +++++++++++++---------------
> >  security/selinux/ss/sidtab.h        |  16 +-
> >  9 files changed, 287 insertions(+), 162 deletions(-)
> >
> > diff --git a/security/selinux/Kconfig b/security/selinux/Kconfig
> > index 5711689deb6a..2d9788ad2d77 100644
> > --- a/security/selinux/Kconfig
> > +++ b/security/selinux/Kconfig
> > @@ -85,3 +85,15 @@ config SECURITY_SELINUX_CHECKREQPROT_VALUE
> >           via /selinux/checkreqprot if authorized by policy.
> >
> >           If you are unsure how to answer this question, answer 0.
> > +
> > +config SECURITY_SELINUX_SIDTAB_HASH_BITS
> > +       int "NSA SELinux sidtab hashtable size"
> > +       depends on SECURITY_SELINUX
> > +       range 8 13
> > +       default 9
> > +       help
> > +         This option sets the number of buckets used in the sitab hashtable
> > +         to 2^SECURITY_SELINUX_SIDTAB_HASH_BITS buckets. The number of hash
> > +         collisions may be viewed at /selinux/ss/sidtab_hash_stats. If chain
> > +         lengths are high (e.g. > 20) than selecting a higher value here will
>
> s/than/then/
>
> > +         ensure that lookups times are fast and stable.
>
> lookup times are short and stable
>
> <snip>
>
> > diff --git a/security/selinux/ss/sidtab.c b/security/selinux/ss/sidtab.c
> > index 7d49994e8d5f..31cdb4848d67 100644
> > --- a/security/selinux/ss/sidtab.c
> > +++ b/security/selinux/ss/sidtab.c
> > @@ -17,26 +17,37 @@
> >  #include "security.h"
> >  #include "sidtab.h"
> >
> > +#define index_to_sid(index) (index + SECINITSID_NUM + 1)
> > +#define sid_to_index(sid) (sid - (SECINITSID_NUM + 1))
> > +
> >  int sidtab_init(struct sidtab *s)
> >  {
> >         u32 i;
> >
> >         memset(s->roots, 0, sizeof(s->roots));
> >
> > -       /* max count is SIDTAB_MAX so valid index is always < SIDTAB_MAX */
> > -       for (i = 0; i < SIDTAB_RCACHE_SIZE; i++)
> > -               s->rcache[i] = SIDTAB_MAX;
> > -
> >         for (i = 0; i < SECINITSID_NUM; i++)
> >                 s->isids[i].set = 0;
> >
> >         s->count = 0;
> >         s->convert = NULL;
> > +       hash_init(s->context_to_sid);
> >
> >         spin_lock_init(&s->lock);
> >         return 0;
> >  }
> >
> > +static u32 context_to_sid(struct sidtab *s, struct context *context)
> > +{
> > +       struct sidtab_entry_leaf *entry;
> > +
> > +       hash_for_each_possible(s->context_to_sid, entry, list, context->hash) {
> > +               if (context_cmp(&entry->context, context))
> > +                       return entry->sid;
>
> For this to be semantically safe against entries being added, it would
> need the hlist head -> first member to be read with
> smp_load_acquire(). Probably in practice you would always get fresh
> data when dereferencing a pointer, but I don't think we should rely on
> that... In fact it looks like hash_for_each_possible() wasn't really
> written with lockless traversal in mind - I checked a few random
> places that call it and they all either do it under some lock or
> expect the table to be immutable. I believe you will need to introduce
> a new equivalents of hash_add()/hlist_add_head() (that does
> smp_store_release() instead of WRITE_ONCE(), because the 'next'
> pointer of the node being added must be written before the head's
> first pointer is set to point to the new node) and
> hash_for_each_possible()/hlist_for_each_entry() (that loads
> (head)->first with smp_load_acquire()).
>
> Then again, I'm not an expert on this synchronization stuff, so I
> might be wrong... But you will have to prove it :)
>
> > +       }
> > +       return 0;
> > +}
> > +
>
> I am a little uneasy about this function having the same name as the
> context_to_sid() in security.c. It could be confusing when debugging.
> Maybe you could name the security.c function to
> context_struct_to_sid()?
>
> >  int sidtab_set_initial(struct sidtab *s, u32 sid, struct context *context)
> >  {
> >         struct sidtab_isid_entry *entry;
>
> <snip>
>
> > @@ -305,29 +275,35 @@ static int sidtab_reverse_lookup(struct sidtab *s, struct context *context,
> >                 rc = -ENOMEM;
> >                 dst_convert = sidtab_do_lookup(convert->target, count, 1);
> >                 if (!dst_convert) {
> > -                       context_destroy(dst);
> > +                       context_destroy(&dst->context);
> >                         goto out_unlock;
> >                 }
> >
> > -               rc = convert->func(context, dst_convert, convert->args);
> > +               rc = convert->func(context, &dst_convert->context,
> > +                               convert->args);
> >                 if (rc) {
> > -                       context_destroy(dst);
> > +                       context_destroy(&dst->context);
> >                         goto out_unlock;
> >                 }
> > +               dst_convert->sid = index_to_sid(count);
> >
> >                 /* at this point we know the insert won't fail */
> > +               spin_lock_irqsave(&convert->target->lock, flags);
> >                 convert->target->count = count + 1;
> > +               hash_add(convert->target->context_to_sid,
> > +                        &dst_convert->list, dst_convert->context.hash);
> > +               spin_unlock_irqrestore(&convert->target->lock, flags);
>
> Do we really need to lock the conversion target here? The
> (undocumented) contract here was that as long as the conversion is in
> progress, the target sidtab is "owned" by the master sidtab and any
> modifications of it would be done under the master's spinlock (the
> target's spinlock wouldn't be used until it is swapped-in to be the
> new master). Did this contract change with your patch? I think this
> nested locking is the cause of the lock warning that Stephen is
> getting so hopefully you can fix it by simply removing this extra
> locking.
>
> <snip>
>
> --
> Ondrej Mosnacek <omosnace at redhat dot com>
> Software Engineer, Security Technologies
> Red Hat, Inc.
>



[Index of Archives]     [Selinux Refpolicy]     [Linux SGX]     [Fedora Users]     [Fedora Desktop]     [Yosemite Photos]     [Yosemite Camping]     [Yosemite Campsites]     [KDE Users]     [Gnome Users]

  Powered by Linux