Re: [RFC PATCH v4 2/2] selinux: overhaul sidtab to fix bug and improve performance

Ondrej Mosnacek <omosnace@xxxxxxxxxx> · Thu, 6 Dec 2018 10:36:41 +0100

On Wed, Dec 5, 2018 at 11:53 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> On Fri, Nov 30, 2018 at 10:24 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> > Before this patch, during a policy reload the sidtab would become frozen
> > and any attempt to map a new context to a SID would be unable to add a
> > new entry to the sidtab and would fail with -ENOMEM.
> >
> > Such failures are usually propagated into userspace, which has no way of
> > distinguishing them from actual allocation failures and thus doesn't
> > handle them gracefully. Such a situation can be triggered e.g. by the
> > following reproducer:
> >
> >     while true; do load_policy; echo -n .; sleep 0.1; done &
> >     for (( i = 0; i < 1024; i++ )); do
> >         runcon -l s0:c$i echo -n x || break
> >         # or:
> >         # chcon -l s0:c$i <some_file> || break
> >     done
> >
> > This patch overhauls the sidtab so it doesn't need to be frozen during
> > policy reload, thus solving the above problem.
> >
> > The new SID table leverages the fact that SIDs are allocated
> > sequentially and are never invalidated and stores them in linear buckets
> > indexed by a tree structure. This brings several advantages:
> >   1. Fast SID -> context lookup - this lookup can now be done in
> >      logarithmic time complexity (usually in less than 4 array lookups)
> >      and can still be done safely without locking.
> >   2. No need to re-search the whole table on reverse lookup miss - after
> >      acquiring the spinlock only the newly added entries need to be
> >      searched, which means that reverse lookups that end up inserting a
> >      new entry are now about twice as fast.
> >   3. No need to freeze sidtab during policy reload - it is now possible
> >      to handle insertion of new entries even during sidtab conversion.
> >
> > The tree structure of the new sidtab is able to grow automatically up
> > to about 2^31 entries (at which point it should not have more than about
> > 4 tree levels). The old sidtab had a theoretical capacity of almost 2^32
> > entries, but half of that is still more than enough since by that point
> > the reverse table lookups would become unusably slow anyway...
> >
> > The number of entries per tree node is selected automatically so that
> > each node fits into a single page, which should be the easiest size for
> > kmalloc() to handle.
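
To make the level arithmetic above concrete, here is a rough standalone
sketch (this is not the patch code). The 4096-byte node, the 64-byte
entry and all of the names are assumptions picked for illustration only;
the real values depend on PAGE_SIZE and on the size of the stored
context.

```c
#include <assert.h>
#include <stdint.h>

#define NODE_BYTES   4096u	/* assumed page size */
#define ENTRY_BYTES  64u	/* assumed size of one stored entry */
#define LEAF_ENTRIES (NODE_BYTES / ENTRY_BYTES)	/* 64 entries per leaf */
#define INNER_SHIFT  9u		/* 4096 / 8-byte pointers = 2^9 children */
#define MAX_INDEX    0x7fffffffu	/* about 2^31 entries */

/* Number of inner levels above the leaves needed to reach 'index'. */
static unsigned int levels_for_index(uint32_t index)
{
	uint32_t leaf = index / LEAF_ENTRIES;
	unsigned int levels = 0;

	assert(index <= MAX_INDEX);
	while (leaf != 0) {
		leaf >>= INNER_SHIFT;
		levels++;
	}
	return levels;
}

/* Child slot to follow at inner level 'level' (1 = just above the leaves). */
static uint32_t inner_slot(uint32_t index, unsigned int level)
{
	uint32_t leaf = index / LEAF_ENTRIES;

	assert(level >= 1 && (level - 1) * INNER_SHIFT < 32);
	return (leaf >> ((level - 1) * INNER_SHIFT)) & ((1u << INNER_SHIFT) - 1);
}

/* Position of the entry inside its leaf node. */
static uint32_t leaf_slot(uint32_t index)
{
	return index % LEAF_ENTRIES;
}
```

With these assumed sizes, index 2^31 - 1 needs three inner levels plus
the leaf, i.e. four array lookups, which is in line with the estimate in
the commit message.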
> >
> > Note that the cache for reverse lookup is preserved with equivalent
> > logic. The only difference is that instead of storing pointers to the
> > hash table nodes it stores just the indices of the cached entries.
> >
> > The new cache ensures that the indices are loaded/stored atomically, but
> > it still has the drawback that concurrent cache updates may mess up the
> > contents of the cache. Such a situation, however, only reduces its
> > effectiveness, not the correctness of lookups.
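
A minimal userspace sketch of the idea, using C11 atomics rather than
the kernel primitives. The cache size, the "empty" marker and the
function names are all hypothetical and only illustrate the
move-to-front update on a cache of indices:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RCACHE_SIZE  3			/* assumed cache size */
#define RCACHE_EMPTY UINT32_MAX		/* assumed "no entry" marker */

struct rcache {
	_Atomic uint32_t slot[RCACHE_SIZE];
};

static void rcache_init(struct rcache *c)
{
	for (int i = 0; i < RCACHE_SIZE; i++)
		atomic_init(&c->slot[i], RCACHE_EMPTY);
}

/* Read one slot; each load is atomic, so a reader never sees a torn index. */
static uint32_t rcache_get(struct rcache *c, int pos)
{
	assert(pos >= 0 && pos < RCACHE_SIZE);
	return atomic_load_explicit(&c->slot[pos], memory_order_relaxed);
}

/*
 * Move 'index' to the front. 'pos' is the slot where it was found, or
 * RCACHE_SIZE - 1 for a new entry. No lock is taken: racing updaters
 * may duplicate or drop indices, but every slot always holds either
 * RCACHE_EMPTY or some index that was stored earlier.
 */
static void rcache_update(struct rcache *c, uint32_t index, int pos)
{
	assert(pos >= 0 && pos < RCACHE_SIZE);
	while (pos > 0) {
		uint32_t prev = atomic_load_explicit(&c->slot[pos - 1],
						     memory_order_relaxed);

		atomic_store_explicit(&c->slot[pos], prev,
				      memory_order_relaxed);
		pos--;
	}
	atomic_store_explicit(&c->slot[0], index, memory_order_relaxed);
}
```

Since a reader still has to compare the context stored at the cached
index against the one it is looking for, a stale or duplicated slot only
costs a cache miss.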
> >
> > Tested by selinux-testsuite and thoroughly tortured by this simple
> > stress test:
> > ```
> > function rand_cat() {
> >         echo $(( $RANDOM % 1024 ))
> > }
> >
> > function do_work() {
> >         while true; do
> >                 echo -n "system_u:system_r:kernel_t:s0:c$(rand_cat),c$(rand_cat)" \
> >                         >/sys/fs/selinux/context 2>/dev/null || true
> >         done
> > }
> >
> > do_work >/dev/null &
> > do_work >/dev/null &
> > do_work >/dev/null &
> >
> > while load_policy; do echo -n .; sleep 0.1; done
> >
> > kill %1
> > kill %2
> > kill %3
> > ```
> >
> > Reported-by: Orion Poplawski <orion@xxxxxxxx>
> > Reported-by: Li Kun <hw.likun@xxxxxxxxxx>
> > Link: https://github.com/SELinuxProject/selinux-kernel/issues/38
> > Signed-off-by: Ondrej Mosnacek <omosnace@xxxxxxxxxx>
> > ---
> >  security/selinux/ss/mls.c      |  23 +-
> >  security/selinux/ss/mls.h      |   3 +-
> >  security/selinux/ss/services.c | 120 +++----
> >  security/selinux/ss/sidtab.c   | 556 ++++++++++++++++++++-------------
> >  security/selinux/ss/sidtab.h   |  80 +++--
> >  5 files changed, 459 insertions(+), 323 deletions(-)
>
> This also looks okay on quick inspection, and once again I know you
> and Stephen have gone over this a lot, so I've merged it into
> selinux/next.  However, I had to basically merge all of sidtab.c by
> hand so please double check it still looks correct to you; I've gone
> over it a few times and it looks like it matches, but it's easy to
> miss something small.

Thank you, I ran a diff with meld between the fixed and original
versions and I can confirm there are only whitespace/comment
differences.

Just one small nit though: I think you used a "bad" format for the
multiline comment in sidtab_convert(). Or at least Linus seems to hate
it [1] :) OTOH, Documentation/process/coding-style.rst [2] still lists
it as the preferred format for networking code... Not that it
bothers me, but that e-mail has stuck in my mind and now I almost
always notice the comment styles.

[1] https://lkml.org/lkml/2016/7/8/625
[2] https://www.kernel.org/doc/html/v4.19/process/coding-style.html#commenting
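
For reference, the two multi-line comment formats side by side. The
helper functions are made up purely to have something to attach the
comments to; only the comment layout matters here:

```c
#include <stdint.h>

/*
 * General kernel style: the opening line carries only the comment
 * marker and every following line starts with an aligned asterisk.
 */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Networking style (net/ and drivers/net/): the text starts right on
 * the opening line. As far as I can tell, this is the variant Linus
 * objects to in [1].
 */
static uint32_t max_u32(uint32_t a, uint32_t b)
{
	return a > b ? a : b;
}
```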

>
> Finally, one more reminder to use checkpatch on everything you submit.
> There were a number of errors in this patch too.
>
> [...]

--
Ondrej Mosnacek <omosnace at redhat dot com>
Associate Software Engineer, Security Technologies
Red Hat, Inc.