Patch 23 does not seem to have made it to the list.

Jim

On Fri, Aug 11, 2023 at 12:45 PM Christian Göttsche <cgzones@xxxxxxxxxxxxxx> wrote:
>
> Currently the database for the file backend of selabel stores the file
> context specifications in a single long array.  This array is sorted by
> special precedence rules, e.g. regular expressions without meta
> characters first, ordered by length, and the remaining regular
> expressions ordered by stem (the prefix part of the regular expressions
> without meta characters) length.
>
> This results in suboptimal lookup performance for two reasons:
>   * File context specifications without any meta characters (e.g.
>     '/etc/passwd') are still matched via an expensive regular expression
>     match operation.
>   * All such trivial regular expressions are matched against before any
>     non-trivial regular expression, resulting in thousands of regex match
>     operations for lookups for paths not matching any of the trivial ones.
>
> Rework the internal representation of the database in two ways:
>   * Convert regular expressions without any meta characters and containing
>     only supported escaped characters (e.g. '/etc/rc\.d/init\.d') into
>     literal strings, which get compared via strcmp(3) later on.
>   * Store the specifications in a tree structure (since the filesystem is a
>     tree) to reduce the number of specifications that need to be checked.
>
> Since the internal representation is completely rewritten, introduce a
> new compiled file context file format mirroring the tree structure.
> The new format also stores all multi-byte data in network byte-order, so
> that such compiled files can be cross-compiled, e.g. for embedded
> devices with read-only filesystems (except for the regular expressions,
> which are still architecture-dependent).
>
> The improved lookup performance will also benefit SELinux aware daemons,
> which create files with their default context, e.g. systemd.
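
The literal conversion described in the quoted text can be sketched in C roughly as follows. This is only an illustration and not the code from patch 23: the helper names `is_meta` and `regex_to_literal` are made up here, and both the set of meta characters and the reading of "supported escaped characters" as "escaped meta characters" are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Characters treated as regex meta characters in this sketch (assumption). */
static bool is_meta(char c)
{
	return c != '\0' && strchr(".^$?*+|[](){}\\", c) != NULL;
}

/*
 * Hypothetical helper: try to turn `regex` into a literal string.
 *
 * Returns true and writes the unescaped literal to `out` if the pattern
 * contains no unescaped meta character and only escapes of meta
 * characters (so "\." becomes ".").  Returns false if the pattern has to
 * stay a regular expression or if `out` is too small.
 */
static bool regex_to_literal(const char *regex, char *out, size_t outlen)
{
	size_t n = 0;

	for (const char *p = regex; *p != '\0'; p++) {
		char c = *p;

		if (c == '\\') {
			c = *++p;
			/* "\d", "\w" or a trailing backslash are not literal. */
			if (!is_meta(c))
				return false;
		} else if (is_meta(c)) {
			return false;
		}

		/* Keep one byte free for the terminating NUL. */
		if (n + 1 >= outlen)
			return false;
		out[n++] = c;
	}

	if (outlen == 0)
		return false;
	out[n] = '\0';
	return true;
}
```

With this sketch, '/etc/rc\.d/init\.d' turns into '/etc/rc.d/init.d' and could then be compared with strcmp(3), while a pattern such as `/usr/(.*/)?lib(/.*)?` is rejected and stays a regular expression.
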
>
> # Performance data
>
> ## Compiled file context sizes
>
> Fedora 38 (regular expressions are omitted on Fedora):
>     file_contexts.bin:            596783  ->   575284  (bytes)
>     file_contexts.homedirs.bin:    21219  ->    18185  (bytes)
>
> Debian Sid (regular expressions are included):
>     file_contexts.bin:           2580704  ->  1428354  (bytes)
>     file_contexts.homedirs.bin:   130946  ->    96884  (bytes)
>
> ## Single lookup
>
> (selabel -b file -k /bin/bash)
>
> Fedora 38 in VM:
>     text:      time:       3.6 ms   ->  4.7 ms
>                peak heap:  2.32M    ->  1.44M
>                peak rss:   5.61M    ->  6.03M
>     compiled:  time:       1.5 ms   ->  1.5 ms
>                peak heap:  2.14M    ->  917.93K
>                peak rss:   5.33M    ->  5.47M
>
> Debian Sid on Raspberry Pi 3:
>     text:      time:       33.9 ms  ->  19.9 ms
>                peak heap:  10.46M   ->  468.72K
>                peak rss:   9.44M    ->  4.98M
>     compiled:  time:       39.3 ms  ->  22.8 ms
>                peak heap:  13.09M   ->  1.86M
>                peak rss:   12.57M   ->  7.86M
>
> ## Full filesystem relabel
>
> (restorecon -vRn /)
>
> Fedora 38 in VM:
>     27.445 s  ->   3.293 s
> Debian Sid on Raspberry Pi 3:
>     86.734 s  ->  10.810 s
>
> (restorecon -vRn -T0 /)
>
> Fedora 38 in VM (8 cores):
>     29.205 s  ->   2.521 s
> Debian Sid on Raspberry Pi 3 (4 cores):
>     46.974 s  ->  10.728 s
>
> (note: I am unsure why the parallel runs on Fedora are slower)
>
> # TODO
>
> There might be subtle differences in lookup results which evaded my
> testing, because some precedence rules are oblique.  For example
> `/usr/(.*/)?lib(/.*)?` has to have a higher precedence than
> `/usr/(.*/)?bin(/.*)?` to match the current Fedora behavior.  Please
> report any behavior changes.
>
> If any code section is unclear I am happy to add some inline comments.
>
> The maximum node depth in the database is set to 3, which seems to give
> the best performance to memory usage ratio.  Might be tweaked for
> systems with different filesystem hierarchies (Android?).
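
To illustrate the depth limit mentioned in the quoted TODO, here is a minimal C sketch of how a lookup path might be cut down to the directory prefix used for descending such a tree. It is a guess at the general idea, not the logic of the patch: `MAX_NODE_DEPTH` and `node_prefix_len` are hypothetical names, and only the value 3 comes from the cover letter.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NODE_DEPTH 3 /* value quoted in the cover letter */

/*
 * Hypothetical helper: return the length of the prefix of `path` that
 * covers at most `max_depth` leading directory components.  Only
 * components followed by another '/' count, so the final file name is
 * never used as a node key.  Returns 0 if no such component exists.
 */
static size_t node_prefix_len(const char *path, unsigned max_depth)
{
	size_t len = 0;
	const char *p = path;

	for (unsigned depth = 0; depth < max_depth; depth++) {
		const char *start, *end;

		if (*p != '/')
			break;
		start = p + 1;
		end = strchr(start, '/');
		/* Stop at the last component or at an empty one ("//"). */
		if (end == NULL || end == start)
			break;
		len = (size_t)(end - path);
		p = end;
	}

	return len;
}
```

In this sketch a lookup for /usr/lib/systemd/system/foo.service would descend along /usr/lib/systemd, whereas /bin/bash would only descend into /bin; specifications stored at deeper or shallower nodes would not need to be checked.
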
>
> I am not that familiar with the selabel_partial_match(3),
> selabel_get_digests_all_partial_matches(3) and
> selabel_hash_all_partial_matches(3) related interfaces, so I only did
> some rudimentary tests for them.
>
>
> # Patches
>
> Patches 1-4 have been proposed already:
>   https://patchwork.kernel.org/project/selinux/list/?series=772728
>
> Patch 5 has been proposed already:
>   https://patchwork.kernel.org/project/selinux/patch/20230803162301.302579-1-cgzones@xxxxxxxxxxxxxx/
>
> Patches 6-22 are cleanup and misc fixes which can be applied on their own.
>
> Patch 23 is the rework.
>
> Patch 24 removes unused code after the rework in patch 23.
>
> This patchset is also available at https://github.com/SELinuxProject/selinux/pull/406
>
>
> Christian Göttsche (24):
>   libselinux/utils: update selabel_partial_match
>   libselinux: misc label cleanup
>   libselinux: drop obsolete optimization flag
>   libselinux: drop unnecessary warning overrides
>   setfiles: do not issue AUDIT_FS_RELABEL on dry run
>   libselinux: cast to unsigned char for character handling function
>   libselinux: constify selabel_cmp(3) parameters
>   libselinux: introduce reallocarray(3)
>   libselinux: simplify zeroing allocation
>   libselinux: introduce selabel_nuke
>   libselinux/utils: use type safe union assignment
>   libselinux: avoid regex serialization truncations
>   libselinux/utils: introduce selabel_compare
>   libselinux: parameter simplifications
>   libselinux/utils: use correct type for backend argument
>   libselinux: update string_to_mode()
>   libselinux: remove SELABEL_OPT_SUBSET support from selabel_file(5)
>   libselinux: fix logic for building android backend
>   libselinux: avoid unused function
>   libselinux: check for stream rewind failures
>   libselinux: simplify internal selabel_validate prototype
>   libselinux/utils: drop include of internal header file
>   libselinux: rework selabel_file(5) database
>   libselinux: remove unused hashtab code
>
>  libselinux/include/selinux/label.h            |    6 +-
>  libselinux/include/selinux/selinux.h          |    6 +-
>  libselinux/src/Makefile                       |   20 +-
>  libselinux/src/booleans.c                     |    8 +-
>  libselinux/src/compute_create.c               |    2 +-
>  libselinux/src/get_context_list.c             |   14 +-
>  libselinux/src/get_default_type.c             |    2 +-
>  libselinux/src/hashtab.c                      |  234 --
>  libselinux/src/hashtab.h                      |  117 -
>  libselinux/src/is_customizable_type.c         |    7 +-
>  libselinux/src/label.c                        |   40 +-
>  libselinux/src/label_backends_android.c       |    9 +-
>  libselinux/src/label_file.c                   | 2107 +++++++++++------
>  libselinux/src/label_file.h                   |  893 ++++---
>  libselinux/src/label_internal.h               |   17 +-
>  libselinux/src/label_media.c                  |    7 +-
>  libselinux/src/label_support.c                |   43 +-
>  libselinux/src/label_x.c                      |    7 +-
>  libselinux/src/load_policy.c                  |    2 +-
>  libselinux/src/matchmediacon.c                |    6 +-
>  libselinux/src/matchpathcon.c                 |   17 +-
>  libselinux/src/regex.c                        |   57 +-
>  .../src/selinux_check_securetty_context.c     |    4 +-
>  libselinux/src/selinux_config.c               |   12 +-
>  libselinux/src/selinux_internal.c             |   16 +
>  libselinux/src/selinux_internal.h             |    4 +
>  libselinux/src/selinux_restorecon.c           |    3 +-
>  libselinux/src/seusers.c                      |    6 +-
>  libselinux/utils/.gitignore                   |    2 +
>  libselinux/utils/matchpathcon.c               |   11 +-
>  libselinux/utils/sefcontext_compile.c         |  536 +++--
>  libselinux/utils/selabel_compare.c            |  119 +
>  libselinux/utils/selabel_digest.c             |    3 +-
>  .../selabel_get_digests_all_partial_matches.c |    2 -
>  libselinux/utils/selabel_lookup.c             |    3 +-
>  libselinux/utils/selabel_nuke.c               |  134 ++
>  libselinux/utils/selabel_partial_match.c      |    7 +-
>  libselinux/utils/selinux_check_access.c       |    2 +-
>  policycoreutils/setfiles/setfiles.c           |   16 +-
>  39 files changed, 2854 insertions(+), 1647 deletions(-)
>  delete mode 100644 libselinux/src/hashtab.c
>  delete mode 100644 libselinux/src/hashtab.h
>  create mode 100644 libselinux/utils/selabel_compare.c
>  create mode 100644 libselinux/utils/selabel_nuke.c
>
> --
> 2.40.1