[PATCH 0/2] Improve efficiency of detecting duplicate in libselinux

wanghuizhao <wanghuizhao1@xxxxxxxxxx> · Thu, 9 Feb 2023 19:42:51 +0800

In my process of using selinux, I found that when semodule -i some.pp loads
a large number of modules with rules, the loading time increases rapidly.

(none) /tmp # time semodule -i 1.pp
real    0m 1.085s
user    0m 0.871s
sys     0m 0.063s
(none) /tmp # time semodule -i 2.pp
real    0m 3.178s
user    0m 2.890s
sys     0m 0.170s
(none) /tmp # time semodule -i 3.pp
real    0m 13.365s
user    0m 12.927s
sys     0m 0.340s

Then, I analyzed and found a function nodups_specs in label_file.c. The
algorithm complexity of implementing this function is O(N^2). In my use
scenario, the N number would be over 20,000, so it would be performed
hundreds of millions of times. It takes 13s to install the policy.

Therefore, I propose an optimization solution to this problem. The purpose
of this function is to check for duplicates. I'd like to propose a new
implementation that uses hash tables to check for duplicates.

This patch set consists of two parts to implement this solution.

- Migrate the existing hashtab templates in policycoreutils/newrole to
  libselinux.

- Implement the hash function and key comparison function required by the
  hashtab template. Improve algorithm for checking duplicate items of
  nodups_specs function.

Test again with the new implementation.

(none) /tmp # time semodule -i 1.pp
real    0m 0.725s
user    0m 0.483s
sys     0m 0.100s
(none) /tmp # time semodule -i 2.pp
real    0m 0.939s
user    0m 0.597s
sys     0m 0.228s
(none) /tmp # time semodule -i 3.pp
real    0m 1.885s
user    0m 1.473s
sys     0m 0.300s

wanghuizhao (2):
  libselinux: migrating hashtab from policycoreutils
  libselinux: performance optimization for duplicate detection

 libselinux/src/hashtab.c    | 208 ++++++++++++++++++++++++++++++++++++++++++++
 libselinux/src/hashtab.h    | 115 ++++++++++++++++++++++++
 libselinux/src/label_file.c | 112 +++++++++++++++++++-----
 libselinux/src/label_file.h |   5 ++
 4 files changed, 416 insertions(+), 24 deletions(-)
 create mode 100644 libselinux/src/hashtab.c
 create mode 100644 libselinux/src/hashtab.h

-- 
2.12.3