In my process of using selinux, I found that when semodule -i some.pp loads a large number of modules with rules, the loading time increases rapidly. (none) /tmp # time semodule -i 1.pp real 0m 1.085s user 0m 0.871s sys 0m 0.063s (none) /tmp # time semodule -i 2.pp real 0m 3.178s user 0m 2.890s sys 0m 0.170s (none) /tmp # time semodule -i 3.pp real 0m 13.365s user 0m 12.927s sys 0m 0.340s Then, I analyzed and found a function nodups_specs in label_file.c. The algorithm complexity of implementing this function is O(N^2). In my use scenario, the N number would be over 20,000, so it would be performed hundreds of millions of times. It takes 13s to install the policy. Therefore, I propose an optimization solution to this problem. The purpose of this function is to check for duplicates. I'd like to propose a new implementation that uses hash tables to check for duplicates. This patch set consists of two parts to implement this solution. - Migrate the existing hashtab templates in policycoreutils/newrole to libselinux. - Implement the hash function and key comparison function required by the hashtab template. Improve algorithm for checking duplicate items of nodups_specs function. Test again with the new implementation. (none) /tmp # time semodule -i 1.pp real 0m 0.725s user 0m 0.483s sys 0m 0.100s (none) /tmp # time semodule -i 2.pp real 0m 0.939s user 0m 0.597s sys 0m 0.228s (none) /tmp # time semodule -i 3.pp real 0m 1.885s user 0m 1.473s sys 0m 0.300s wanghuizhao (2): libselinux: migrating hashtab from policycoreutils libselinux: performance optimization for duplicate detection libselinux/src/hashtab.c | 208 ++++++++++++++++++++++++++++++++++++++++++++ libselinux/src/hashtab.h | 115 ++++++++++++++++++++++++ libselinux/src/label_file.c | 112 +++++++++++++++++++----- libselinux/src/label_file.h | 5 ++ 4 files changed, 416 insertions(+), 24 deletions(-) create mode 100644 libselinux/src/hashtab.c create mode 100644 libselinux/src/hashtab.h -- 2.12.3