Re: [PATCH] libselinux: Workaround for heap overhead of pcre

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 19 Dec 2022 at 09:53, Inseob Kim <inseob@xxxxxxxxxx> wrote:
>
> pcre's behavior is changed so that pcre2_match always allocates heap for
> match_data, rather than stack, regardless of size. The heap isn't freed
> until explicitly calling pcre2_match_data_free. This new behavior may
> result in heap overhead, which may increase the peak memory usage about
> a few megabytes. It's because regex_match is first called for regex_data
> objects, and then regex_data objects are freed at once.

This approach trades peak memory usage for temporary allocations,
which effects runtime performance.  On modern systems memory is most
of the time not a scarce resource.

Some examples:

# selabel_lookup -b file -k /etc/shadow -t file [heaptrack]

## current

total runtime: 0.07s.
calls to allocation functions: 28420 (406000/s)
temporary memory allocations: 16 (228/s)
peak heap memory consumption: 10.09M
peak RSS (including heaptrack overhead): 21.27M
total memory leaked: 1.02K

## proposed

total runtime: 0.06s.
calls to allocation functions: 23430 (366093/s)
temporary memory allocations: 675 (10546/s)
peak heap memory consumption: 9.48M
peak RSS (including heaptrack overhead): 18.59M
total memory leaked: 1.02K

# restorecon -vRn /etc [heaptrack]

## current

total runtime: 0.14s.
calls to allocation functions: 33873 (236874/s)
temporary memory allocations: 1877 (13125/s)
peak heap memory consumption: 10.09M
peak RSS (including heaptrack overhead): 21.58M
total memory leaked: 1.90K

## proposed

total runtime: 0.27s.
calls to allocation functions: 378762 (1423917/s)
temporary memory allocations: 351487 (1321379/s)
peak heap memory consumption: 9.48M
peak RSS (including heaptrack overhead): 20.99M
total memory leaked: 1.90K


# restorecon -vRn /usr [hyperfine]

## current

restorecon -vRn /usr
Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr
  Time (mean ± σ):     24.419 s ±  0.661 s    [User: 23.480 s, System: 0.922 s]
  Range (min … max):   23.399 s … 25.495 s    10 runs

## proposed

restorecon -vRn /usr
Benchmark 1: ~/destdir/sbin/restorecon -vRn /usr
  Time (mean ± σ):     28.628 s ±  0.968 s    [User: 27.688 s, System: 0.927 s]
  Range (min … max):   27.674 s … 30.798 s    10 runs


So I would argue the performance impact for applications (like
setfiles, restorecon) or daemon (like systemd, udev) is more critical
than the 500K per application.

> To workaround it, free and reallocate match_data whenever we call
> regex_match. It's fine because libselinux currently doesn't use
> match_data, but use only the return value.
>
> Signed-off-by: Inseob Kim <inseob@xxxxxxxxxx>
> ---
>  libselinux/src/regex.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/libselinux/src/regex.c b/libselinux/src/regex.c
> index 149a7973..2df282f1 100644
> --- a/libselinux/src/regex.c
> +++ b/libselinux/src/regex.c
> @@ -213,10 +213,20 @@ void regex_data_free(struct regex_data *regex)
>  int regex_match(struct regex_data *regex, char const *subject, int partial)
>  {
>         int rc;
> +       pcre2_match_data *new_match_data;
>         __pthread_mutex_lock(&regex->match_mutex);
> +       new_match_data = pcre2_match_data_create_from_pattern(
> +           regex->regex, NULL);

Should be checked for failure (cause pcre2_match() expects a non-NULL
match_data, which would be passed the second time).

Also with this change the member match_data of the struct regex_data
becomes obsolete and should be removed.

>         rc = pcre2_match(
>             regex->regex, (PCRE2_SPTR)subject, PCRE2_ZERO_TERMINATED, 0,
>             partial ? PCRE2_PARTIAL_SOFT : 0, regex->match_data, NULL);
> +       // pcre2_match allocates heap and it won't be freed until
> +       // pcre2_match_data_free, resulting in heap overhead.
> +       // Reallocate match_data to prevent such overhead, whenever possible.
> +       if (new_match_data) {
> +               pcre2_match_data_free(regex->match_data);
> +               regex->match_data = new_match_data;
> +       }
>         __pthread_mutex_unlock(&regex->match_mutex);
>         if (rc > 0)
>                 return REGEX_MATCH;
> --
> 2.39.0.314.g84b9a713c41-goog
>




[Index of Archives]     [Selinux Refpolicy]     [Linux SGX]     [Fedora Users]     [Fedora Desktop]     [Yosemite Photos]     [Yosemite Camping]     [Yosemite Campsites]     [KDE Users]     [Gnome Users]

  Powered by Linux