On Wed, Mar 22, 2023 at 12:39 AM Eric Biggers <ebiggers@xxxxxxxxxx> wrote: > > Hi Carlo, > > On Mon, Jan 23, 2023 at 01:40:46AM +0000, Carlo Marcelo Arenas Belón wrote: > > Since 30b3e9d2 (libselinux: Workaround for heap overhead of pcre, > > 2023-01-12), performance of PCRE2 matches has been affected due to > > excesive recreation of the match_data in an attempt to reduce memory > > utilization; instead of a workaround, it would be better to address > > the problem and maybe even improve performance in the process. > > > > The issue is that currently the structure that holds PCRE state has > > both a pcre2_code (which is per pattern) and a pcre2_match_data (which > > is per match), forcing us to add a mutex to prevent multiple matches to > > step on each other. > > > > Lets remove the match_data and the mutex and instead allocate one once > > in a thread independent way that could be used and reused, by extending > > our pthread interface to not only store TLS variables but also retrieve > > them, and then use one of those. > > > > Since we are not interested on the capture groups (if any) lets only > > allocate 1 pair which is all that will be needed and change the logic > > so that a return of 0 (which means the pattern matched but there were > > not enough capture spots) is also considered a match. > > > > This will ensure that the memory use would be bound to the number of > > concurrent matches instead of the number of patterns and therefore > > reduce the impact that recent changes on the way that the frames used > > for matching are allocated might had brough since 10.41 was released. > > > > For cases where threads are not available, just keep it working in slow > > mode as done before the workaround was reverted. > > > > Signed-off-by: Carlo Marcelo Arenas Belón <carenas@xxxxxxxxx> > > --- > > libselinux/src/regex.c | 108 +++++++++++++++++------------- > > libselinux/src/selinux_internal.h | 4 ++ > > 2 files changed, 64 insertions(+), 48 deletions(-) > > > > Thanks for writing this patch! I notice it hasn't been applied upstream. Is > this still being worked on? > > A couple comments below: > > > +static void __attribute__((destructor)) match_data_thread_free(void *key) > > +{ > > + void *value; > > + pcre2_match_data *match_data; > > + > > + if (match_data_key_initialized <= 0 || !match_data_initialized) > > + return; > > + > > + value = __selinux_getspecific(match_data_key); > > + match_data = value ? value : key; > > + > > + pcre2_match_data_free(match_data); > > + > > + __pthread_mutex_lock(&key_mutex); > > + if (--match_data_key_initialized == 1) { > > + __selinux_key_delete(match_data_key); > > + match_data_key_initialized = -1; > > + } > > + __pthread_mutex_unlock(&key_mutex); > > +} > > This function is used as a pthread_key destructor. But, it's also marked with > __attribute__((destructor)), making it a library-level destructor too. That's > confusing. Is there a reason for that? I would have expected these to be two > different functions. The pthread_key destructor should call > pcre2_match_data_free on the value passed as an argument, while the > library-level destructor should delete match_data_key. > > > int regex_match(struct regex_data *regex, char const *subject, int partial) > > { > > int rc; > > - pcre2_match_data *match_data; > > - __pthread_mutex_lock(®ex->match_mutex); > > + bool slow; > > + pcre2_match_data *match_data = NULL; > > + > > + if (match_data_key_initialized > 0) { > > + if (match_data_initialized == 0) { > > + match_data = pcre2_match_data_create(1, NULL); > > + if (match_data) { > > + match_data_initialized = 1; > > + __selinux_setspecific(match_data_key, > > + match_data); > > + __pthread_mutex_lock(&key_mutex); > > + match_data_key_initialized++; > > + __pthread_mutex_unlock(&key_mutex); > > + } > > + } else > > + match_data = __selinux_getspecific(match_data_key); > > + } > > Since match_data_key_initialized is checked without holding key_mutex, can't the > above code race with the below code in match_data_thread_free? > > __pthread_mutex_lock(&key_mutex); > if (--match_data_key_initialized == 1) { > __selinux_key_delete(match_data_key); > match_data_key_initialized = -1; > } > __pthread_mutex_unlock(&key_mutex); > > Perhaps deleting the pthread_key is just not something that should be done at > all? This patch appears to have been applied to AOSP here: https://android-review.googlesource.com/c/platform/external/selinux/+/2411194 but I didn't see any reply to Eric's comments above. Is this still being worked on?