Re: Making the tests ~2.5x faster (was: [PATCH v3] test-lib.sh: Use GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34)

Phillip Wood <phillip.wood123@xxxxxxxxx> · Tue, 5 Apr 2022 14:36:33 +0100

On 05/04/2022 11:03, Ævar Arnfjörð Bjarmason wrote:

On Mon, Apr 04 2022, Phillip Wood wrote:

On 04/03/2022 13:37, Elia Pinto wrote:
In glibc >= 2.34 MALLOC_CHECK_ and MALLOC_PERTURB_ environment
variables have been replaced by GLIBC_TUNABLES.  Also the new
glibc requires that you preload a library called libc_malloc_debug.so
to get these features.
Using the ordinary glibc system variable detect if this is glibc >=
2.34 and
use GLIBC_TUNABLES and the new library.
This patch was inspired by a Richard W.M. Jones ndbkit patch
Helped-by: Junio C Hamano <gitster@xxxxxxxxx>
Signed-off-by: Elia Pinto <gitter.spiros@xxxxxxxxx>
---
This is the third version of the patch.
Compared to the second version[1], the code is further simplified,
eliminating a case statement and modifying a string statement.
[1] https://www.spinics.net/lists/git/msg433917.html
   t/test-lib.sh | 16 ++++++++++++++++
   1 file changed, 16 insertions(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 9af5fb7674..4d10646015 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -550,9 +550,25 @@ else
   	setup_malloc_check () {
   		MALLOC_CHECK_=3	MALLOC_PERTURB_=165
   		export MALLOC_CHECK_ MALLOC_PERTURB_
+		if _GLIBC_VERSION=$(getconf GNU_LIBC_VERSION 2>/dev/null) &&
+		_GLIBC_VERSION=${_GLIBC_VERSION#"glibc "} &&
+		expr 2.34 \<= "$_GLIBC_VERSION" >/dev/null
+		then
+			g=
+			LD_PRELOAD="libc_malloc_debug.so.0"

When compiling with "SANITIZE = address,leak" this use of LD_PRELOAD
makes the tests fail with

==9750==ASan runtime does not come first in initial library list; you
should either link runtime to your application or manually preload it
with LD_PRELOAD.

because libc_malloc_debug.so is being loaded before libasan.so. If I
set TEST_NO_MALLOC_CHECK=1 when I run the tests then ASAN does not
complain but it would be nicer if I did not have to do that. I'm
confused as to why the CI leak tests are running fine - am I missing
something with my setup?

Perhaps they have an older glibc? They're on Ubunt, and e.g. my Debian
version is on 2.33.

Good point, I'd not realized quite how new glibc 2.34 was

But more generally, I'd somehow managed to not notice for all my time in
hacking on git (including on SANITIZE=leak, another tracing mode!) that
this check was being enabled *by default*, which could have saved me
some time waiting for tests...:
	
	$ git hyperfine -L rev HEAD~0 -L off yes, -s 'make CFLAGS=-O3' '(cd t && TEST_NO_MALLOC_CHECK={off} ./t3070-wildmatch.sh)' --warmup 1 -r 3
	Benchmark 1: (cd t && TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh)' in 'HEAD~0
	  Time (mean ± σ):      4.191 s ±  0.012 s    [User: 3.600 s, System: 0.746 s]
	  Range (min … max):    4.181 s …  4.204 s    3 runs
	
	Benchmark 2: (cd t && TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh)' in 'HEAD~0
	  Time (mean ± σ):      5.945 s ±  0.101 s    [User: 4.989 s, System: 1.146 s]
	  Range (min … max):    5.878 s …  6.062 s    3 runs
	
	Summary
	  '(cd t && TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh)' in 'HEAD~0' ran
	    1.42 ± 0.02 times faster than '(cd t && TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh)' in 'HEAD~0'

I.e. I get that it's catching actual issues, but I was also doing runs
with SANITIZE=address, which I believe are going to catch a superset of
issues that this check does, so...

I assumed SANITIZE=address would catch a superset of issues as well but 
I haven't actually checked the glibc tunables documentation. We disable 
MALLOC_PERTURB_ when running under valgrind so perhaps we should do the 
same when compiling with SANITIZE=address.

I just noticed that setup_malloc_check() is called by 
test_expect_success() and test_when_finished() so it really should be 
caching the result of the check rather than forking getconf and expr 
each time it is called. Overwriting LD_PRELOAD is not very friendly 
either, it would be better if it appended the debug library if the 
variable is already set.

Whatever we do with this narrow patch it would be a really nice
improvement if the test-lib.sh could fold all of these
"instrumentations" behind a single flag, and that both it and "make
test" would make it clear that you're testing in a slower "tracing" or
"instrumentation" mode.

Ditto things like chain lint and the bin-wrappers, e.g.:

I sometimes wish there was a way to only chain lint the tests that have 
changed since the last run.

     $ git hyperfine -L rev HEAD~0 -L off yes, -L cl 0,1 -L nbw --no-bin-wrappers, -s 'make CFLAGS=-O3' '(cd t && GIT_TEST_CHAIN_LINT={cl} TEST_NO_MALLOC_CHECK={off} ./t3070-wildmatch.sh {nbw})' -r 1
     [...]	
	Summary
	  '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0' ran
	    1.23 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh )' in 'HEAD~0'
	    1.30 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0'
	    1.54 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh )' in 'HEAD~0'
	    1.63 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0'
	    1.87 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh )' in 'HEAD~0'
	    1.92 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0'
	    2.24 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh )' in 'HEAD~0'

I.e. between this, chain lint and bin wrappers we're coming up on our
tests running almost 3x as slow as they otherwise could *by default*.

But right now knowing which things you need to chase around to turn off
if you're just looking to test the semantics of your code without all
this instrumentation is a matter of archane knowledge, I'm not even sure
I remembered all the major ones (I didn't know about this one until
today).

That is quite a difference in run time - I wonder how much scope there 
is for optimizing some of these features like the chain-lint vs 
disabling them completely.

Best Wishes

Phillip