Re: Making the tests ~2.5x faster (was: [PATCH v3] test-lib.sh: Use GLIBC_TUNABLES instead of MALLOC_CHECK_ on glibc >= 2.34)

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Tue, 05 Apr 2022 21:59:53 +0200

On Tue, Apr 05 2022, Phillip Wood wrote:

> On 05/04/2022 11:03, Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Apr 04 2022, Phillip Wood wrote:
>> 
>>> On 04/03/2022 13:37, Elia Pinto wrote:
>>>> In glibc >= 2.34 MALLOC_CHECK_ and MALLOC_PERTURB_ environment
>>>> variables have been replaced by GLIBC_TUNABLES.  Also the new
>>>> glibc requires that you preload a library called libc_malloc_debug.so
>>>> to get these features.
>>>> Using the ordinary glibc system variable detect if this is glibc >=
>>>> 2.34 and
>>>> use GLIBC_TUNABLES and the new library.
>>>> This patch was inspired by a Richard W.M. Jones ndbkit patch
>>>> Helped-by: Junio C Hamano <gitster@xxxxxxxxx>
>>>> Signed-off-by: Elia Pinto <gitter.spiros@xxxxxxxxx>
>>>> ---
>>>> This is the third version of the patch.
>>>> Compared to the second version[1], the code is further simplified,
>>>> eliminating a case statement and modifying a string statement.
>>>> [1] https://www.spinics.net/lists/git/msg433917.html
>>>>    t/test-lib.sh | 16 ++++++++++++++++
>>>>    1 file changed, 16 insertions(+)
>>>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>>>> index 9af5fb7674..4d10646015 100644
>>>> --- a/t/test-lib.sh
>>>> +++ b/t/test-lib.sh
>>>> @@ -550,9 +550,25 @@ else
>>>>    	setup_malloc_check () {
>>>>    		MALLOC_CHECK_=3	MALLOC_PERTURB_=165
>>>>    		export MALLOC_CHECK_ MALLOC_PERTURB_
>>>> +		if _GLIBC_VERSION=$(getconf GNU_LIBC_VERSION 2>/dev/null) &&
>>>> +		_GLIBC_VERSION=${_GLIBC_VERSION#"glibc "} &&
>>>> +		expr 2.34 \<= "$_GLIBC_VERSION" >/dev/null
>>>> +		then
>>>> +			g=
>>>> +			LD_PRELOAD="libc_malloc_debug.so.0"
>>>
>>> When compiling with "SANITIZE = address,leak" this use of LD_PRELOAD
>>> makes the tests fail with
>>>
>>> ==9750==ASan runtime does not come first in initial library list; you
>>> should either link runtime to your application or manually preload it
>>> with LD_PRELOAD.
>>>
>>> because libc_malloc_debug.so is being loaded before libasan.so. If I
>>> set TEST_NO_MALLOC_CHECK=1 when I run the tests then ASAN does not
>>> complain but it would be nicer if I did not have to do that. I'm
>>> confused as to why the CI leak tests are running fine - am I missing
>>> something with my setup?
>> Perhaps they have an older glibc? They're on Ubunt, and e.g. my
>> Debian
>> version is on 2.33.
>
> Good point, I'd not realized quite how new glibc 2.34 was
>
>> But more generally, I'd somehow managed to not notice for all my time in
>> hacking on git (including on SANITIZE=leak, another tracing mode!) that
>> this check was being enabled *by default*, which could have saved me
>> some time waiting for tests...:
>> 	
>> 	$ git hyperfine -L rev HEAD~0 -L off yes, -s 'make CFLAGS=-O3' '(cd t && TEST_NO_MALLOC_CHECK={off} ./t3070-wildmatch.sh)' --warmup 1 -r 3
>> 	Benchmark 1: (cd t && TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh)' in 'HEAD~0
>> 	  Time (mean ± σ):      4.191 s ±  0.012 s    [User: 3.600 s, System: 0.746 s]
>> 	  Range (min … max):    4.181 s …  4.204 s    3 runs
>> 	
>> 	Benchmark 2: (cd t && TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh)' in 'HEAD~0
>> 	  Time (mean ± σ):      5.945 s ±  0.101 s    [User: 4.989 s, System: 1.146 s]
>> 	  Range (min … max):    5.878 s …  6.062 s    3 runs
>> 	
>> 	Summary
>> 	  '(cd t && TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh)' in 'HEAD~0' ran
>> 	    1.42 ± 0.02 times faster than '(cd t && TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh)' in 'HEAD~0'
>> I.e. I get that it's catching actual issues, but I was also doing
>> runs
>> with SANITIZE=address, which I believe are going to catch a superset of
>> issues that this check does, so...
>
> I assumed SANITIZE=address would catch a superset of issues as well
> but I haven't actually checked the glibc tunables documentation. We
> disable MALLOC_PERTURB_ when running under valgrind so perhaps we
> should do the same when compiling with SANITIZE=address.

I'm not sure either, but given how exhaustive SANITIZE=address is I'd be
surprised if not.

> I just noticed that setup_malloc_check() is called by
> test_expect_success() and test_when_finished() so it really should be 
> caching the result of the check rather than forking getconf and expr
> each time it is called. Overwriting LD_PRELOAD is not very friendly 
> either, it would be better if it appended the debug library if the
> variable is already set.

We really should just be checking this when building, we even have C
code already that detects glibc and its version, I have some local (but
semi-unrelated) patches. Anyway...

>> Whatever we do with this narrow patch it would be a really nice
>> improvement if the test-lib.sh could fold all of these
>> "instrumentations" behind a single flag, and that both it and "make
>> test" would make it clear that you're testing in a slower "tracing" or
>> "instrumentation" mode.
>> Ditto things like chain lint and the bin-wrappers, e.g.:
>
> I sometimes wish there was a way to only chain lint the tests that
> have changed since the last run.

Mm, perhaps some make-based solution... :)

I had some experiments to go even further, and have "make test" only run
the tests relevant to the code that just changed, which with trace2's
filenames and GCC/clang's -MF option you can bridge that gap.

>>      $ git hyperfine -L rev HEAD~0 -L off yes, -L cl 0,1 -L nbw --no-bin-wrappers, -s 'make CFLAGS=-O3' '(cd t && GIT_TEST_CHAIN_LINT={cl} TEST_NO_MALLOC_CHECK={off} ./t3070-wildmatch.sh {nbw})' -r 1
>>      [...]	
>> 	Summary
>> 	  '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0' ran
>> 	    1.23 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh )' in 'HEAD~0'
>> 	    1.30 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0'
>> 	    1.54 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh )' in 'HEAD~0'
>> 	    1.63 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0'
>> 	    1.87 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh )' in 'HEAD~0'
>> 	    1.92 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0'
>> 	    2.24 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh )' in 'HEAD~0'
>> I.e. between this, chain lint and bin wrappers we're coming up on
>> our
>> tests running almost 3x as slow as they otherwise could *by default*.
>> But right now knowing which things you need to chase around to turn
>> off
>> if you're just looking to test the semantics of your code without all
>> this instrumentation is a matter of archane knowledge, I'm not even sure
>> I remembered all the major ones (I didn't know about this one until
>> today).
>
> That is quite a difference in run time - I wonder how much scope there
> is for optimizing some of these features like the chain-lint vs 
> disabling them completely.

Per my series at
https://lore.kernel.org/git/cover-v2-00.25-00000000000-20220325T182534Z-avarab@xxxxxxxxx/
I'd much rather see us go in the direction of mainly piggy-backing on CI
for such extended testing, and just having easy to use targets for "do
exhaustive tests please".

E.g. so you could run "make test-like-ci" or whatever, and it would do
all the N permutations we do in CI locally.