On Tue, Apr 05 2022, Phillip Wood wrote: > On 05/04/2022 11:03, Ævar Arnfjörð Bjarmason wrote: >> On Mon, Apr 04 2022, Phillip Wood wrote: >> >>> On 04/03/2022 13:37, Elia Pinto wrote: >>>> In glibc >= 2.34 MALLOC_CHECK_ and MALLOC_PERTURB_ environment >>>> variables have been replaced by GLIBC_TUNABLES. Also the new >>>> glibc requires that you preload a library called libc_malloc_debug.so >>>> to get these features. >>>> Using the ordinary glibc system variable detect if this is glibc >= >>>> 2.34 and >>>> use GLIBC_TUNABLES and the new library. >>>> This patch was inspired by a Richard W.M. Jones ndbkit patch >>>> Helped-by: Junio C Hamano <gitster@xxxxxxxxx> >>>> Signed-off-by: Elia Pinto <gitter.spiros@xxxxxxxxx> >>>> --- >>>> This is the third version of the patch. >>>> Compared to the second version[1], the code is further simplified, >>>> eliminating a case statement and modifying a string statement. >>>> [1] https://www.spinics.net/lists/git/msg433917.html >>>> t/test-lib.sh | 16 ++++++++++++++++ >>>> 1 file changed, 16 insertions(+) >>>> diff --git a/t/test-lib.sh b/t/test-lib.sh >>>> index 9af5fb7674..4d10646015 100644 >>>> --- a/t/test-lib.sh >>>> +++ b/t/test-lib.sh >>>> @@ -550,9 +550,25 @@ else >>>> setup_malloc_check () { >>>> MALLOC_CHECK_=3 MALLOC_PERTURB_=165 >>>> export MALLOC_CHECK_ MALLOC_PERTURB_ >>>> + if _GLIBC_VERSION=$(getconf GNU_LIBC_VERSION 2>/dev/null) && >>>> + _GLIBC_VERSION=${_GLIBC_VERSION#"glibc "} && >>>> + expr 2.34 \<= "$_GLIBC_VERSION" >/dev/null >>>> + then >>>> + g= >>>> + LD_PRELOAD="libc_malloc_debug.so.0" >>> >>> When compiling with "SANITIZE = address,leak" this use of LD_PRELOAD >>> makes the tests fail with >>> >>> ==9750==ASan runtime does not come first in initial library list; you >>> should either link runtime to your application or manually preload it >>> with LD_PRELOAD. >>> >>> because libc_malloc_debug.so is being loaded before libasan.so. If I >>> set TEST_NO_MALLOC_CHECK=1 when I run the tests then ASAN does not >>> complain but it would be nicer if I did not have to do that. I'm >>> confused as to why the CI leak tests are running fine - am I missing >>> something with my setup? >> Perhaps they have an older glibc? They're on Ubunt, and e.g. my >> Debian >> version is on 2.33. > > Good point, I'd not realized quite how new glibc 2.34 was > >> But more generally, I'd somehow managed to not notice for all my time in >> hacking on git (including on SANITIZE=leak, another tracing mode!) that >> this check was being enabled *by default*, which could have saved me >> some time waiting for tests...: >> >> $ git hyperfine -L rev HEAD~0 -L off yes, -s 'make CFLAGS=-O3' '(cd t && TEST_NO_MALLOC_CHECK={off} ./t3070-wildmatch.sh)' --warmup 1 -r 3 >> Benchmark 1: (cd t && TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh)' in 'HEAD~0 >> Time (mean ± σ): 4.191 s ± 0.012 s [User: 3.600 s, System: 0.746 s] >> Range (min … max): 4.181 s … 4.204 s 3 runs >> >> Benchmark 2: (cd t && TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh)' in 'HEAD~0 >> Time (mean ± σ): 5.945 s ± 0.101 s [User: 4.989 s, System: 1.146 s] >> Range (min … max): 5.878 s … 6.062 s 3 runs >> >> Summary >> '(cd t && TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh)' in 'HEAD~0' ran >> 1.42 ± 0.02 times faster than '(cd t && TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh)' in 'HEAD~0' >> I.e. I get that it's catching actual issues, but I was also doing >> runs >> with SANITIZE=address, which I believe are going to catch a superset of >> issues that this check does, so... > > I assumed SANITIZE=address would catch a superset of issues as well > but I haven't actually checked the glibc tunables documentation. We > disable MALLOC_PERTURB_ when running under valgrind so perhaps we > should do the same when compiling with SANITIZE=address. I'm not sure either, but given how exhaustive SANITIZE=address is I'd be surprised if not. > I just noticed that setup_malloc_check() is called by > test_expect_success() and test_when_finished() so it really should be > caching the result of the check rather than forking getconf and expr > each time it is called. Overwriting LD_PRELOAD is not very friendly > either, it would be better if it appended the debug library if the > variable is already set. We really should just be checking this when building, we even have C code already that detects glibc and its version, I have some local (but semi-unrelated) patches. Anyway... >> Whatever we do with this narrow patch it would be a really nice >> improvement if the test-lib.sh could fold all of these >> "instrumentations" behind a single flag, and that both it and "make >> test" would make it clear that you're testing in a slower "tracing" or >> "instrumentation" mode. >> Ditto things like chain lint and the bin-wrappers, e.g.: > > I sometimes wish there was a way to only chain lint the tests that > have changed since the last run. Mm, perhaps some make-based solution... :) I had some experiments to go even further, and have "make test" only run the tests relevant to the code that just changed, which with trace2's filenames and GCC/clang's -MF option you can bridge that gap. >> $ git hyperfine -L rev HEAD~0 -L off yes, -L cl 0,1 -L nbw --no-bin-wrappers, -s 'make CFLAGS=-O3' '(cd t && GIT_TEST_CHAIN_LINT={cl} TEST_NO_MALLOC_CHECK={off} ./t3070-wildmatch.sh {nbw})' -r 1 >> [...] >> Summary >> '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0' ran >> 1.23 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh )' in 'HEAD~0' >> 1.30 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0' >> 1.54 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK=yes ./t3070-wildmatch.sh )' in 'HEAD~0' >> 1.63 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0' >> 1.87 times faster than '(cd t && GIT_TEST_CHAIN_LINT=0 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh )' in 'HEAD~0' >> 1.92 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh --no-bin-wrappers)' in 'HEAD~0' >> 2.24 times faster than '(cd t && GIT_TEST_CHAIN_LINT=1 TEST_NO_MALLOC_CHECK= ./t3070-wildmatch.sh )' in 'HEAD~0' >> I.e. between this, chain lint and bin wrappers we're coming up on >> our >> tests running almost 3x as slow as they otherwise could *by default*. >> But right now knowing which things you need to chase around to turn >> off >> if you're just looking to test the semantics of your code without all >> this instrumentation is a matter of archane knowledge, I'm not even sure >> I remembered all the major ones (I didn't know about this one until >> today). > > That is quite a difference in run time - I wonder how much scope there > is for optimizing some of these features like the chain-lint vs > disabling them completely. Per my series at https://lore.kernel.org/git/cover-v2-00.25-00000000000-20220325T182534Z-avarab@xxxxxxxxx/ I'd much rather see us go in the direction of mainly piggy-backing on CI for such extended testing, and just having easy to use targets for "do exhaustive tests please". E.g. so you could run "make test-like-ci" or whatever, and it would do all the N permutations we do in CI locally.