On 10.08.19 at 05:05, Carlo Arenas wrote:
> in macOS (obviously testing without NED) the following is the output
> of (a hacked version) of p7801 for maint (against chromium's
> repository), with René's patch on top

Do you mean p7820?  And what did you change?  Looking at the results, you
removed basic and extended from the list of regex engines, right?

Ugh, cloning https://chromium.googlesource.com/chromium/src.git sends
more than 16GB across the wire.  Is that even the right repository?  Not
sure if my machine can keep the relevant parts cached while grepping --
I/O times could drown out any difference due to context allocation and
memory allocator choice.  Let's see...

>
> Test                                      HEAD^              HEAD
> ------------------------------------------------------------------------------------
> 7820.1: perl grep 'how.to'                0.51(0.35+1.11)    0.48(0.33+1.16) -5.9%
> 7820.2: perl grep '^how to'               0.47(0.33+1.08)    0.45(0.34+1.11) -4.3%
> 7820.3: perl grep '[how] to'              0.49(0.40+1.11)    0.53(0.41+1.13) +8.2%
> 7820.4: perl grep '(e.t[^ ]*|v.ry) rare'  68.72(68.77+1.14)  72.10(72.15+1.20) +4.9%
> 7820.5: perl grep 'm(ú|u)lt.b(æ|y)te'     0.48(0.35+1.12)    0.50(0.40+1.23) +4.2%
>
> and this is with my squashed[2] changed on top of that :
>
> Test                                      HEAD^              HEAD
> ------------------------------------------------------------------------------------
> 7820.1: perl grep 'how.to'                0.48(0.36+1.16)    0.46(0.33+1.09) -4.2%
> 7820.2: perl grep '^how to'               0.45(0.34+1.12)    0.42(0.29+0.99) -6.7%
> 7820.3: perl grep '[how] to'              0.48(0.40+1.13)    0.52(0.43+1.16) +8.3%
> 7820.4: perl grep '(e.t[^ ]*|v.ry) rare'  69.12(69.10+1.07)  69.19(69.19+1.18) +0.1%
> 7820.5: perl grep 'm(ú|u)lt.b(æ|y)te'     0.49(0.38+1.17)    0.46(0.35+1.13) -6.1%
>
> the degenerate case is not something we can't fix anyway, since it is
> likely a locking issue inside PCRE2 (I see at most 1 CPU doing work),
> and the numbers are noisy because of the other problems I mentioned
> before (hardcoded to 8 threads, running in a laptop with low number of
> cores), which is why testing for performance regressions in windows is
> strongly encouraged, regardless
>
> Carlo
>
> [1] https://public-inbox.org/git/CAPUEspgH1v1zo7smzQWCV4rX9pKVKLV84gDSfCPdT7LffQxUWw@xxxxxxxxxxxxxx/
> [2] https://public-inbox.org/git/20190810030315.7519-1-carenas@xxxxxxxxx/
>

So I pointed GIT_PERF_LARGE_REPO to the monster mentioned above, ran the
test once for warmup and here are the results of the second run:

Test                                           origin/master      pcre2-xmalloc            pcre2-xmalloc+nedmalloc
---------------------------------------------------------------------------------------------------------------------
7820.1: basic grep 'how.to'                    1.59(2.93+1.75)    1.60(3.04+1.74) +0.6%    1.64(2.87+1.90) +3.1%
7820.2: extended grep 'how.to'                 1.59(2.98+1.66)    1.55(2.83+1.76) -2.5%    1.67(3.15+1.70) +5.0%
7820.3: perl grep 'how.to'                     1.25(1.21+2.13)    1.25(1.24+2.08) +0.0%    1.29(1.32+2.08) +3.2%
7820.5: basic grep '^how to'                   1.52(2.82+1.66)    1.51(2.68+1.77) -0.7%    1.64(3.07+1.69) +7.9%
7820.6: extended grep '^how to'                1.57(2.84+1.75)    1.51(2.76+1.73) -3.8%    1.61(2.95+1.75) +2.5%
7820.7: perl grep '^how to'                    1.21(1.15+2.10)    1.22(1.26+1.98) +0.8%    1.27(1.22+2.09) +5.0%
7820.9: basic grep '[how] to'                  1.95(4.51+1.68)    1.96(4.48+1.69) +0.5%    2.00(4.66+1.65) +2.6%
7820.10: extended grep '[how] to'              1.96(4.54+1.65)    1.94(4.46+1.70) -1.0%    2.04(4.78+1.65) +4.1%
7820.11: perl grep '[how] to'                  1.29(1.58+1.88)    1.28(1.50+1.94) -0.8%    1.34(1.51+2.06) +3.9%
7820.13: basic grep '\(e.t[^ ]*\|v.ry\) rare'  8.17(13.18+1.50)   8.29(13.36+1.37) +1.5%   8.31(13.33+1.60) +1.7%
7820.14: extended grep '(e.t[^ ]*|v.ry) rare'  8.13(13.03+1.59)   8.14(13.12+1.47) +0.1%   8.30(13.35+1.56) +2.1%
7820.15: perl grep '(e.t[^ ]*|v.ry) rare'      34.96(35.80+1.68)  34.99(35.60+1.91) +0.1%  35.18(35.83+1.90) +0.6%
7820.17: basic grep 'm\(ú\|u\)lt.b\(æ\|y\)te'  1.57(3.03+1.64)    1.53(2.76+1.75) -2.5%    1.60(2.89+1.77) +1.9%
7820.18: extended grep 'm(ú|u)lt.b(æ|y)te'     1.52(2.83+1.69)    1.52(2.89+1.63) +0.0%    1.58(2.80+1.84) +3.9%
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'         1.20(1.25+2.02)    1.21(1.30+1.96) +0.8%    1.25(1.22+2.11) +4.2%

pcre2-xmalloc is my patch on top of master; +nedmalloc adds the warning
fixes I sent earlier and sets USE_NED_MALLOC.

I don't understand why my performance is lower than yours by a factor of
2.5 for all perl regex tests except 7820.15 (your 7820.4), where my system
is two times faster.  Debian Testing, GCC 9.1.0, i5-9600K, 16GB RAM.

Anyway, nedmalloc is slower across the board, but the impact of my patch
is in the noise.  Right?

René
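
For anyone who wants to double-check the arithmetic behind the percentages and the "factor 2.5" remark, here is a small sketch in Python. It assumes each cell has the form elapsed(user+sys) and that the percentage compares elapsed times only; the helper names `elapsed` and `pct_change` are made up for illustration and are not part of t/perf. The figures are copied from the tables above (Carlo's HEAD column from his first table, my pcre2-xmalloc column).

```python
import re


def elapsed(cell: str) -> float:
    """Extract the leading elapsed-seconds value from a cell such as
    '1.59(2.93+1.75)'. Assumes the elapsed(user+sys) layout."""
    m = re.match(r"\s*(\d+(?:\.\d+)?)\(", cell)
    if m is None:
        raise ValueError(f"unrecognized perf cell: {cell!r}")
    return float(m.group(1))


def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# Test 7820.1 (basic grep 'how.to'): origin/master vs. +nedmalloc.
nedmalloc_delta = pct_change(elapsed("1.59(2.93+1.75)"),
                             elapsed("1.64(2.87+1.90)"))

# Perl-engine rows with the patch applied on both machines.
# Keys are the patterns; values are (Carlo's HEAD, René's pcre2-xmalloc).
perl_rows = {
    "how.to": ("0.48(0.33+1.16)", "1.25(1.24+2.08)"),
    "^how to": ("0.45(0.34+1.11)", "1.22(1.26+1.98)"),
    "[how] to": ("0.53(0.41+1.13)", "1.28(1.50+1.94)"),
    "(e.t[^ ]*|v.ry) rare": ("72.10(72.15+1.20)", "34.99(35.60+1.91)"),
    "m(ú|u)lt.b(æ|y)te": ("0.50(0.40+1.23)", "1.21(1.30+1.96)"),
}

# Ratio > 1 means René's machine took longer than Carlo's for that pattern.
ratios = {pattern: elapsed(rene) / elapsed(carlo)
          for pattern, (carlo, rene) in perl_rows.items()}

if __name__ == "__main__":
    print(f"nedmalloc vs. master on 7820.1: {nedmalloc_delta:+.1f}%")
    for pattern, ratio in ratios.items():
        print(f"{pattern!r}: René/Carlo elapsed ratio = {ratio:.2f}")
```

The ratios only compare wall-clock times between two different machines and repositories states, so they say nothing about the cause of the gap.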