Re: [PATCH 0/2] perf: show that wildmatch() regressed for pathological cases in v2.0

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/24, Ævar Arnfjörð Bjarmason wrote:
> Russ Cox just published an article about how various glob()
> implementations suffer from pathological performance when fed certain
> pathological patterns like "a*a*a*a*b" given a file like "aaaaaaa...":
> https://research.swtch.com/glob
> 
> I was curious to see if this impacted git. It turns out it does. This
> used to be a per-platform issue with git, since globbing was
> implemented via fnmatch() by default before v1.8.4, and support for
> using the OS fnmatch() was removed entirely in v2.0.0.
> 
> This performance test shows the regression:
> 
>     $ GIT_PERF_REPEAT_COUNT=1 GIT_PERF_MAKE_OPTS="[...]NO_WILDMATCH=YesPlease[...]" ./run v1.9.5 v2.0.0 v2.12.0 p0100-globbing.sh
>     [...]
>     Test                                                       v1.9.5            v2.0.0                    v2.12.0
>     ------------------------------------------------------------------------------------------------------------------------------
>     [...]
>     0100.7: fileglob((a*)^nb) against file (a^100).t; n = 1    0.01(0.00+0.00)   0.00(0.00+0.00) -100.0%   0.01(0.00+0.00) +0.0%
>     0100.8: fileglob((a*)^nb) against file (a^100).t; n = 2    0.01(0.00+0.00)   0.00(0.00+0.00) -100.0%   0.01(0.00+0.00) +0.0%
>     0100.9: fileglob((a*)^nb) against file (a^100).t; n = 3    0.00(0.00+0.00)   0.00(0.00+0.00) =         0.01(0.00+0.00) +inf
>     0100.10: fileglob((a*)^nb) against file (a^100).t; n = 4   0.00(0.00+0.00)   0.01(0.01+0.00) +inf      0.02(0.02+0.00) +inf
>     0100.11: fileglob((a*)^nb) against file (a^100).t; n = 5   0.00(0.00+0.00)   0.20(0.19+0.00) +inf      0.24(0.23+0.00) +inf
>     0100.12: fileglob((a*)^nb) against file (a^100).t; n = 6   0.00(0.00+0.00)   3.03(3.00+0.00) +inf      3.08(3.05+0.00) +inf
> 
> And here's a one-liner to do the same:
> 
>     $ time (rm -rf test; git init -q test && (cd test && touch $(perl -e 'print "a" x 100').t && git add a* && git commit -q -m"test" && git ls-files 'a*a*a*a*a*a*a*b'))
> 
> Add or remove "a*"'s to adjust the runtime. With 6 that executes in 3
> seconds on my system, 40 seconds with 7 etc.
> 
> I don't think this is something we need to worry much about, if you
> have a file like this and feed Git insane patterns you probably
> deserve what you get.
> 
> The real concern is if we have behavior like this and ever e.g. expose
> globbing over the network, e.g. in some future upload-pack protocol.
> 
> There are probably some web-based programs built on top of git that
> are vulnerable to DoS attacks as a result of this, e.g. if they take
> user-supplied globs and feed them to ls-files.

I was taking a look at wildmatch a few months ago and have an unfinished
patch to do some cleanup there.  I noticed this was inefficient but
didn't expect those kinds of numbers.  I wonder how difficult it would
be to rewrite it so that we don't have this issue.

-- 
Brandon Williams



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]