On Tue, May 9, 2017 at 12:40 PM, Johannes Schindelin
<Johannes.Schindelin@xxxxxx> wrote:
> Hi,
>
> On Tue, 9 May 2017, brian m. carlson wrote:
>
>> On Tue, May 09, 2017 at 02:00:18AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> > On Tue, May 9, 2017 at 1:32 AM, brian m. carlson
>> > <sandals@xxxxxxxxxxxxxxxxxxxx> wrote:
>> > > PCRE and PCRE2 also tend to have a lot of security updates, so I
>> > > would prefer if we didn't import them into the tree. It is far
>> > > better for users to use their distro's packages for PCRE, as it
>> > > means they get automatic security updates even if they're using an
>> > > old Git.
>> > >
>> > > We shouldn't consider shipping anything with a remotely frequent
>> > > history of security updates in our tree, since people very
>> > > frequently run old or ancient versions of Git.
>> >
>> > I'm aware of its security record[1], but I wonder what threat model
>> > you have in mind here. I'm not aware of any parts of git (except maybe
>> > gitweb?) where we take regexes from untrusted sources.
>> >
>> > I.e. yes there have been DoS's & even some overflow bugs leading code
>> > execution in PCRE, but in the context of powering git-grep & git-log
>> > with PCRE this falls into the "stop hitting yourself" category.
>>
>> Just because you don't drive Git with untrusted regexes doesn't mean
>> other people don't.
>
> Or other applications.
>
>> It's not a good idea to require a stronger security model than we
>> absolutely have to, since people can and will violate it. Think how
>> devastating Shellshock was even though technically nobody should provide
>> insecure environment variables to the shell.
>>
>> And, yes, gitweb does in fact call git grep. That means that git grep
>> must in fact be secure against untrusted regexes, or you have a remote
>> code execution vulnerability.
>
> And not only grep is affected. Think HEAD^{/<regex>}. There are plenty of
> sites where you are allowed to specify revs in a freer form than SHA-1s.

That will still use reg(comp|exec) for the foreseeable future. We have
plenty of manual uses of that all over the place:

    $ git grep -E 'reg(comp|exec)\(' *.[ch] builtin/*.[ch]

And the ^{/rx} feature is powered by the one in sha1_name.c.

> Having said that, I do like the prospect of a faster git grep.
>
> Hopefully there will be a way to make use of PCRE that can be switched
> off? Like, a compile-time replacement of the regex API backed by PCRE v2
> *iff* PCRE v2 is used for building?

Yup, see my just-sent
<CACBZZX6V8qbnrZAdhRvPthy5Z91iEG8rrJ=Sf9tdkOt52M9j1Q@xxxxxxxxxxxxxx>.
It'll be optional for now, as it's been for a while.

Aside from that, I do think that given these numbers it's worth
considering making PCRE a default dependency, and possibly getting rid
of stuff like kwset, because a) it collapses the many codepaths we have
now for fixed/basic/extended/pcre matching into one, and b) since the
numbers suggest PCRE can support all of that faster, that seems like a
sensible thing to do.

But anything like that will be a few patch series down the road; for
now I'm just making it all optional.
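
To make the "compile-time replacement of the regex API" idea concrete,
here is a minimal sketch of one way such a switch could look. It is not
code from git.git or from the patch series: the build flag
SKETCH_USE_PCRE2 and the helper sketch_match() are hypothetical names
made up for illustration. Without the flag it goes through POSIX
regcomp()/regexec() (extended syntax); with the flag it goes through the
PCRE2 8-bit API, which would additionally need linking against libpcre2-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef SKETCH_USE_PCRE2		/* hypothetical build flag, not git's */
#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
#else
#include <regex.h>
#endif

/*
 * Hypothetical helper: returns 1 if `text` matches `pattern`, 0 if it
 * does not, and -1 if the pattern fails to compile or matching errors out.
 */
int sketch_match(const char *pattern, const char *text)
{
	assert(pattern && text);
#ifdef SKETCH_USE_PCRE2
	int errcode;
	PCRE2_SIZE erroffset;
	pcre2_code *re = pcre2_compile((PCRE2_SPTR)pattern,
				       PCRE2_ZERO_TERMINATED, 0,
				       &errcode, &erroffset, NULL);
	if (!re)
		return -1;

	pcre2_match_data *md = pcre2_match_data_create_from_pattern(re, NULL);
	if (!md) {
		pcre2_code_free(re);
		return -1;
	}

	int rc = pcre2_match(re, (PCRE2_SPTR)text, strlen(text),
			     0, 0, md, NULL);
	pcre2_match_data_free(md);
	pcre2_code_free(re);

	if (rc >= 0)
		return 1;
	return rc == PCRE2_ERROR_NOMATCH ? 0 : -1;
#else
	regex_t re;

	/* REG_NOSUB: we only need a yes/no answer, not match offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;

	int rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);

	if (rc == 0)
		return 1;
	return rc == REG_NOMATCH ? 0 : -1;
#endif
}
```

Callers only ever see sketch_match(), so which engine is used is decided
entirely at build time. Note that the two engines do not accept exactly
the same syntax (POSIX extended vs. Perl-compatible), so a real
replacement would have to deal with that difference; this sketch ignores it.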