Re: pcre performance, was Re: git log filtering

> Result: external grep wins hands-down. GNU regex loses hands-down. pcre 
> seems to be better than glibc's regex engine, and gains ever so slightly 
> when using NO_MMAP.

Indeed GNU regex 0.12 loses, and that's why it was rewritten for (IIRC)
glibc 2.3.  Older glibc versions use code derived from GNU regex 0.12,
but the old GNU regex code is dead in general (maybe it survives in
Emacs -- but I don't remember), and the current glibc regex code can be
used by external programs via gnulib.

glibc's regex is slower than PCRE mostly because it is internationalized.
So for example it supports things like stra[[.ss.]]e matching both
strasse and straße in a German locale, or [[=a=]] matching aàáäâ and
possibly more variations.  In theory.  In practice I couldn't make it
work while writing this message...
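
To make the intent of those constructs concrete, here is a rough Python
sketch of the matching behaviour they are meant to provide.  It is an
emulation only: Python's re module does not support [[=a=]] or [[.ss.]],
and this is not how glibc implements them.  The helper names fold and
folded_search are made up for the illustration.

```python
import re
import unicodedata

def fold(text):
    """Casefold (which maps ß to ss), then drop combining accents."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def folded_search(pattern, text):
    """Search a plain ASCII pattern against the folded text."""
    return re.search(pattern, fold(text)) is not None
```

With this, folded_search("strasse", "Die Straße") succeeds, and every
character of "aàáäâ" folds to a plain "a".  Unlike a real equivalence
class, the folding here does not depend on the locale.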

External grep wins hands-down because it's a DFA engine.  If the regex
uses backreferences (or the above esoteric constructs), however, external
grep will not be able to give a definite answer using the fast engine,
and will fall back to glibc regex.
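
As a minimal sketch of the difference, the Python below hand-builds a DFA
for the pattern ab*c (an arbitrary example, not one from the benchmark)
and contrasts it with a backreference pattern.  The table and function
names are hypothetical, and this is not grep's actual implementation.

```python
import re

# Hand-built DFA for "ab*c", matched against the whole string.
# States: 0 = start, 1 = seen 'a' plus any number of 'b', 2 = accept.
DFA = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
ACCEPT = {2}

def dfa_fullmatch(text):
    """One table lookup per input character, never any backtracking."""
    state = 0
    for ch in text:
        state = DFA.get((state, ch))
        if state is None:  # no transition from here: reject
            return False
    return state in ACCEPT

# A backreference has to compare against text captured earlier, which a
# fixed, finite set of states cannot remember.  Python's backtracking
# engine handles it instead.
BACKREF = re.compile(r"(a+)b\1")

def backref_fullmatch(text):
    return BACKREF.fullmatch(text) is not None
```

The DFA does a constant amount of work per character, which is where the
speed comes from.  The pattern (a+)b\1 needs the second run of a's to be
exactly as long as the first, so it has no such fixed table.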

Paolo