https://bugzilla.redhat.com/show_bug.cgi?id=868578 Bug ID: 868578 QA Contact: extras-qa@xxxxxxxxxxxxxxxxx Severity: medium Version: rawhide Priority: medium CC: notting@xxxxxxxxxx, package-review@xxxxxxxxxxxxxxxxxxxxxxx Assignee: nobody@xxxxxxxxxxxxxxxxx Summary: Review Request: re2 - C++ fast alternative to backtracking RE engines Regression: --- Story Points: --- Classification: Fedora OS: Linux Reporter: denis.arnaud_fedora@xxxxxxx Type: --- Documentation: --- Hardware: All Mount Type: --- Status: NEW Component: Package Review Product: Fedora Spec URL: http://denisarnaud.fedorapeople.org/re2/re2.spec SRPM URL: http://denisarnaud.fedorapeople.org/re2/re2-0.0.0-1.fc17.src.rpm Fedora Account System Username: denisarnaud Description: RE2 is a fast, safe, thread-friendly alternative to backtracking regular expression engines like those used in PCRE, Perl, and Python. It is a C++ library. Backtracking engines are typically full of features and convenient syntactic sugar but can be forced into taking exponential amounts of time on even small inputs. RE2 uses automata theory to guarantee that regular expression searches run in time linear in the size of the input. RE2 implements memory limits, so that searches can be constrained to a fixed amount of memory. RE2 is engineered to use a small fixed C++ stack footprint no matter what inputs or regular expressions it must process; thus RE2 is useful in multithreaded environments where thread stacks cannot grow arbitrarily large. On large inputs, RE2 is often much faster than backtracking engines; its use of automata theory lets it apply optimizations that the others cannot. Unlike most automata-based engines, RE2 implements almost all the common Perl and PCRE features and syntactic sugars. It also finds the leftmost-first match, the same match that Perl would, and can return submatch information. The one significant exception is that RE2 drops support for backreferences¹ and generalized zero-width assertions, because they cannot be implemented efficiently. The syntax page gives full details. For those who want a simpler syntax, RE2 has a POSIX mode that accepts only the POSIX egrep operators and implements leftmost-longest overall matching. To get started writing programs that use RE2, see the C++ API description. For details about the implementation, see "Regular Expression Matching in the Wild." Technical note: there's a difference between submatches and backreferences. Submatches let you find out what certain subexpressions matched after the match is over, so that you can find out, after matching dogcat against (cat|dog)(cat|dog), that \1 is dog and \2 is cat. Backreferences let you use those subexpressions during the match, so that (cat|dog)\1 matches catcat and dogdog but not catdog or dogcat. RE2 supports submatch extraction, but not backreferences. If you absolutely need backreferences and generalized assertions, then RE2 is not for you, but you might be interested in irregexp, Google Chrome's regular expression engine. -- You are receiving this mail because: You are on the CC list for the bug. _______________________________________________ package-review mailing list package-review@xxxxxxxxxxxxxxxxxxxxxxx https://admin.fedoraproject.org/mailman/listinfo/package-review