Reviewer: Mike Ounsworth Review result: Has Nits I have reviewed this document as part of the security directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other last call comments. Summary: I think the normative technical content is fine, but in my opinion the Security Considerations do not fully address the issue of Regular expression Denial of Service (ReDoS). The authors could consider expanding the Security Considerations to provide clearer usage guidance and security expectations. At the minimum, I think this document should include a security consideration about the types of "evil regexps" that are still expressible in I-Regexp and that even with this reduced syntax it is still not advisable to run arbitrary user-provided regular expressions on your hardware. Longer review: The primary security concern with a regular expression specification is "regular expression denial of service (ReDoS)" attacks, for example as described in the following OWASP article: https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS Poor implementations become vulnerable to ReDoS attacks either when a poorly-optimized regexp pattern is run on a CPU or memory constrained device over attacker-supplied input, or when an attacker is allowed to submit a regexp pattern to be run on victim hardware (regardless of amount of available resources). draft-ietf-jsonpath-iregexp does address this in both sections 6 and 8, and while I agree that not including features such as capture groups, lookahead, or backreferences does help mitigate against poor implementations containing ReDoS issues, I do not think this is a sufficient coverage of the potential security issues related to ReDoS. For example, the following "evil regexes" from the above OWASP article are all expressible in I-Regexp: "(a+)+", "([a-zA-Z]+)*", "(a|aa)+", "(a|a?)+", all of which will lead to large resource consumption when processing the input string " aaaaaaaaaaaaaaaaaaaaaaaa!". I am not sufficiently expert in this domain to know if a well-implemented regexp library would avoid being vulnerable, or if maliciously-crafted regexp patterns will always be able to cause large resource consumption regardless of the cleverness of the regexp engine (I suspect the later). The problem with these "evil regexps" is that they yield an exponential number of valid ways to parse a single input string and then provide a worst-case input string which forces it to check every possible parsing. I-Regexp is simply providing a "translation layer", so something which is evil to the underlying regexp lib will still be evil at the I-Regexp level. I think this document should at least provide a security consideration about the types of non-optimized or evil regexp patterns that are expressible in the I-Regexp syntax. Finally, I am aware that this document specifies only the regexp syntax that is meant to be an easily-convertible subset of the regexp syntaxes of existing libraries, and is not proposing a processing algorithm. So maybe my further paragraph is out-of-scope for this document. I am not sufficiently expert in regular expressions or current best-practice for writing regular expression libraries to suggest a fix, but I wonder if this document could recommend that implementations include some sort of configurable limit on nesting level or on recursion / backtracking depth. For example a limit whereby an implementation will fail with an error if a given input requires it to evaluate more than a certain number of nested clauses. Or a "backtracking limit" whereby an implementation uses the heuristic of following the longest possible expansion of a + or *, and the implementation will fail with an error if it is required to backtrack more than a given number of times to try a different decomposition path through the state machine. -- last-call mailing list last-call@xxxxxxxx https://www.ietf.org/mailman/listinfo/last-call