[RFC/PATCH] userdiff.c: Avoid old glibc regex bug causing t4034-*.sh test failures

In particular, this bug affects the word-diff regex for 'bibtex' and
'html', leading to the test failures in t4034-diff-words.sh. The bug
is described here:

    http://sourceware.org/bugzilla/show_bug.cgi?id=3957

and was fixed on 12-07-2007. In summary, when the REG_NEWLINE flag is
passed to regcomp(), a non-matching list ([^...]) not containing a
newline should not match a newline. However, in some old versions of
the glibc regex library, the newline character was indeed matched.

In order to fix the problem, we add an explicit '\n' to the list in
the non-matching list expression.

Signed-off-by: Ramsay Jones <ramsay@xxxxxxxxxxxxxxxxxxx>
---

Junio,
    I recently mentioned that a couple of tests in t4034-*.sh were
failing for me on Linux. I have now looked into it, and the problem
turned out to be an old bug in the glibc regex routines. :-(

This is an RFC because:
    - A simple fix would be for me to put NO_REGEX=1 in my config.mak,
      since the compat/regex routines don't suffer this problem.
    - I suspect this bug is old enough that it will not affect many users.
    - I have not audited the other non-matching list expressions in
      userdiff.c.
    - blame, grep and pickaxe all call regcomp() with the REG_NEWLINE
      flag, but get the regex from the user (e.g. from the command line).
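
For anyone who wants to check whether their libc is affected, something
along the following lines might help. It is only a sketch, not part of
the patch: first_match_len() is a hypothetical helper name, and it
assumes a POSIX <regex.h>. It compiles a pattern with
REG_EXTENDED | REG_NEWLINE and reports how long the first match is.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of git): compile 'pattern' with
 * REG_EXTENDED | REG_NEWLINE and return the length of the first
 * match in 'text', or -1 if compilation fails or nothing matches.
 */
static int first_match_len(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m[1];
	int len = -1;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE))
		return -1;
	/* rm_so/rm_eo are byte offsets of the whole match in 'text' */
	if (!regexec(&re, text, 1, m, 0))
		len = (int)(m[0].rm_eo - m[0].rm_so);
	regfree(&re);
	return len;
}
```

With the patched html pattern, first_match_len("[^<>= \t\n]+", "foo\nbar")
should stop at the newline and return 3 on any libc. Running the same
call with the unpatched "[^<>= \t]+" should also give 3 on a fixed
library; a larger value, with the match running across the newline,
would suggest the bug described above.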

ATB,
Ramsay Jones

 userdiff.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/userdiff.c b/userdiff.c
index 1ff4797..2f9ba37 100644
--- a/userdiff.c
+++ b/userdiff.c
@@ -28,7 +28,7 @@ IPATTERN("fortran",
 	 "|[-+]?[0-9.]+([AaIiDdEeFfLlTtXx][Ss]?[-+]?[0-9.]*)?(_[a-zA-Z0-9][a-zA-Z0-9_]*)?"
 	 "|//|\\*\\*|::|[/<>=]="),
 PATTERNS("html", "^[ \t]*(<[Hh][1-6][ \t].*>.*)$",
-	 "[^<>= \t]+"),
+	 "[^<>= \t\n]+"),
 PATTERNS("java",
 	 "!^[ \t]*(catch|do|for|if|instanceof|new|return|switch|throw|while)\n"
 	 "^[ \t]*(([A-Za-z_][A-Za-z_0-9]*[ \t]+)+[A-Za-z_][A-Za-z_0-9]*[ \t]*\\([^;]*)$",
@@ -94,7 +94,7 @@ PATTERNS("ruby", "^[ \t]*((class|module|def)[ \t].*)$",
 	 "|[-+0-9.e]+|0[xXbB]?[0-9a-fA-F]+|\\?(\\\\C-)?(\\\\M-)?."
 	 "|//=?|[-+*/<>%&^|=!]=|<<=?|>>=?|===|\\.{1,3}|::|[!=]~"),
 PATTERNS("bibtex", "(@[a-zA-Z]{1,}[ \t]*\\{{0,1}[ \t]*[^ \t\"@',\\#}{~%]*).*$",
-	 "[={}\"]|[^={}\" \t]+"),
+	 "[={}\"]|[^={}\" \t\n]+"),
 PATTERNS("tex", "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$",
 	 "\\\\[a-zA-Z@]+|\\\\.|[a-zA-Z0-9\x80-\xff]+"),
 PATTERNS("cpp",
-- 
1.7.5

