2009/2/7 Jakub Narebski <jnareb@xxxxxxxxx>:
> On Sat, 7 Feb 2009, demerphq wrote:
>> 2009/2/6 Jakub Narebski <jnareb@xxxxxxxxx>:
>>> On Friday, 6 February 2009 10:49, Rafael Garcia-Suarez wrote:
>>>> 2009/2/6 Jakub Narebski <jnareb@xxxxxxxxx>:
>
>>>>> Make SHA-1 regexp to be turned into hyperlink (the SHA-1 committag)
>>>>> to match word boundary at the beginning and the end.  This way we
>>>>> reduce number of false matches, for example we now don't match
>>>>> 0x74a5cd01 which is hexadecimal (for example memory address),
>>>>> but is not SHA-1.
>>>>
>>>> Further suggestion: you could also turn the final \b into (\b|\@),
>>>
>>> You meant \b -> \b(?!\@), didn't you?  Word boundary _not_ followed
>>> by '@', and not word boundary _OR_ '@' as you wrote...
>>
>> Since \b(?!\@) is effectively two zero width negative assertions in a
>> row you could simplify by saying:
>>
>>   (?![^\w\@])
>
> I don't know if "sth\b" is effectively "sth(?![^\w])"... perhaps it is.

Sorry, my bad, that is a double negation. I meant:

  (?![\w\@])

One of the ways you can express \b is as:

  (?:(?<=\w)(?!\w)|(?<!\w)(?=\w))

But the point here is that you are looking for the end of a hex
sequence, so the character before the boundary is always a word
character and you can just use the "end of word" bit of the
alternation, which is (?!\w).

>> and that way you can easily add the '.' case as well.
>
> We cannot add '.' case, because there can be legitimate SHA-1 match
> ending sentence, e.g.
>
>   ... at commit 8457bb9e.

Then reject the '.' only when a word character follows it:

  /(?<!\w)([a-fA-F0-9]+)(?!(?:\.\w|[\w@]))/

:-)

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
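
For anyone who wants to poke at the cases discussed in this thread, here
is a minimal sketch in Python rather than Perl; Python's re module accepts
the same lookbehind/lookahead syntax used above. The helper name
find_sha1_candidates and the sample strings other than "at commit
8457bb9e." and "0x74a5cd01" are made up for illustration, and the
unbounded "+" is kept exactly as written in the mail, so this is not the
pattern gitweb actually ships.

```python
import re

# Python spelling of /(?<!\w)([a-fA-F0-9]+)(?!(?:\.\w|[\w@]))/ from the mail.
# Assumption: "+" is kept as posted; a real committag would bound the length.
SHA1_CANDIDATE = re.compile(r"(?<!\w)([a-fA-F0-9]+)(?!\.\w|[\w@])")

# The two spellings of "end of hex run, not followed by '@'" compared in the
# thread. After a hex run the preceding character is always a word character,
# so \b(?!@) and (?![\w@]) should accept the same end positions.
WITH_B = re.compile(r"(?<!\w)([a-fA-F0-9]+)\b(?!@)")
WITHOUT_B = re.compile(r"(?<!\w)([a-fA-F0-9]+)(?![\w@])")


def find_sha1_candidates(text):
    """Return every hex run in `text` accepted by SHA1_CANDIDATE."""
    return SHA1_CANDIDATE.findall(text)


if __name__ == "__main__":
    samples = [
        "at commit 8457bb9e.",    # sentence-final SHA-1: should match
        "0x74a5cd01",             # hex literal: should not match
        "deadbeef@example.com",   # hypothetical address: should not match
        "8457bb9e.txt",           # hypothetical file name: should not match
    ]
    for s in samples:
        print(repr(s), find_sha1_candidates(s))
        # The two simpler spellings agree on every sample.
        assert WITH_B.findall(s) == WITHOUT_B.findall(s)
```

Only the first sample yields a match (['8457bb9e']); the other three come
back empty. Note that the simpler (?![\w@]) form still accepts
"8457bb9e.txt", which is the case the extra \.\w alternative is there to
reject.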