Johannes Sixt <j6t@xxxxxxxx> writes:

> On 18.02.25 at 20:30, Junio C Hamano wrote:
>> Moumita <dhar61595@xxxxxxxxx> writes:
>>>          /* -- */
>>> -        /* Characters not in the default $IFS value */
>>> -        "[^ \t]+"),
>>
>> We used to pretty-much use "a run of non-whitespace characters is a
>> token".  Now we are a bit more picky.
>>
>> Which may or may not be good, but it is hard to tell if it is an
>> improvement.
>
> It is only a stand-in, because every built-in userdiff driver must have
> a word pattern.

Yeah, I know.  I was merely saying that it was not obvious that the
new pattern, which is way more elaborate, is an improvement over
that stand-in pattern.

As these patterns are meant to be applied only to syntactically
valid text, by going from a simple and lenient pattern to a more
specific one, we stop recognising some words that we used to take as
words (i.e. specific patterns need to worry about false negatives,
while simpler patterns only have to avoid egregious false
positives).

> See the old thread here:
> https://lore.kernel.org/git/373640ea4d95f3b279b9d460d9a8889b4030b4e9.camel@xxxxxxxxxxxx/

Yup, 2ff6c346 (userdiff: support Bash, 2020-10-22) is where the
stand-in pattern came from, which is the v3 iteration of that patch.

Thanks.
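
To illustrate the false-negative concern in the message above, here is a
minimal sketch in Python.  LENIENT is the stand-in pattern quoted from
the diff; STRICT is a hypothetical pattern made up for this example and
is not the one from the patch under discussion.  Python's "re" module is
used only for convenience, and it is an assumption that its behaviour is
close enough to the regex engine Git uses for these simple patterns.

```python
import re

# The stand-in word pattern quoted above: a run of characters that are
# neither space nor tab.
LENIENT = re.compile(r"[^ \t]+")

# Hypothetical stricter pattern, invented for this sketch only; it is NOT
# the pattern proposed in the patch.
STRICT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[0-9]+")


def words(pattern, line):
    """Return the tokens that `pattern` recognises in `line`."""
    return pattern.findall(line)


line = "echo ${foo:-bar} --long-opt=3"
lenient_words = words(LENIENT, line)
strict_words = words(STRICT, line)

print(lenient_words)  # ['echo', '${foo:-bar}', '--long-opt=3']
print(strict_words)   # ['echo', 'foo', 'bar', 'long', 'opt', '3']
```

With the lenient pattern, "--long-opt=3" is a single word; with the
stricter one it is no longer recognised as a unit, which is the kind of
regression a more elaborate pattern has to be checked against.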