Hi Ping
On 01/12/2022 07:33, Ping Yin wrote:
> If the rule is "break on ascii whitespace",
> Is there a way to achieve this: break english by word, and break
> chinese by utf-8 character
You could extend your current regex so that it matches whole multi-byte
UTF-8 sequences, which is what git does for the built-in userdiff regexes.
I've not tested it, but I think
    git config --global diff.wordregex "[[:alnum:]_]+|[^[:space:]]|$(printf '[\xc0-\xff][\x80-\xbf]+')"
should work. The downside is that you end up with a .gitconfig that is
not valid UTF-8, because printf writes the raw bytes 0xc0, 0xff, 0x80 and
0xbf into the value. Perhaps someone else has a clever idea to get around
that.
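One caveat on the printf part: \xHH escapes are a bash extension, whereas POSIX printf only specifies octal \ddd escapes. A minimal, untested-against-git sketch of building the same bytes portably is below. The variable names multibyte and wordregex are my own, purely for illustration.

```shell
# Build the multi-byte part of the regex with octal escapes, which POSIX
# printf specifies (\300-\377 is 0xc0-0xff, \200-\277 is 0x80-0xbf).
multibyte=$(printf '[\300-\377][\200-\277]+')

# Same alternation as in the command above, with the raw bytes appended.
wordregex="[[:alnum:]_]+|[^[:space:]]|$multibyte"

# Dump the bytes of the multi-byte part to confirm the escapes expanded.
printf '%s' "$multibyte" | od -An -tx1
```

If you only want to try the regex out before committing it to any config file, passing it for a single invocation with git diff --word-diff-regex="$wordregex" should behave the same way, though I have not tested that with these bytes either.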
Best Wishes
Phillip