Thomas Rast wrote:
> Jonathan Nieder wrote:
>> + "|[^[:space:]]"),
>
> I think it should get the |[\x80-\xff]+ arm, too.  That one was
> designed to avoid splitting UTF-8 characters.  At the risk of gluing
> together too many of them, of course, but I think confusing the
> terminal would be worse.

Hmm.  Should it be |([\x80-\xff]+[\x00-\x7f]) then, to match exactly
one multibyte UTF-8 character?
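
A quick way to check what such an arm matches is to run it over a few UTF-8 bytes. The following is only a sketch in Python, not the C regex engine the patch would use, so byte-level behaviour may differ from the platform's regcomp. PROPOSED is the pattern from the question above. LEAD_PLUS_CONT is an assumed alternative that does not appear in this thread; it relies on UTF-8 lead bytes lying in \xc0-\xff and continuation bytes in \x80-\xbf. The tokens() helper and the sample string are hypothetical.

```python
import re

# Pattern from the question: a run of high bytes followed by one ASCII byte.
PROPOSED = re.compile(rb"[\x80-\xff]+[\x00-\x7f]")

# Assumed alternative (not from the thread): one UTF-8 lead byte
# followed by one or more continuation bytes.
LEAD_PLUS_CONT = re.compile(rb"[\xc0-\xff][\x80-\xbf]+")


def tokens(pattern, data):
    """Return every non-overlapping match of pattern in data, left to right."""
    return [m.group(0) for m in pattern.finditer(data)]


# Two-byte character, one ASCII byte, then two three-byte characters.
sample = "éa日本".encode("utf-8")

proposed_tokens = tokens(PROPOSED, sample)
alternative_tokens = tokens(LEAD_PLUS_CONT, sample)

print(proposed_tokens)
print(alternative_tokens)
```

On this sample the pattern from the question also consumes the ASCII byte that follows the high-byte run, and it finds nothing for the multibyte run at the end of the input, since no ASCII byte follows it. The lead-plus-continuation form yields one token per multibyte character and leaves the ASCII byte alone.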