Re: [PATCH] userdiff: support regexec(3) with multi-byte support

On 07.04.23 at 09:49, René Scharfe wrote:
> On 07.04.23 at 00:35, Johannes Sixt wrote:
>> This is not equivalent. The original treated a sequence of non-ASCII
>> characters as a word. The new version treats each individual non-space
>> character (both ASCII and non-ASCII) as a word.
> 
> I assume you mean "The original treated [a single non-space as well as]
> a sequence of non-ASCII characters [making up a single multi-byte
> character] as a word.".  That works as intended by 664d44ee7f (userdiff:
> simplify word-diff safeguard, 2011-01-11).

I misread the original RE. I thought it would lump multiple multi-byte
characters together into one word, but it does not; sorry for that. It
looks like your suggested replacement is behaviorally identical to the
original after all, except perhaps for this one:

> The new one doesn't match multi-byte whitespace anymore.

but I did not find a reference that confirms it. I don't think we need
to bend over backwards to keep this compatibility, though.
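
For illustration only, here is a rough Python sketch of the two behaviors
compared above. It is not the userdiff code: the names ORIGINAL, REPLACEMENT,
words_original and words_replacement are made up for this example, the
byte-level pattern is my reading of the original safeguard, and Python's \S on
str merely stands in for [^[:space:]] under a multi-byte-aware regexec(3).
Whether a real libc classifies a given multi-byte character as whitespace
depends on its locale data, so the last case shows a possible difference, not
a confirmed one.

```python
import re

# Assumed byte-level form of the original safeguard: either a UTF-8 lead byte
# followed by its continuation bytes, or any single non-space byte.
# Python alternation is leftmost-first rather than POSIX leftmost-longest,
# so the multi-byte branch is listed first to get the same matches.
ORIGINAL = re.compile(rb"[\xc0-\xff][\x80-\xbf]+|[^ \t\n\r\f\v]")

# Stand-in for the replacement under a multi-byte-aware matcher:
# every single non-space character is a word.
REPLACEMENT = re.compile(r"\S")


def words_original(text):
    """Split text into words the way the byte-level pattern would."""
    data = text.encode("utf-8")
    return [m.group().decode("utf-8") for m in ORIGINAL.finditer(data)]


def words_replacement(text):
    """Split text into words using the character-level pattern."""
    return REPLACEMENT.findall(text)


if __name__ == "__main__":
    # Two adjacent multi-byte characters stay separate words in both versions.
    print(words_original("a\u00e9\u00fc b"), words_replacement("a\u00e9\u00fc b"))
    # U+3000 (ideographic space) is a word for the byte-level pattern only.
    print(words_original("a\u3000b"), words_replacement("a\u3000b"))
```

In this sketch both versions agree on ordinary text, and they differ only when
a multi-byte character counts as whitespace for the character-level matcher.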

-- Hannes