On Wed, Jul 18, 2018 at 10:47:09AM +0200, Florian Weimer wrote:
> * Zorro Lang:
>
> > On Wed, Jul 18, 2018 at 08:04:05AM +0200, Florian Weimer wrote:
> >> * Zorro Lang:
> >>
> >> >> > > This is related to this glibc bug:
> >> >> > >
> >> >> > > https://sourceware.org/bugzilla/show_bug.cgi?id=23393
> >> >>
> >> >
> >> > A stranger thing is:
> >> > egrep [A-Z] match ABCD and bcd, but not match 'a'...
> >>
> >> That's the same issue as [0-9] not matching 9.
> >>
> >> > I already can't understand the new rules ...
> >>
> >> The range operator matches characters according to their collation
> >> weight, and since the weight of 'a' is less than the weight of 'A',
> >> 'a' is not included in the [A-Z] range.
> >
> > How to define/calculate the *weight* in your context? Why you say the
> > weight of 'a' is less than the weight of 'A'
>
> This is a concept from POSIX collation, based on a locale definition:
>
> I hope this link is reasonably stable:
>
> <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03_02>
>
> Basically, collation is an alternative way of sorting strings,
> different from codepoint order, and it is specifically designed to
> take cultural conventions into account. Traditionally, most regular
> expression range expressions such as [a-z] follow collation order,
> although this is not required by POSIX for non-C/non-POSIX locales.
>
> >> This could be fixed by including all characters with the same primary
> >> weight as the endpoints (so that [ā-ẑ] and [a-z] would end up being
> >> the same). It makes the behavior more logical, but it doesn't fix
> >> existing scripts.
> >
> > We find that the $LANG will affect how glibc deal with the wildcard.
> > We all test on LANG=en_US.UTF-8, but if I set export LANG=C, then
> > [a-z] and [A-Z] are all as expected, and xfstests make install works.
>
> Right, this is expected: POSIX requires the behavior you need for the
> "C" locale.
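
For reference, the C-locale behaviour described above can be checked with
a small script. This is only a sketch: the helper name `matches` is made
up for illustration, and it forces the locale with LC_ALL=C instead of
relying on whatever LANG is set to in the caller's environment.

```shell
#!/bin/sh
# Sketch: compare bracket ranges and a POSIX character class under the
# "C" locale. The helper name `matches` is hypothetical.

# Print "yes" if the string $2 matches the extended regex $1 when grep
# runs with LC_ALL=C, otherwise print "no".
matches() {
    if printf '%s\n' "$2" | LC_ALL=C grep -E -q -- "$1"; then
        echo yes
    else
        echo no
    fi
}

matches '[A-Z]' a          # no:  in the C locale 'a' is outside A-Z
matches '[A-Z]' A          # yes
matches '[0-9]' 9          # yes: the end point is part of the range
matches '[[:lower:]]' a    # yes: character class, no range involved
```

LC_ALL is used here because, where it is set, it takes precedence over
LANG, so exporting LANG=C alone may not be enough in every environment.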

I was trying to change all of these to [:digit:], [:lower:], [:upper:],
[:alpha:] and [:alnum:]. But there are many of them, and worse, there are
many patterns like [1-9], [3-8], [1-9a-f], [0-9a-f-], etc., which have no
direct character-class equivalent. So I had to stop and think about
whether there is a better way.

How about we fix the Makefile issue by changing [a-z] to [[:lower:]], then
export LANG=C in the xfstests/check file, and recommend exporting it in
local.file?

Please tell me if you have a better idea.

Thanks so much,
Zorro

> --
> To unsubscribe from this list: send the line "unsubscribe fstests" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html