On Wed, Jul 18, 2018 at 06:12:05PM +0800, Zorro Lang wrote:
> On Wed, Jul 18, 2018 at 10:47:09AM +0200, Florian Weimer wrote:
> > * Zorro Lang:
> >
> > > On Wed, Jul 18, 2018 at 08:04:05AM +0200, Florian Weimer wrote:
> > >> * Zorro Lang:
> > >>
> > >> >> > > This is related to this glibc bug:
> > >> >> > >
> > >> >> > > https://sourceware.org/bugzilla/show_bug.cgi?id=23393
> > >> >> >
> > >> >
> > >> > A stranger thing is:
> > >> > egrep [A-Z] matches ABCD and bcd, but does not match 'a'...
> > >>
> > >> That's the same issue as [0-9] not matching 9.
> > >>
> > >> > I already can't understand the new rules ...
> > >>
> > >> The range operator matches characters according to their collation
> > >> weight, and since the weight of 'a' is less than the weight of 'A',
> > >> 'a' is not included in the [A-Z] range.
> > >
> > > How do you define/calculate the *weight* in your context? Why do you say
> > > the weight of 'a' is less than the weight of 'A'?
> >
> > This is a concept from POSIX collation, based on a locale definition:
> >
> > I hope this link is reasonably stable:
> >
> > <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html#tag_07_03_02>
> >
> > Basically, collation is an alternative way of sorting strings,
> > different from codepoint order, and it is specifically designed to
> > take cultural conventions into account. Traditionally, most regular
> > expression range expressions such as [a-z] follow collation order,
> > although this is not required by POSIX for non-C/non-POSIX locales.
> >
> > >> This could be fixed by including all characters with the same primary
> > >> weight as the endpoints (so that [ā-ẑ] and [a-z] would end up being
> > >> the same). It makes the behavior more logical, but it doesn't fix
> > >> existing scripts.
> > >
> > > We find that $LANG affects how glibc deals with the wildcard.
> > > We all test on LANG=en_US.UTF-8, but if I set export LANG=C, then
> > > [a-z] and [A-Z] are all as expected, and xfstests make install works.
> >
> > Right, this is expected: POSIX requires the behavior you need for the
> > "C" locale.
>
> I was trying to change all these things to [:digit:], [:lower:], [:upper:],
> [:alpha:] and [:alnum:]. But there are many, and the worse thing is there are
> many things like [1-9], [3-8], [1-9a-f], [0-9a-f-] etc...
>
> So I have to stop and think about whether there's a better way. How about we
> fix the Makefile issue by changing [a-z] to [[:lower:]], then export LANG=C
> in the xfstests/check file, and recommend exporting it in local.file?
>
> Please tell me if you have a better idea.

Another way: maybe we can define SED_PROG="LC_ALL=C sed",
GREP_PROG="LC_ALL=C grep" etc ... But the Makefile still needs to be fixed
separately.

> Thanks so much,
> Zorro
>
> > --
> > To unsubscribe from this list: send the line "unsubscribe fstests" in
> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
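
For illustration, the SED_PROG/GREP_PROG idea could look roughly like the
sketch below. It is not tested against xfstests. The variable names are the
ones proposed above; the "env" prefix is an assumption added here, because a
word such as LC_ALL=C that comes out of a variable expansion is run as a
command name, not treated as an assignment.

```shell
#!/bin/sh
# Sketch only: pin the C locale per command so that bracket ranges are
# evaluated in plain ASCII order, whatever LANG the caller has set.
# The "env" prefix is an assumption, not part of the original proposal.
SED_PROG="env LC_ALL=C sed"
GREP_PROG="env LC_ALL=C grep"

# In the C locale [a-z] matches only the ASCII lowercase letters.
printf 'a\nA\nz\nZ\n' | $GREP_PROG '[a-z]'     # prints "a" and "z"

# In the C locale [A-Z] matches only the ASCII uppercase letters.
printf 'abcXYZ\n' | $SED_PROG 's/[A-Z]/_/g'    # prints "abc___"
```

The variables are expanded unquoted on purpose, so that the shell splits
them into "env", "LC_ALL=C" and the program name. This covers only the call
sites that go through the variables; patterns in the Makefile would still
need [[:lower:]] or their own locale setting.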