Re: [PATCH 1/2] rpmatch.3: remove first-character-only FUD

Hi!

On Tue, Sep 21, 2021 at 05:20:32PM +0200, Alejandro Colomar (man-pages) wrote:
> Are you sure?
> 
> So, it seems to me that by using {yes,no}expr and not {yes,no}str, it is
> limiting itself to the first letter, as the current BUGS section specifies.
> Right?
Quite sure:
	localedata/locales/am_ET:yesexpr "^([+1yY<U12CE>]|<U12A0><U12CE><U1295>)"
Granted, I, unfortunately, don't strictly read Amharic
(but a cursory glance at a dictionary shows "አዎን" means yes),
but I do Ukrainian:
	localedata/locales/uk_UA:yesexpr "^([+1Yy]|[<U0422><U0442>][<U0410><U0430>][<U041A><U043A>]?)$"
which works out to
	"^([+1Yy]|[Тт][Аа][Кк]?)$"

This is odd, data-wise, but it's decidedly not just the first letter
(but it does match, what, "^y$", "^та$", and "^так$"? very odd!!).

On current glibc, if I were in a uk_UA locale,
"nyes" is -1, not 0 like this page would lead me to believe,
and, similarly, in am_ET, "አ" (-1) is not the same as "አዎን" (1).

FreeBSD (and, presumably, everyone else) uses CLDR data,
which provides something much more sensible:
  [1] ^(([yY]([eE][sS])?)|([yY]))
  [2] ^(([дД]([аА])?)|([дД])|([yY]([eE][sS])?)|([yY]))

This, admittedly, is not perfect, but the code that generates it [3]
explicitly handles full yesstr words because the data itself [4] is
constructed around yesstr, and yesexpr is a generated expression that
matches yesstr ‒ they're the same.

rpmatch() is a correct (well, /the/ correct) approach to handling this
(or, well, an equivalent on libcs that lack it, it's like seven lines) ‒
if a similar warning were prudent, and I very much believe it is /not/,
it'd belong in nl_langinfo() {YES,NO}EXPR or langinfo.h,
but it'd be a warning /for the end-user/, who, presumably,
knows the language they speak, not for the programmer.
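
For reference, the equivalent mentioned above could look roughly like
the sketch below. It is not glibc's implementation: my_rpmatch() and
matches_ere() are hypothetical names, and the only assumption is that
the libc provides POSIX nl_langinfo() with YESEXPR/NOEXPR and POSIX
regcomp()/regexec(). The whole response is handed to the locale's
expression, so whether one letter or a full word is accepted is decided
by the locale data alone.

```c
#include <langinfo.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: 1 if `response` matches the POSIX extended regex
 * `pattern`, 0 if it does not, -1 if the pattern fails to compile. */
int matches_ere(const char *pattern, const char *response)
{
	regex_t re;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;
	int rc = regexec(&re, response, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}

/* Hypothetical stand-in for rpmatch(): 1 = yes, 0 = no, -1 = neither.
 * The patterns come from the current locale's LC_MESSAGES data. */
int my_rpmatch(const char *response)
{
	if (matches_ere(nl_langinfo(YESEXPR), response) == 1)
		return 1;
	if (matches_ere(nl_langinfo(NOEXPR), response) == 1)
		return 0;
	return -1;
}
```

A caller would normally run setlocale(LC_ALL, "") first so that the
user's own yesexpr/noexpr apply; without it the program stays in the
"C" locale, where the expressions are typically "^[yY]" and "^[nN]".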

наб

1. https://github.com/freebsd/freebsd-src/blob/373ffc62c158e52cde86a5b934ab4a51307f9f2e/share/msgdef/en_US.UTF-8.src
2. https://github.com/freebsd/freebsd-src/blob/373ffc62c158e52cde86a5b934ab4a51307f9f2e/share/msgdef/ru_RU.UTF-8.src
3. https://github.com/unicode-org/cldr/blob/62c90a357dc25911db60fcdf7d5a80119df27963/tools/cldr-code/src/main/java/org/unicode/cldr/posix/POSIXUtilities.java#L336
4. https://github.com/unicode-org/cldr/blob/62c90a357dc25911db60fcdf7d5a80119df27963/common/main/ru.xml#L15789
