(Sorry for the top-post, but I've had enough of this thread.)

Can we take this offline to a bit-bucket somewhere, and maybe point the
original poster to a basic course on language and locales and character
sets?

This thread has gone on far too long, and has no relevance for anyone
outside of the United States and their weird and incorrect use of their
own language. I'm certainly tired of the repeated assertions by the
original poster that he is right and that the rest of the world needs to
conform to his obviously-incorrect viewpoint.

On Thu, 2011-03-10 at 15:01 +0000, Alan Cox wrote:
> > Anyway, being really pedantic. In the English language, it's certainly
>
> I think you mean American. I doubt most people know the correct behaviour
> for Ã, Ã, Ã, à and other accented letters that appear in en_GB English
> words.
>
> > possible (and acceptable) to fold upper- down to lower-case, or vice
> > versa. Other languages will have similar positions (it's acceptable and
> > doable). And yes, there will be some characters that don't have
>
> No - in fact in English it is basically OK but in other languages it is
> not. See below
>
> > mutually equivalent meanings, which have to be treated separately, there's
> > nothing new in that, either.
>
> Ah but you see here is one of your problems. Do you want the question
>
> is RPM name A == RPM name B
>
> to depend upon locale? Isn't that a bit of a hazard - imagine if you
> have multiple repositories and your dependencies pulled a different
> package in German to English locales?
>
> > There's nothing particularly special about rules that say character
> > numbers so-and-so are equivalent to character numbers so-and-so, in
> > sections throughout the repertoire, with other blocks of characters that
>
> There is a lot special. The rules for caseless comparison of the Unicode
> character set, case conversion and the like are huge. Some languages
> don't have such a concept, some differ on how they are compared.
> The comparison of accented and non-accented character variants is also a big
> deal that Americanglish doesn't have to deal with but the rest of the
> world does.
>
> Even apparently simple things like German throw in some absolute gems.
> Try for example ß which has no upper case ligature but translates into a
> pair of 'S'. Then the fun starts - how do you determine if any given pair
> of SS ligatures in German are the same as ß if lower cased. Greek has
> position dependent casing while in Turkish the letter I is a whole little
> bomb of its own.
>
> > don't have equivalents. Unicode just extends the size of the
> > repertoire.
>
> And the rule set, and the number of case forms (upper, low and title) -
> see upper/lower isn't really enough.
>
> Welcome to planet Earth as seen by the rest of us
>
> At this point you hopefully start to see why "Did you mean XYZ" is much
> easier to implement, and also more useful!
>
> There are very very good reasons to keep "is RPM name A the same as RPM
> name B" a question that is not dependent upon anything else. Case is a
> concept that doesn't have that property.
>
> Alan
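
For anyone who wants to see the cases Alan lists in action, here is a
minimal Python 3 sketch. It is only an illustration using Python's
built-in, locale-independent Unicode case mappings; it is not how yum or
RPM compare names, and the helper names naive_equal and folded_equal are
made up for this example.

```python
# Illustration only: Python's str methods apply Unicode's default,
# locale-independent case mappings. Nothing here reflects yum or RPM code.


def naive_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names by lowercasing both."""
    return a.lower() == b.lower()


def folded_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names using Unicode case folding."""
    return a.casefold() == b.casefold()


# German sharp s: upper-casing yields two letters, and lower-casing
# those two letters does not give the original character back.
sharp_s_upper = "ß".upper()          # "SS"
round_trip = sharp_s_upper.lower()   # "ss", not "ß"

# Lowercasing and case folding disagree on whether these names match.
naive = naive_equal("straße", "STRASSE")
folded = folded_equal("straße", "STRASSE")

# Greek: a capital sigma at the end of a word lowers to final sigma
# (U+03C2), while a lone capital sigma lowers to U+03C3.
greek_word = "ΟΔΟΣ".lower()
greek_single = "Σ".lower()

# Turkish: with no locale knowledge "I" lowers to "i", whereas Turkish
# orthography expects dotless i (U+0131).
default_i = "I".lower()

# Title case is a third form: U+01C6 has distinct upper and title mappings.
dz_upper = "\u01c6".upper()   # U+01C4
dz_title = "\u01c6".title()   # U+01C5

print(sharp_s_upper, round_trip, naive, folded)
```

The two helpers giving different answers for the same pair of names is the
hazard described above: "are these two names equal" stops having a single
answer once case is involved, and that is before any locale-specific rule
such as the Turkish one is considered.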