On Thu, 10 Mar 2011, Alan Cox wrote:
The issue of what becomes what for purposes of comparison
is easily settled by asking a genealogist.
Genealogists have been handling such problems for years.
More focussed on time shifting (eg þ to th) but you could if you really
You don't need to ask them however, librarians have been dealing with
case, sort order and the joys of "How do I file a book with a mixed
Latin and Greek title" long before Columbus went yachting,
For the joy of case conversion see Unicode Standard 6.0, UCD case
mappings and the supporting Annex (#31 if I remember rightly). Assuming
> Ah but you see here is one of your problems. Do you want the question
> is RPM name A == RPM name B
> to depend upon locale ? Isn't that a bit of a hazard - imagine if you
Only if one blindly relies on the answer.
I'd recommend against relying on fuzzy comparisons to do updates.
Any locale-dependent comparison is by definition not a single mapping
across all systems. Any case-ignoring comparison is locale-dependent - the
standards decree this.
So the only non-fuzzy comparison you can rely on is a case-dependent one.
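
To make the locale point concrete, here is a minimal Python sketch. It assumes
Python's built-in str.casefold() as the untailored fold, and turkish_lower() is
a hypothetical helper written only for this illustration (it is not a library
call). The same pair of names compares equal under one rule and unequal under
the other.

```python
def turkish_lower(s):
    # Hypothetical Turkish-style lowercasing, for illustration only:
    # dotted capital I (U+0130) -> i, plain capital I -> dotless i (U+0131).
    return s.replace("\u0130", "i").replace("I", "\u0131").lower()

def same_default(a, b):
    # Case-ignoring comparison using the untailored Unicode case folding.
    return a.casefold() == b.casefold()

def same_turkish(a, b):
    # Case-ignoring comparison using the Turkish-style rule above.
    return turkish_lower(a) == turkish_lower(b)

def same_exact(a, b):
    # Case-dependent comparison: the answer is the same on every system.
    return a == b

if __name__ == "__main__":
    for name_a, name_b in [("FILE", "file"), ("RPM", "rpm")]:
        print(name_a, name_b,
              same_default(name_a, name_b),
              same_turkish(name_a, name_b),
              same_exact(name_a, name_b))
```

Only same_exact() gives an answer that cannot change with the folding rule.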
Fuzzy is what is being asked for.
put every UTF character in its own bin
for U in UTF_characters :
    for Loc in Locales :
        for V in UTF_characters :
            if U and V match in Loc :
                merge bin of U with bin of V
For fuzzy matching purposes, characters in the same bin match.
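
The loop above could be sketched in Python roughly as follows. This is only an
illustration under stated assumptions: each "locale" is modelled as a function
that says whether two characters match, the bins are kept with a small
union-find, and default_match, turkish_match and fuzzy_equal are made-up names
for this sketch rather than anything RPM or a library provides. Running it over
all of Unicode would be quadratic, so the example uses a tiny alphabet.

```python
from itertools import product

def turkish_lower(s):
    # Hypothetical Turkish-style lowercasing, for illustration only.
    return s.replace("\u0130", "i").replace("I", "\u0131").lower()

def default_match(u, v):
    # Untailored Unicode case folding.
    return u.casefold() == v.casefold()

def turkish_match(u, v):
    return turkish_lower(u) == turkish_lower(v)

def build_bins(chars, locale_matchers):
    # Put every character in its own bin.
    parent = {c: c for c in chars}

    def find(c):
        # Follow parent links to the bin's representative, halving paths.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for match in locale_matchers:
        for u, v in product(chars, repeat=2):
            if match(u, v):
                # Merge bin of u with bin of v.
                parent[find(u)] = find(v)
    return {c: find(c) for c in chars}

def fuzzy_equal(a, b, bins):
    # Strings match if they have equal length and each pair of characters
    # shares a bin; characters without a bin only match themselves.
    return len(a) == len(b) and all(
        bins.get(x, x) == bins.get(y, y) for x, y in zip(a, b))

if __name__ == "__main__":
    alphabet = "Ii\u0131\u0130a"   # I, i, dotless i, dotted capital I, a
    bins = build_bins(alphabet, [default_match, turkish_match])
    print(len({bins[c] for c in alphabet}), "bins")
```

With both matchers in play, all four i-like letters end up in one bin even
though neither rule alone equates them all, which is the sense in which the
result is fuzzy.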
Michael hennebry@xxxxxxxxxxxxxxxxxxxxx
"Pessimist: The glass is half empty.
Optimist: The glass is half full.
Engineer: The glass is twice as big as it needs to be."