Re: Pseudo-locales for i18n testing by English speakers

Martin Langhoff wrote:
> 2008/10/2 Sean Flanigan <sflaniga@xxxxxxxxxx>:
>> I have a simple Ant task which can generate pseudo-translations like the
>> one above from gettext POT files,
> 
> I am after a few sets of "latin-lookalike" character tables I can use.
> Have you (or anyone) got pointers to good tables?

Well, I've made up a couple of simple ones (also attached as UTF-8):
ASCII:
"abcdefghijklmnopqrstuvwxyz"
BMP only:
"åЬçđéϝցⱨîﺩⱪŀოňøÞᕴяšŧմⱱשẋŷż"
BMP+SMP:
"åЬçđ𝖾ϝցⱨî𝚓ⱪŀოňøÞᕴяšŧմⱱשẋŷż"

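For anyone who wants to try these tables out, here is a minimal Python
sketch of the substitution step.  It is not the Ant task mentioned above;
the names pseudo_translate, BMP_ONLY and BMP_SMP are made up for
illustration.  It only maps lowercase ASCII letters and makes no attempt
to protect format specifiers such as %s, which a real tool would need to
leave alone.

```python
import string

ASCII = string.ascii_lowercase
# Lookalike tables copied from the message above (26 characters each).
BMP_ONLY = "åЬçđéϝցⱨîﺩⱪŀოňøÞᕴяšŧմⱱשẋŷż"
BMP_SMP = "åЬçđ𝖾ϝցⱨî𝚓ⱪŀოňøÞᕴяšŧմⱱשẋŷż"


def pseudo_translate(text, table=BMP_SMP):
    """Replace each lowercase ASCII letter with its lookalike from `table`.

    Hypothetical helper: uppercase letters, digits and punctuation are
    passed through unchanged.
    """
    # str.maketrans requires both strings to have the same length.
    return text.translate(str.maketrans(ASCII, table))


if __name__ == "__main__":
    print(pseudo_translate("pseudo-locale test", BMP_ONLY))
    print(pseudo_translate("pseudo-locale test", BMP_SMP))
```

Using the BMP+SMP table exercises code paths that the BMP-only table
never reaches, so it is probably worth running both.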
You could also try googling for "LATIN SMALL LETTER {A,B,C,...} WITH",
which should turn up all sorts of modified Latin characters, such as
LATIN SMALL LETTER V WITH RIGHT HOOK.
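
The same kind of search can be done offline against the Unicode character
database bundled with Python.  This is only a sketch: latin_variants is a
hypothetical helper name, and the results depend on which Unicode version
your Python build ships with.

```python
import sys
import unicodedata


def latin_variants(letter):
    """Return (character, name) pairs for modified forms of `letter`.

    Scans every code point and keeps those whose official name starts
    with "LATIN SMALL LETTER <letter> WITH ".
    """
    prefix = f"LATIN SMALL LETTER {letter.upper()} WITH "
    found = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points (including surrogates) yield "" here.
        name = unicodedata.name(chr(cp), "")
        if name.startswith(prefix):
            found.append((chr(cp), name))
    return found


if __name__ == "__main__":
    for char, name in latin_variants("v"):
        print(f"U+{ord(char):04X} {char} {name}")
```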

Another option is Wikipedia's list of Unicode characters:
http://en.wikipedia.org/wiki/List_of_Unicode_characters
It has several sections for extended Latin scripts, and the Unicode
mapping tables at the bottom are handy if you want to go directly to a
certain Unicode range (e.g. to get away from the BMP).

> The simple example phrase you provided hit a bug in Moodle (PHP
> webapp) straight away - I think a few webapps have trouble with that
> funny 'e' (U+1D5BE). Interestingly, it's also present in Jira
> (Java-based webapp). Might be an iconv issue.

I chose that 'e' specifically because it wasn't part of the BMP, but
apparently the mathematical alphanumeric symbols are a bit of a special
case - I'm not sure if systems are expected to provide font substitution
for them.

Zimbra (written in Java) had trouble with the 'e' too - it just removed
it entirely.  I think a lot of programs have trouble with characters
that don't fit into 16 bits, i.e. anything outside the BMP.  My text
editors and Thunderbird can show the 'e' character, but the cursor
handling is all wrong on those lines.
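
To make the 16-bit point concrete, here is a small Python sketch showing
how U+1D5BE is encoded.  The helper utf16_units is a made-up name, and
this only illustrates one plausible source of trouble; I have not
confirmed that it is what actually goes wrong in any of the programs
mentioned.

```python
import unicodedata

ch = "\U0001D5BE"  # the 'e' used in the BMP+SMP table


def utf16_units(s):
    """Return the UTF-16 code units of `s` as a list of integers."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    print(unicodedata.name(ch))               # official character name
    print(len(ch))                            # code points in the string
    print([hex(u) for u in utf16_units(ch)])  # surrogate pair in UTF-16
    print(ch.encode("utf-8").hex())           # four bytes in UTF-8
```

The character is one code point but two UTF-16 code units (a surrogate
pair) and four UTF-8 bytes, so anything that counts, truncates or moves
the cursor per 16-bit unit, or assumes at most three UTF-8 bytes per
character, can mangle it.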


-- 
Sean Flanigan

Senior Software Engineer
Engineering - Internationalisation
Red Hat