Ding-Yi Chen wrote:
> The pseudo locale is intriguing, and I assume it helps to some degree.
> However, this approach does have its own limitations:

Of course, pseudo-localisation testing is not the same as localisation
testing in every Fedora language, but it's something!

> 1. Lack of font support: as the attachment "lack_of_font.png" shows, the
> pseudo locale might be rendered useless if all developers can see are
> unicode boxes. :-P

That tells me that the developers should install better fonts; how else
can they test an internationalised application? But to be honest, I
probably shouldn't have used
http://en.wikipedia.org/wiki/Mathematical_alphanumeric_symbols since
they're only guaranteed to be available in certain mathematical fonts
such as Code2001. I really need to find some latinesque characters that
come from neither the BMP nor the maths section! Apparently Zimbra
loses (without trace) the 'e' characters in my pseudotranslation. Bad
Zimbra!

As long as it's only a couple of characters, I think having some unusual
characters is okay, since you can still work out what's going on, at
least enough to resolve the problem by installing more fonts.

> Perhaps we should specify the minimal font set as
> remedy.

Before running pseudo-localised apps, you mean? Good idea.

I found a webapp that gives the names of Unicode characters:
<http://rishida.net/scripts/uniview/uniview.php>. Just paste text into
the "cut & paste" field and hit enter. But how can I find the name of
the font which provides a given character? I can tell you that all my
pseudo-characters are readable on my computer, but I can't tell you
which fonts they come from.

Once I work out what fonts my pseudo-locale requires, I'd be happy to
share the info as a dependency list. Perhaps it would make sense to
define a small Fedora package which specified certain Unicode fonts as
dependencies, as well as enabling the hypothetical pseudo-locale support
in glibc.

> 2. It doesn't really solve the language specific problem. Take Chinese
> characters sorting for example, they can be sorted by
> Pinyin, Zhuyin, radical, number of strokes, and "natural" order such as
> numerical characters. The sorting is impossible to verify without the
> knowledge.

True, but a pseudo-locale which uses reverse sorting can at least show
up whether an app is using internationalised sorting, or plain old ASCII
ordering. And we're not limited to what Microsoft did - I don't know
much about Chinese character sorting, but we could probably come up with
a couple of alternative sorts that could be understood by an
English-speaking developer. But I don't want to tackle that just yet!

> Still, the idea itself is good. And surely it filters out some of the
> bugs without the help of translators.

I expect a lot of i18n/L10n bugs are not picked up until someone tests
one of the affected languages. Some of those bugs could show up in a
pseudo-locale much earlier, which has to be an improvement. For
instance, I've already found bugs where Eclipse and joe mess up the
cursor position when editing SMP characters, without personally knowing
any SMP languages.

As an English-only developer I think it's also pretty cool to see
whether my code is at least partly internationalised, which otherwise I
can't see for myself at all, except in a foreign language. I think some
English-only developers might take more interest in i18n issues if they
could easily see the results for themselves.

And for those i18n issues which can be demonstrated with a
pseudo-locale, it can be easier for multiple developers to talk about
something which is in "English", since most developers speak English,
even if they have differing native languages.

> Since the main purpose of pseudo locale is for testing, shall we agree
> on a list of pseudo locales which have their own specified behaviour?

I think it would be good if we could fit in with Vista's chosen
pseudo-locale IDs, as listed here:
http://blogs.msdn.com/shawnste/archive/2006/06/27/647915.aspx

As I said, we certainly don't have to emulate MS completely, but I think
we should use qps for the language code. See
http://blogs.msdn.com/michkap/archive/2007/02/04/1596987.aspx

As for the behaviours, I expect that they will change as we learn more
from testing feedback, but here are some ideas:

a. simple character substitution, rendered text to be about the same
   size
b. character substitution with expansion (eg "[--- original text ---]")
   to make strings longer
c. maybe swapping upper and lower case.

Sometimes it's useful to have more than one pseudo-locale, eg to make
sure a web client is not seeing the server's locale, so having spare
locales might be handy. And we could have options like different sort
orders. But I'd be happy to start with (a) or (b) and leave sort orders
until a bit later. At least with (a) and (b) it's easy to see whether
someone forgot to call gettext(), because the plain English strings will
stick out.

-- 
Sean Flanigan

Senior Software Engineer
Engineering - Internationalisation
Red Hat
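
The behaviours (a) to (c) listed above can be sketched in a few lines of
Python. This is only an illustration, not an existing glibc or Fedora
implementation: the function names (substitute, expand, swap_case), the
substitution table and the default expansion factor are all assumptions
made up for the example, and the table uses accented BMP letters rather
than the characters from the original pseudotranslation.

```python
# Hypothetical substitution table (an assumption for this sketch):
# a c e i n o u y  ->  à ç é î ñ ö û ý, plus the upper-case equivalents.
SUBSTITUTIONS = str.maketrans(
    "aceinouyACEINOUY",
    "\u00e0\u00e7\u00e9\u00ee\u00f1\u00f6\u00fb\u00fd"
    "\u00c0\u00c7\u00c9\u00ce\u00d1\u00d6\u00db\u00dd",
)


def substitute(text):
    """(a) Swap letters for look-alikes; the string length is unchanged."""
    return text.translate(SUBSTITUTIONS)


def expand(text, factor=0.4):
    """(b) Substitute, then pad and bracket the text to make it longer.

    `factor` is the approximate growth as a fraction of the original
    length; 0.4 is an arbitrary illustrative default.
    """
    padding = "-" * max(1, int(len(text) * factor / 2))
    return "[" + padding + " " + substitute(text) + " " + padding + "]"


def swap_case(text):
    """(c) Swap upper and lower case."""
    return text.swapcase()


if __name__ == "__main__":
    for message in ("Open", "Save file as", "Quit"):
        print(substitute(message), expand(message), swap_case(message))
```

A string that appears unchanged in the output of an application run this
way has probably not gone through gettext(). Note that this sketch also
rewrites letters inside format specifiers such as %i or %u, so a real
tool would have to leave those untouched.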
--
Fedora-i18n-list mailing list
Fedora-i18n-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-i18n-list