Thank you for sharing your experience and your path toward better local-language support on FLOSS OSs. I do appreciate the diversity and complexity of the text layout problems for most existing languages. And nothing would have happened if no one had endured the pain and devoted their efforts to improving the situation.
As I said in the first email, I am new to package maintenance and to communicating with upstream developers. I may have underestimated the problem and fired in the wrong direction. If that is the case, please forgive me.
Going back to the digit font change issue we discussed earlier, I spent some time over the past few days trying to get a clearer picture of it. I dug out some bug reports from various bugzillas (Mozilla, Red Hat, GNOME) and gathered a list of similar reports (see the bottom of this email). These reports were filed by simplified and traditional Chinese users and by Japanese users (I believe Korean users experience the same problem). So, one thing that can be said from this list is that the "contextual font selection" does seem to bother CJK users in text formatting.
I understand that "contextual shaping" is one of the techniques for rendering complex scripts. I am not sure how tight the connection is between "contextual shaping" and "contextual format propagation", but one thing that may shed some light on the complaints of CJK users is that Chinese (and maybe Japanese as well) scripts are not context-sensitive. Chinese characters are relatively independent and self-consistent in shape (this is not true for Chinese calligraphy, where strokes may connect between characters depending on the layout direction, but current OSs and font technologies are not ready to handle this IMO). The only complexity may come from the fact that Hanzi for printing are mostly equal-width, and the punctuation among the Hanzi is expected to match the width of the surrounding Hanzi. Since the full-width punctuation marks are encoded separately in Unicode, and input methods provide contextual punctuation support, this seems to be handled very well. So, in short, for Chinese text layout, users generally do not expect to see context-based changes, either in encoding/glyphs or in font faces (some extreme cases aside).
Now, going back to pango: from what I read in the bug reports, pango uses PANGO_SCRIPT_COMMON to represent language-independent symbols. I have no complaint about that. It is a good classification based on the semantics of the symbols. What I, and most CJK users, are not satisfied with is the context-sensitivity of those common scripts when formatting text under CJK locales. I know that you have advocated sticking with the "face" meaning of SCRIPT_COMMON, which is supposed to be rendered by the local language. But IMO, the face meaning is misleading here. From a Chinese user's perspective, the difference between SCRIPT_COMMON and Latin is negligible compared with its difference from CJK characters. Therefore, using CJK fonts to render SCRIPT_COMMON is quite odd. Using Latin fonts for COMMON is most preferred; even specifying no face (i.e. using the system fall-back) is better than assigning Chinese fonts to these scripts, because most Chinese fonts, even the commercial ones, have low-quality Latin/common glyphs.
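To make the complaint concrete, here is a toy sketch of the two behaviors I am comparing. This is only my simplified model, not pango's real itemization code: the script classification is deliberately coarse, and the names base_script, resolve, and digit_scripts are hypothetical helpers I made up for illustration. With the "context" policy, common characters (digits, spaces, punctuation) inherit the script of the preceding text, so digits typed after Hanzi end up with the CJK font; with the "latin" policy they always go with the Latin font.

```python
# Toy model only: this is NOT pango's actual itemization logic.

def base_script(ch: str) -> str:
    """Very coarse script classification (hypothetical helper)."""
    cp = ord(ch)
    if 0x4E00 <= cp <= 0x9FFF:          # CJK Unified Ideographs block
        return "Han"
    if ch.isascii() and ch.isalpha():
        return "Latin"
    return "Common"                      # digits, spaces, punctuation, ...


def resolve(text: str, policy: str) -> list[tuple[str, str]]:
    """Assign a script to every character of `text`.

    policy == "context": Common characters inherit the script of the
                         preceding non-Common character.
    policy == "latin":   Common characters are always treated as Latin.
    """
    resolved = []
    previous = "Latin"                   # assumed default at start of text
    for ch in text:
        script = base_script(ch)
        if script == "Common":
            script = previous if policy == "context" else "Latin"
        else:
            previous = script
        resolved.append((ch, script))
    return resolved


def digit_scripts(text: str, policy: str) -> list[str]:
    """Return the resolved script of each digit in `text`."""
    return [script for ch, script in resolve(text, policy) if ch.isdigit()]


sample = "abc 12 中文 34"
print(digit_scripts(sample, "context"))  # digits follow the preceding script
print(digit_scripts(sample, "latin"))    # digits always resolve to Latin
```

In this model the same digits get a different script (and therefore a different font) depending only on what was typed before them, which is the behavior the reports below describe.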
As you can see from the bug list, this problem has existed for many years, and I am pretty sure that it will come back again and again as long as the expected rendering is not achieved. If the current pango formatting logic is not sufficient to handle the CJK preferences described above, I think refining the logic to take them into consideration is better than sticking with a fixed but incomplete logic.
Please let me know your thoughts and reasoning on whether this is feasible and, if so, where to start.
Thank you for paying attention to this issue.
Qianqian
===============================================================
Bug 321113 - Wrong glyph subsituation algorithm for digital characters and punctuations
http://bugzilla.gnome.org/show_bug.cgi?id=321113
Bug 345072 - changes font when typing different scripts on the same line
http://bugzilla.gnome.org/show_bug.cgi?id=345072
Bug 345386 - Language and direction propagation in and between PangoLayouts
http://bugzilla.gnome.org/show_bug.cgi?id=345386 (opened by yourself)
https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=103679
Bug 481210 - [All lang] [firefox] - Face of the number is changing when enter number + Char, in any Locale
http://bugzilla.gnome.org/show_bug.cgi?id=481210
Bug 481188 - ascii text space too narrow for Chinese encodings
http://bugzilla.gnome.org/show_bug.cgi?id=481188
Bugzilla Bug 129541: changes font when typing different scripts on the same line
https://bugzilla.redhat.com/show_bug.cgi?id=129541
Bugzilla Bug 131218: [RHEL4] Characters get truncated in new pango
https://bugzilla.redhat.com/show_bug.cgi?id=131218
Bugzilla Bug 149991: [CJK pango] digits and punctuation in textbox give bad eol rendering and cursor placement
https://bugzilla.redhat.com/show_bug.cgi?id=149991 (filed by Jens Petersen)
https://bugzilla.redhat.com/show_bug.cgi?id=220885 (broken link)
Bugzilla Bug 228804: [All lang] [firefox] - Face of the number is changing when enter number + Char, in any Locale
https://bugzilla.redhat.com/show_bug.cgi?id=228804
Bugzilla Bug 221361: [pango] ascii text space and punctuation is narrow for CJK
https://bugzilla.redhat.com/show_bug.cgi?id=221361
Bug 379125 - chinese punctuations after english letters are wrongly displayed
https://bugzilla.mozilla.org/show_bug.cgi?id=379125
https://bugzilla.mozilla.org/attachment.cgi?id=263185
===============================================================
On Dec 7, 2007 2:41 AM, Behdad Esfahbod <behdad@xxxxxxxxxx> wrote:
Hi Qianqian,
[/me tries to write a motivational mail]
It's easy to assume that one's problems are harder than others'. In
this case, Chinese for example is a far easier script to support than
Middle-Eastern scripts and definitely far easier than Indic scripts. Or
in Iran, my native country, less than half of Iranians know enough
English to be able to communicate at all, let alone preferring it...
When I started working on Persian support in software back in 1999, it
was a disaster. IE5 had just come out and had support for Unicode, but
had a serious bug with the letter Persian Yeh that made it almost
unusable for Persian. The community started using Arabic Yeh instead,
and many individuals and companies produced fonts that had the shape of
Persian Yeh in their Arabic Yeh glyph position. That's not the only
problem that needed to be worked around.
In the meantime, some of us started the FarsiWeb Project to
systematically work on properly fixing Persian support in software. We
soon got attracted to Free Software as there was not much we could do
about proprietary ones other than reporting the bug (that particular IE
bug took more than 4 years to fix...). Persian support in Free
Software was even worse. Both KDE and GNOME had just added support for
Arabic, but no Persian-specific feature was working. And there were no
suitable fonts. No keyboard layout either. No translations whatsoever.
Lots and lots of bugs in right-to-left UIs. The list goes on and on...
While trying to learn the culture of upstream in FarsiWeb, we learned
about similar projects in other countries that shared a bunch of those
problems with us, namely, Arabeyes from all over the Arab countries and
Ivrix from Israel. We worked on a lot of projects and patches together,
with the main goal of *fixing upstream*. To make this mail short, fast
forward a few years later and I now maintain Pango and HarfBuzz,
co-maintain cairo, hack on Gtk+, Fontconfig, and Mozilla/Firefox, and the
Linux desktop has the best Persian support among all modern operating
systems. We've come a long way, and there's still a lot left to go...
Sorry if this was too personal and historical; I thought it might resonate
with your feelings.
Regards,
--
behdad
http://behdad.org/
...very few phenomena can pull someone out of Deep Hack Mode, with two
noted exceptions: being struck by lightning, or worse, your *computer*
being struck by lightning. -- Matt Welsh
_______________________________________________ Fedora-fonts-list mailing list Fedora-fonts-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/fedora-fonts-list