Re: IANA, Unicode, and the multilingual Internet

Dear Martin,
Thank you for your comment. It makes plain that we belong to two different worlds. My concern is the interoperability of these two worlds. My problem is your difficulty in realising that your world is not the only one on earth. Let me go through your mail to try to understand why.

At 11:17 24/07/2006, Martin Duerst wrote:
At 04:05 06/07/23, JFC Morfin wrote:

>4.3. IANA registries. .... In the case of IANA registries there is no market alternative [we saw that in the alt-root case]. The control of an IANA registry can therefore be strategic. Until now the IANA had three main areas: numbers, names, protocol parameters. The numbers/names are pure Internet issues but were considered sensitive enough to be delegated to ICANN. The new area of languages

This is not a new area. IANA has managed a language tag registry
since around 1995 (see RFC 1766). But it is important to note that
IANA just registers language tags (or since recently, language
subtags), not languages.

This is both true and untrue. The new language subtag and extension registries have full autonomy in their area, while the former langtag registry was not much used (72 entries in 10 years). The capacity of the new registry is large (440 languages, 100 scripts, 250 country codes, which can be combined to build thousands of langtags). The capacity of extensions is limitless. Technology will realistically support sociolects and idiolects, meaning billions of tags.

It is also correct to say that the IANA just registers language subtags and not languages, that is, the labels used to build the designation of a language. This means that if you cannot tag a language (name it in the RFC 3066 Bis format), that language just does not exist in a digital system using that tagging. This may not matter in an obscure local application, but it is not the same for the whole Internet.
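
As a minimal sketch of how subtags are combined into a tag, the following Python fragment joins a language subtag with an optional script and region subtag, and computes an upper bound on the number of such combinations from the registry figures quoted above. The function names are hypothetical, the casing follows the usual convention (lowercase language, title-case script, uppercase region) although tags are case-insensitive, and variants, extensions and private-use subtags are left out. The count is purely combinatorial: it says nothing about how many combinations are linguistically meaningful.

```python
from typing import Optional


def compose_tag(language: str,
                script: Optional[str] = None,
                region: Optional[str] = None) -> str:
    """Join subtags in the order language-script-region (hypothetical helper)."""
    parts = [language.lower()]
    if script:
        parts.append(script.title())   # e.g. "hant" -> "Hant"
    if region:
        parts.append(region.upper())   # e.g. "tw" -> "TW"
    return "-".join(parts)


def count_combinations(n_languages: int, n_scripts: int, n_regions: int) -> int:
    """Upper bound: script and region are each optional, hence the +1."""
    return n_languages * (n_scripts + 1) * (n_regions + 1)


if __name__ == "__main__":
    print(compose_tag("zh", "hant", "tw"))
    # Figures taken from the paragraph above: 440 languages, 100 scripts, 250 country codes.
    print(count_combinations(440, 100, 250))
```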

>is not an Internet issue,

RFC 1766, RFC 3066, as well as its approved successors
(draft-ietf-ltru-registry, draft-ietf-ltru-initial and
draft-ietf-ltru-matching) only deal with language tags
on the Internet. It is difficult to understand how language
tagging on the Internet would not be an Internet issue.

Domain Names are an Internet issue. IP addresses also are. Their concept originates in the Internet. Language concepts do not originate in Internet technology. Language Tags permit Internet technology to interface with the Language reality. The importance of the Internet in the world's life makes a conflict between the Language reality and the Internet Language support a major political, societal and economic conflict.

The point is to know who is the master and who is the slave: the man or the machine, the IETF or the people. Should the RFCs adapt to users, or users to the RFCs? In the background is the fact that if people are to adapt to RFCs, they will have to adapt to the concepts and interests of those who wrote them. RFC 3935 answers that point: people are to be influenced by the IETF in the way they design, use and manage the Internet, for the Internet to work better.

I accept that in a technical vision of the world you can think this is a good thing. I acknowledge that you may want to develop it. However, I cannot support you there: I have a significantly different (quasi-universal) vision. I serve the users rather than influencing them. Your vision is exclusive and wants to exclude mine (we saw it). Mine is to support everyone, including you. This is why I needed you to clearly define the way your system is to work.

>is far more important and sensitive than names and numbers,

I wouldn't be co-chair of the LTRU WG if I didn't believe
that language tagging is important, but there are far more
important issues (it's e.g. easy to show that 'charset'
tagging is much more important than language tagging,
because the consequences of failures are much greater).

I am afraid you are trapped by your own conceptions and strategy. Language tagging and charsets are technical concepts. Reality is made of languages, graphemes, phonemes, etc., and of people, cultures, history, countries, etc. What you discuss here relates to the limitations of your concepts. You are just saying that Language Tags (which are the IETF interface to languages) should consider charsets before scripts.

You may remember that you opposed this when I explained it.

We both know the reason why Unicode chose ISO 15924 and scripts rather than keyboards and charsets. And that reason is not technical. I do not share that reason. I do not use ISO 15924 except for what it is: a list of tags you can use to qualify charsets.

Also, I agree that language tagging occasionally can be
a sensitive issue (a look at the ietf-languages@xxxxxxxx
mailing list would definitely give that impression), but
by and large, most language tags are used in practice
without any problems.

This is a ... premature affirmation. Up to now there were 72 IANA language tags and a loose usage of ISO 639-1/2.

What is at stake is the e-cataloguing (with the impact resulting from the importance of the Internet in the world's life) of what is the most sensitive human resource. It affects human identity, cultural development and national sovereignty, and it is increasingly identified as probably the most important economic resource and means of influence of the future.

We saw it with the denial of "en-eu", which most probably decided the split between the IETF Internationalized Internet and the IGF Multilingual Internet. I am afraid that the practice is not "without any problems"; it is just that you do not seem to see the problems you create.

>and is de facto [this is what I object] delegated to UNICODE.

It's difficult to object to something that isn't the case.
The language subtag registry is de facto delegated to ISO
(for language codes, country codes, and script codes) because
the IANA registry (except for blunders by ISO that we hope
they won't make anymore) just reflects the relevant ISO
standards.

Dear Martin, we both know that the "blunders by ISO" are when ISO and Unicode disagreed and that the "we" you use are the members of Unicode. http://www.unicode.org/consortium/memblogo.html - http://www.unicode.org/consortium/directors.html .

Of the above three kinds of codes, language codes
are obviously the most important (no language tag without a
language code), and script codes are the least important
(most language tags don't need a script code). The Unicode
consortium is designated as the registration authority
for script codes. But this doesn't mean that they can assign
new script codes at will; ISO 15924 (see e.g.
http://www.unicode.org/iso15924/standard/) describes that
new codes need at least four positive votes from the six
voting members of the Joint Advisory Committee. Only one of
these members is from the Registration Authority (Unicode),
all the others are from other, ISO-related, organizations.

The problem is that you probably co-wrote the Charter of the WG-LTRU and did not ask your WG to read it before starting from a clean sheet. We both know that what Unicode brings to the IETF is its globalization vision, doctrine, and possible practical management.

I do not object to the IETF delegating its language doctrine, strategy, and eventual control to Unicode and to whoever may be a leader in the Unicode consortium. I object to this being done through a distortion of the RFC 3066 Bis rules and not through a formal MoU, as in the ICANN case for numbers and names. It should be made clear whether the IETF follows the Unicode vision or is not limited to the Unicode globalization. My appeals to the IESG and then to the IAB are just to obtain that answer, so that we know how to finalise http://bcp47.org.

>The IETF is obviously not prepared to this kind of fundamental conflict.

In order to talk about whether the IETF is prepared for a certain
kind of conflict, we first would need to know what kind of conflict
this is. But I can't find any fundamental conflict in the paragraph
above.

My post was about conflict resolution, not about this conflict in particular. This conflict is the effort under way for the control of the IANA. It is mainly perceived in the USA as a "tuning" between ICANN and industry stakeholders, where the IANA "new registries" can be a trump card for some. For the rest of the world, it is going to be a fundamental architectural issue between a centralised and a distributed approach to the Internet metadata.

The main issue is over IRI-tags, ISO 11179, and the need for the IETF to have a multilingual distributed referential/registry system doctrine and solution. This is a global architectural evolution.

>5. IETF strategy. There are cases where a possible solution is a significant change of the IETF, or even to kill the IETF itself. The conflict I am engaged in is certainly of that nature. RFC 3935 gives "IETF leaders" the capacity to address such situations, except when the opposed option is defended by one or several IETF leaders. We should not consider that such conflicts are exceptional: the lack of architectural guidance by the IAB raises several other issues. After the Multilingual Internet, what about the multilateral, the multitechnology, etc. support?

There are two ways to understand "Multilingual Internet" above.
One is that the Internet is already for the most part multilingual:
There are Web pages in a large number of languages, emails are sent
around daily in a similar number of languages, and so on, and some
of the remaining issues, mostly in the area of identifiers, are either
on the verge of being fully deployed (IDN) or at least work has started
(internationalized email addresses).

It is surprising to see such globalization patches described as multilingual. IDNs are internationalized domain names; they are certainly neither multilingual nor on the verge of being fully deployed (unless you are talking about the Chinese Names?). I would be very interested to know where an internationalized (not multilingual) LHS is presently under discussion.

The other way to understand "Multilingual Internet" is that the
"Multilingual Internet" is something completely different from what
we have now, much more multilingual for the end user, or whatever.
But while we have heard much buzzwording about that, we haven't
seen any of that in any actual kind, shape, or form, nor have we
actually been told what it's going to look like, or how it's going
to be better than what we have now (see previous paragraph).
So it's vaporware even by the standards of vaporware.

Multilingual Internet is a demand, therefore a specification. It is up to you to document the way you intend to support a multilingual IETF Internet architecture. It is true that, until now, the pretence that IETF internationalization fits the job and the IAB's refusal to use the word multilingual are vaporware.

I am one of those working on a Multilingual Internet architecture. I am not alone :-) . But I am certainly the only one who thinks that it can only be an evolution in usage, and who therefore cares about immediate interoperability. You just spent two years refusing me that interoperability. Actually, I think that this multitechnology, multilateral, multinational, etc. interoperability is most of the Multilingual Internet.

A similar analysis can be made for "multilateral" and "multitechnology"
above. Of course the Internet is "multilateral", it allows multiple
parties to communicate with each other. Of course it is "multitechnology",
on many levels (from the physical and link layer up to the applications
layer).

I feel you have given the clue. You say "of course the Internet is ..." instead of "how can we help people to make their Internet ...".
Take care.
jfc




_______________________________________________

Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf
