Dear all,
at this stage I think it is clear that the langtags issue represents
a strong opposition between two visions of the Multilingual Internet.
For better or worse, these visions are embodied by Peter Constable's
friends and by me.
There is an affinity group gathered by circumstances or by talent to
support Peter's approach. Its kernel happens to be formed by native
English speakers employed by large corporations or interests
(from history it seems it formed in the course of international
meetings). A few Members are included through personal dedication or
as consultants. There are no academic researchers, no publicly funded
contributing projects, no sponsoring cultural organisations. The
Members of this affinity group share a common culture. It is based
upon different levels of technical involvement of the structures and
individuals involved. There is no R&D involved in the network area
which is not sponsored by commercial interests, with the pros and
cons described in RFC 3869. In that sense it can be said it is a US
industry-led group. This is at least the way the non-US interests,
organisations and Government officials I have talked with identify
them, without exception. True or not, this is the perception. It is
to be related to the definition of an IETF affinity group by RFC 3774.
This group proposes a tagging of all the languages of the world, which
it perceives as a commodity (a well known trait of native English
speakers, who share their own language with other people round
the world). This way certainly suits e-commerce, basic
interoperability and library classification of foreign books. The
idea is that a standard and a central registry will constrain the
world to follow a common useful rule, if it cannot continue using
ASCII English. This is named "internationalisation". This unilateral
standardisation is seen as the only guarantee of the stability and
uniqueness of the network. Being unique for the entire world, this
tagging must be simple and based upon simple information. This
information is made of three elements that commerce needs for
practical reasons: the written language, the script being used, the
applicable law.
This vision A addresses specific urgent needs of the printing and
library industries to reduce costs in the face of competition from
other media and of the printing capacity of every user (a problem less
documented than, but as important as, the music industry's problem),
with a larger financial turnover. Worldwide concentration and
specialisation can be expected from a unique normative system, with
all the reluctance one can expect and the strategies one may imagine.
There is a web of relations I have woven among people engaged in
network research, operations management, cultural life, government
administration, international entities, language-oriented interests
and activities, and local industry, from various parts of the world,
in particular through an Internet test-bed named dot-root (responding
to the ICANN ICP-3 call), a long involvement in @large and ccTLDs,
and a national Internet community and government think tank I
started one year ago, which is developing unexpectedly. The strength
of this relational group is that no money is involved, which
guarantees its independence. But this is also its weakness, as it
leaves it no alternative but to rely on volunteers to represent it
(often only one when the task is as demanding as this one), or to call
on the personal involvement of concerned people, with the risk of
overwhelming the Internet standard process with scores of irritated
newcomers. The common culture of this group is common sense support
for a user-centric multilingual architecture and strong
sustainable innovation.
This group sees no need to tag the languages but the need to document
relations, which - among other things - use languages, but also many
other parameters. It thinks that every human being, machine and
service is specific and different from the others, and that surety,
security, stability and innovation capacity are based upon the best
seamless support of these differences for a strong unity of the
network. Its experience is that the generalisation of computing and
pervasive networking support a realistic, commercially rewarding and
humanly exciting set of possibilities. This concerns relations,
culture, economy, and social and political development that everyone,
every economy, every country may share in, on an equal opportunity
basis. It also sees a global convergence of R&D, civil society,
economic and political spheres in that direction (for example at WSIS,
but also at the IETF), expressed in various ways, one being the
conceptual networking of information (ISO 11179 R&D) and another a
fluid referencing system (URI tags), which give new possibilities,
especially when added to physical and services networking.
This vision B calls for an open description system/language of
languages, and of many other relational parameters. Obviously it is
still in its infancy, as everything started in the early 80s has been
delayed by the subsequent OSI and then Internet visions, and by
hardware and bandwidth limitations and costs. It is only resuming now.
Vision A has difficulty (and a lack of competence, which Peter helped
document yesterday) in understanding vision B. And as usual in such
cases it fights the messenger. No big deal: the messenger is used to it.
Vision B has no problem in accepting vision A as a "default" for
those wanting it. However vision A is centralised and vision B is
distributed. So, vision A thinks it needs to be unique to exist and
fulfil its purpose. This is why Vision B proposed several things:
- to define an exclusive area of application for Vision A. This was
done from the second Last Call on, by proposing that the authors add
wording stating that the area of application was the areas already
covered by RFC 3066 and documented further on.
- to protect Vision A from confusion. This was done by pushing the
authors towards a very strict ABNF avoiding tag-creep.
- there may be other propositions to study. These are however not easy
to uncover, as Vision A has difficulty with the architectural
evolutions (network, content, relational elements) all this
technically implies.
As I explained, there are three scenarios:
1. Vision A is denied by the IESG. Progressively vision B imposes
itself through new RFCs or from a grassroots (international) process.
The current basic needs are not properly addressed. The credibility
of the IETF is at stake, as with spam, IDNA, etc. This causes delays.
2. Vision B is denied by the IESG. But vision B is already accepted
through the URI-tags RFC. It will develop in opposition to Vision A.
This will cost everyone money and delays; the Multilingual Internet
will move outside of the IETF or balkanise.
3. Vision B is included in Vision A as a community private use. This
scheme is simple to understand and to include in the RFC 3066 Bis
document in two lines. It does not break any of its principles.
- the document is unchanged and addresses the general need,
whatever it may be.
- "x-" is unchanged. Its role is to support private use schemes,
within private spaces.
- "0-" is added from the reserved singleton pool. Its role is to
support community private use schemes, that is, cases where a user
community wants to document languages in its own way. The need is to
support, in a non-conflicting way, two pieces of information:
- the community scheme identification
- the identification within that scheme.
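
As an illustration only, the two pieces of information could be read
out of a tag as in the following Python sketch. The layout
"0-<scheme>-<identifier>" is an assumption made for this example and
is not specified anywhere; the function name parse_community_tag is
hypothetical. Only the rule that each subtag is 1 to 8 alphanumeric
characters is taken from RFC 3066.

```python
import re

# Each subtag is 1 to 8 alphanumeric characters, as in RFC 3066.
SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def parse_community_tag(tag):
    """Split an assumed '0-<scheme>-<identifier>' tag into its two parts.

    Returns (scheme, identifier), or None if the tag does not have
    the assumed community private use shape.
    """
    parts = tag.split("-")
    # Need the "0" singleton, one scheme subtag, and at least one
    # identifier subtag.
    if len(parts) < 3 or parts[0] != "0":
        return None
    if not all(SUBTAG.fullmatch(p) for p in parts[1:]):
        return None
    scheme = parts[1].lower()
    identifier = "-".join(parts[2:]).lower()
    return scheme, identifier


if __name__ == "__main__":
    # "myscheme" is a made-up community scheme name.
    print(parse_community_tag("0-myscheme-fr-paris"))
    print(parse_community_tag("x-whatever-private"))
```

With this layout a tag under "x-" is left alone, and everything after
the scheme subtag is interpreted by the community that owns the scheme.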
I think this respects all the requirements of Vision A and permits a
full development of Vision B. There are two possibilities to support
the "0-" space: either to develop a new system or to use an existing system.
I have no particular opinion except that the solution MUST be
decentralised (community centralised). I started out thinking we had
to develop a new one, waiting for the review of the WG-ltru charter
both to make sure the proposition would fully respect Vision A and to
learn the Vision B points we would have overlooked (there probably are
many). This created problems for the WG, which only wanted to block
Vision B, which it still does not understand or which it opposes.
Then we found the not yet numbered URI-tag RFC. It seems to address
all the needs, and more than the needs, except
multilingualisation. My intent is therefore to document an IRI-tag
along the URI-tag lines when this debate has stabilised and the
URI-tag RFC has been published. I have no problem working on it
within the WG-ltru.
What next? Vision A alone is harmful to all. If it were accepted
it would be appealed: to the IETF Chair for architectural common
sense; to the IESG for lack of compatibility with the Charter and other
RFCs; to the IAB if necessary to obtain guidance on the implementation
of the Multilingual Internet. Then appeals would continue in the
outside world. The target is not to oppose Vision A. It is, on the
contrary, to make sure it is viable. As the only solution permitted,
it will NOT survive because it is not able to withstand everything one
can expect people will do with it out of control. We had a very
similar case with IDNA. The only response to homograph phishing was
"we discussed it"....
I will document a few of these points in responding to Peter's last mail.
At 14:11 29/08/2005, Peter Constable wrote:
> From: Bruce Lilly <blilly@xxxxxxxxx>
> > This
> > is all what this proposition is about. This proposition is to give
> > _one_shot_ in a _standardised_ way the language, the script and the
> > country.
>
> This was discussed during Last Call of the previous non-IETF
> (individual submission) attempt. IIRC David Singer brought up several
> examples of other pieces of information (e.g. legal/copyright
> variations) that could also be negotiated and which might affect the
> presentation of content (or choice among alternative content). Lumping
> all of these separate items into one tag is a poor design as it
> impedes negotiation and tends toward lengthy tags which are
> incompatible with fixed-length mechanisms such as MIME encoded-words.
I agree that it would be poor design to incorporate other pieces of
information such as legal/copyright variations into language tags, but
as such pieces of information are not supported by the draft, this
appears to be irrelevant.
This is inexact. There is no problem in having the Draft compliant tag:
fr-Latn-fr-gayssot
to indicate a French language text fully respecting the "Loi
Gayssot", the anti-racist law used against Yahoo. There is no
guarantee that an ISP or French law will not filter out pages from
suspected sites not bearing that tag, transferring the Host's legal
responsibilities to the Author.
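
To show why such a tag passes a purely syntactic check, here is a
minimal Python sketch. It assumes a simplified reading of the draft
grammar (language, optional script, optional region, then variants)
and ignores extensions, private use and grandfathered tags. The name
split_langtag is hypothetical, and the sketch only looks at the shape
of the subtags: it does not consult any registry, which a full
validity check would also need to do.

```python
import re

# Simplified shape: language[-script][-region][-variant...]
LANGTAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)"
)


def split_langtag(tag):
    """Return the subtags of a well-formed tag as a dict, else None."""
    m = LANGTAG.fullmatch(tag)
    if m is None:
        return None
    script = m.group("script")
    region = m.group("region")
    variants = [v.lower() for v in m.group("variants").split("-") if v]
    return {
        "language": m.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": variants,
    }


if __name__ == "__main__":
    # "gayssot" has the shape of a variant subtag (5 to 8 characters).
    print(split_langtag("fr-Latn-fr-gayssot"))
```

Nothing in the shape of the tag tells a reader whether the last subtag
is a linguistic variant or a legal claim; only a registry entry, if
one exists, could say so.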
The problem in believing that one can rule the world is that the
world may not agree to be ruled.
We should rather focus on whether it is good design to incorporate
information related to linguistic and written-form attributes, as
supported in the draft, into a single tag. The consensus of the LTRU
working group is that it is.
Let me phrase it in a more exact way: the affinity group which formed
the WG has been gathered around that idea.
1. basic written mode attributes should not be specific in the
description of a language ... while in addition most languages are oral
2. in what manner is the country code related to specific
information? Nowhere in the Draft is this attribute documented: is it
the location where the text has been written, the location of the
language community of the author, or of the language community of the
reader? Where is that location definition documented so both sides
of the relation can understand each other when negotiating?
For instance, the use of separate tags for
language and script were considered and rejected
this has not been considered and rejected. This was a predefined
article of faith and every question on this has been dismissed.
The problem is that it is meaningless and conflicts with the charset!!!
Until you associate a "script" with a charset, a script has no meaning ....
I asked the simple question: "does fr-Latn-FR mean that Latn permits
me to properly write French?" To know that, I need to know which
characters are associated with "Latn". No response. Same question on
the Unicode list. Non-native French speakers said "yes" (but no
one was able to demonstrate it). Native French-speaking experts said
"no" and explained that Unicode lacks a particular space needed to
properly type typical French sentences, and one accented character.
This was then disputed. My problem, as a user and as a network
standardiser, is that I should not have to be concerned with these
details. I need certainties and guarantees the Draft does not provide.
on the basis that the two are not entirely orthogonal. Clear
examples of this was considered:
while the intent of
Accept-Language: ar, az-Cyrl, ru
is clear, the intent of
Accept-Language: ar, az, ru
Accept-Script: Cyrl
or of
Accept-Language: ar, az, ru
Accept-Script: Arab, Cyrl
is not clear, nor is it obvious how rules could be specified that would
make the intent clear, or that would permit expressing the preferences
reflected in the first instance.
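
One naive reading of the split-header case can be sketched in Python
as follows, assuming a server simply pairs every listed language with
every listed script. This pairing rule is an assumption made for
illustration; Accept-Script is the hypothetical header from the
example above, not an existing HTTP header, and combined_readings is
an illustrative name.

```python
from itertools import product


def combined_readings(languages, scripts):
    """Pair every language with every script (naive cross product)."""
    return [f"{lang}-{script}" for lang, script in product(languages, scripts)]


if __name__ == "__main__":
    # Single header: the script is attached to one language only.
    single_header = ["ar", "az-Cyrl", "ru"]
    print(single_header)

    # Split headers: under the naive rule, Cyrl applies to all three.
    print(combined_readings(["ar", "az", "ru"], ["Cyrl"]))

    # With two scripts the number of candidate readings doubles.
    print(combined_readings(["ar", "az", "ru"], ["Arab", "Cyrl"]))
```

Under this rule the split form also yields pairings such as ar-Cyrl
that the single header never asked for; other pairing rules are
possible and would give different results.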
This kind of example is absurd. There is no more information, and more
confusion, with the proposed system if a page or a part of a document
is also assigned different conflicting langtags ...
> Tagging identifies characteristics of a particular piece of content.
> For that purpose alone, it makes little difference (other than
> regarding the aforementioned compatibility issues with existing IETF
> mechanisms) whether the characteristics are lumped or separate.
On the contrary, it makes little difference only if the characteristics
in question are completely orthogonal. As pointed out above, the
characteristics of linguistic variety and written form are not
orthogonal, particularly when it comes to expressing user preferences,
and that it *does* make a difference if they are split into separate
metadata attributes or they are lumped together into a single metadata
attribute.
Explain.
I will go your way; however, you have not defined what a script is. The
author is a Russian, sitting in NY and writing a page in Ukrainian and
wanting the texts to be repeated in Latn and Cyrl scripts, so
everyone there is able to read it. A very common situation.
Please precisely document the langtags. And show what is not
orthogonal in them.
> While that may be used to infer something about the content
> provider, such inferences may be unreliable...
Quite so. This point was discussed in the WG.
The question is whether the solution is acceptable. This LC is the
LC of the document, not of the WG or of me.
> Negotiation of separate characteristics is much
> simpler than that of a combined conflation of characteristics; each
> characteristic can be assigned separate preference values, and
> irrelevant characteristics (e.g. script w.r.t. spoken language) can
> be easily ignored.
Negotiation of separate attributes involving inter-related
characteristics is *not* simpler, as pointed out above. The draft fully
allows for irrelevant characteristics (e.g. script wrt audio content) to
be ignored. Again, what has been provided in the draft is in accordance
with the charter of the WG.
The Charter speaks of languages. You made clear the Draft was language
oriented and not written language oriented. I am glad to learn that
the mode is an irrelevant characteristic.
Most languages are oral. Their rendering in a written form is
therefore important information ...
> As negotiation and related issues represent a critical technical issue
> for the design of language tags (viz. keeping separate characteristics
> out of *language* tags), it is essential that such negotiation issues
> be considered carefully before specifying the format of tags.
> Unfortunately, that has not been done, and considering the published
> WG milestones it appears that that issue has not been taken into
> consideration... However, it appears that the WG has not considered
> the issues, with the effect that the WG product lacks the "particular
> care" expected of BCP documents (RFC 2026).
It is unclear on what basis it is asserted that these issues have not
been considered by the WG. I believe most of the WG members would feel
that they have been reasonably taken into consideration.
I agree with that. But the question is where the related
decisions were taken. On that I would tend to fully agree with Bruce.
> Note that it is not the registration procedural issues that are
> typical of BCP documents that are problematic; rather it is the
> conflation of separate characteristics into a single tag syntax,
> specified in the same document, which raises problems related to
> content negotiation.
Bruce asserts (a) that there is conflation of separate characteristics,
and that (b) this creates problems in content negotiation. The WG
determined that the characteristics conflated into a single tag are not
independent, and that it would be *separation* into separate attributes
that would result in problems in content negotiation, not their
combination into a single attribute.
Governmental authority over content is not information orthogonal
to language in some parts of the world. The question is whether this
is to be addressed as a general or a specific issue.
> Another large part of the problem is WG management; in addition to the
> issues raised by John Klensin the last time that LTRU participation
> was discussed on the IETF discussion list -- and with which I
> wholeheartedly agree -- it appears that management of WG participant
> conduct has been rather lax; proponents of the individual submission
> effort who are participating in the WG tend to resort to ad-hominem
> attacks when a problem is identified or when an alternative approach
> is raised, with no visible intervention by the WG co-chairs. That has
> also (i.e. in addition to the factors which John identified) had the
> effect of limiting WG participation by individuals.
It's unclear what bearing this has on what improvements can be made to
the drafts in fulfillment of the WG charter. I believe several WG
participants felt that management of conduct was lax, particularly in
relation to a very small number of participants with a penchant for
certain behaviours that would have challenged the best of moderators.
I suffered most of that: various innuendos about my age, my need for
English teachers, contempt for my colleagues as "end users" vs.
"IETF members" and "developers", "physical allusions to my possible
broken nose", anonymous phone calls, loss of clients due to abusive
mails they read under partners' corporate names, accusations of
ignorance by ... people of documented ignorance, rumours, etc.
I agree that one of the moderators actively engaged in that process.
But these are the risks of opposing big interests. When it went too
far, I appealed to the AD. The problem was corrected in minutes. The
AD decided to pursue the appeal and ruled in a good way for the
stability of the WG. It is true that from then on, insults against me
did not result anymore in banning or warning or insulting me.
We are all big boys. I have been in that kind of business for nearly
30 years. I saw worse :-) (but usually more competent). I invited
without problem all my opponents to have a drink in Paris (but none
came to the IETF meeting, or told me). It would have been nice.
As for the accusation that proponents of an earlier individual
submission engaged in ad-hominem attacks that went without intervention
by the WG co-chairs, resulting in the limitation of participation in the
WG by other individuals, in the absence of specific evidence,
Please refer to the mailing list. However, this is not a
Last Call of the WG management, but a Last Call of the Document. The
reasons why the document is incomplete should not be discussed so
much, just what is missing or needs correcting.
But it is true that several have been put off by the attitude of the
authors. I would say that this was evaluated very early. And that the
debate is better served when people overcome this. One judges a tree
by its fruits. The deliverable is not perfect: this is what matters today.
this
appears itself to be no more than an ad-hominem attack on those
individuals and on the WG co-chairs. To my knowledge, there was only one
individual in relation to whom other members of the WG acted in any way
that might discourage or hinder his participation,
Two disclosed. Two implied. This is mostly because I agreed to
represent others. But what would have been the use of making the WG a
battlefield? This is what the author wanted so the "best" would
"win". This is not my vision of the IETF.
and such actions
arose only in response to repeated provocation from that individual
archives are here.
> Specification of "language" tag syntax which conflates other content
> characteristics prior to open and professional discussion of
> negotiation issues and alternative approaches would be a premature
> lock-in of a design choice. As the document under discussion specifies
> a conflation of such characteristics without open discussion
It is asserted that there has been no open discussion of the matter of
conflation. This is untrue. It is asserted that there has been no open
discussion of alternatives; the only concrete alternative presented for
discussion was to have separate language and script tags, which
alternative was considered and rejected due to problems that arise in
content negotiation. The drafts submitted for review are in accordance
with the charter, and I believe I can say that in the opinion of WG
members matters of conflation and of negotiation issues were taken into
consideration, and were discussed in an open and professional manner.
total disagreement on the outcome so far. But I hope we can overcome
that with the help of the IETF/IESG.
A lot of things have already changed in what some say ....
jfc
_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf