Re: [Ltru] RE: STD (was: Last Call: 'Tags for Identifying Languages'toBCP)

r&d afrac <rd@xxxxxxxxx> · Mon, 29 Aug 2005 07:21:06 +0200

I am sorry to impose on the community again with what is starting to
amount to ad hominems.

Brian, please advise if this is inappropriate.

At 04:26 29/08/2005, Peter Constable wrote:

> From: JFC (Jefsey) Morfin [mailto:jefsey@xxxxxxxxxx]

> The proposed langtag is an arbitrary limited compound of three
> information: language name, script and country. A language
> identification MAY call for far more elements, and deliver much more
> information.

Mr. Morfin has often suggested to the LTRU WG that language tags should
be able to provide greater information than is allowed by the draft. He
has never provided any specific proposal except a request to permit
certain private-use tags, which I will return to below.

Dear Peter,

This kind of repetition no longer fools anyone. I have bored everyone
enough explaining that, IMHO, two additional subtags are necessary: the
referent and the context. There is also, one way or another, a need for
the date of the reference (this can be a separate date or be included in
a subtag).

This is documented at length in a mail of mine today. I will not repeat
it. I will only suggest you study Word.

The consensus of the remainder of the LTRU WG is that the draft supports
all relevant distinctions needed to describe the linguistic and
written-form attributes of content as may be needed for all purposes,
commercial and otherwise.

This is a historic statement that I hope no one will forget.

Every researcher and engineer knows the value of such a final "all".

Just in case: the langtag is not supposed to only support the
written-form attributes, but to be multimodal (cf. Peter Constable).

Please quote the voice, signs, icons, mood, etc. subtags.

> This means that:
> - "fr-Latn-fr" is the default tag based upon ISO 639-1/2/3
> - "x-fran" is a private use tag based upon ISO 639-6
> - "0-jefsey.com:franver" is my vision of the French at the Palace of
>   Versailles. Documented by an ISO 11179 conformant system (see below)

Two comments: First, Mr. Morfin suggested within the LTRU WG that the
syntax for language tags should be loosened to permit additional
characters, such as "." and ":".

This is a false statement. I did two things:

- Taking advantage of the marvelous ability to steer WG-ltru decisions
by proposing the opposite of what is needed, I made sure the ABNF would
be foolproof (this is not yet exactly the case, as they did not always
find the proper [cf. Peter] "constraints").

- I supported the proposal of an African researcher (whom they called a
troll) to reconcile the strict ABNF desired by the WG affinity group
with the wish of users, R&D and innovation (following ISO evolution) to
use the URI-tags RFC, by first proposing to use the "private use" area.
As indicated, a remark showed me that this was the wrong choice, since
the private use area also addresses other needs.

I then came to the conclusion that using the present Draft as a default,
non-exclusive solution, with some reserved numeric "singleton" as the
hook for URI-tags, would preserve the work done by the WG while
addressing the needs of the rest of the world and avoiding an
unnecessary conflict.
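
As an illustration only, the three kinds of tag discussed in this thread
("fr-Latn-fr", "x-fran" and "0-jefsey.com:franver") could be told apart
roughly as sketched below. This is a minimal sketch, not part of the
Draft or of any RFC: the function name, the returned field names and the
split of the hook payload at ":" into an authority and a name are
assumptions read off the single example tag, and a tag containing "." or
":" is not valid under RFC 3066 syntax.

```python
def classify_langtag(tag: str) -> dict:
    """Classify a tag as default, private-use, or numeric-singleton hook.

    Hypothetical helper: the "uri-tag-hook" branch models the proposal
    made in this mail, not anything defined by the Draft.
    """
    head, _, rest = tag.partition("-")
    if head.lower() == "x":
        # Private-use tag, e.g. "x-fran".
        return {"kind": "private-use", "value": rest}
    if len(head) == 1 and head.isdigit():
        # Reserved numeric singleton used as a hook, e.g.
        # "0-jefsey.com:franver". Splitting at ":" is an assumption.
        authority, _, name = rest.partition(":")
        return {
            "kind": "uri-tag-hook",
            "singleton": head,
            "authority": authority,
            "name": name,
        }
    # Anything else is treated as a default, Draft-style tag.
    return {"kind": "default", "subtags": tag.split("-")}


for example in ("fr-Latn-fr", "x-fran", "0-jefsey.com:franver"):
    print(example, "->", classify_langtag(example)["kind"])
```

The point of the sketch is only that a reserved leading singleton lets a
processor recognise the hook without touching the handling of default
tags.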

The remainder of the WG was in consensus that this was unacceptable due
to backward incompatibility with processes designed to conform to RFC
3066.

Secondly, Mr. Morfin has repeatedly made mention of ISO 11179, a series
of ISO standards on metadata and metadata registries, indicating his
view that language tags used on the Internet should be maintained in a
registry conformant with ISO 11179, and therefore that the draft should
make reference to those standards. He has also, on several occasions
such as his comments above, cited ISO 11179 in relation to his views in
a manner that appears to be intended to suggest that his views are
superior to the draft because he has cited that series of standards
while the draft does not.

The Draft addresses targets you defined long ago. It was presented
privately (twice) and is now presented as a WG document. Since the
document has not changed, one can expect that it keeps the same
targets. You consider that it addresses them "all".

There can therefore be no "superior" views. There are different
targets. My target is to protect R&D, users, and Internet
innovation.

In a nutshell, I do _not_ believe that a draft crafted by a few
individuals can support all the relevant distinctions needed to describe
the linguistic and written-form attributes of content as may be needed
for all purposes, commercial and otherwise. And I want to protect the
right of other researchers and cultures to have their own solutions,
_without_conflict_ with, and without detriment to, _your_solution_.

The real solution is IRI-tags, which we will document as soon as the
URI-tags RFC is published. But that will create a deployment conflict
with your application, due to your sponsors. No one needs that.

A reality check is in need here:

- While Mr. Morfin cites ISO 11179, he has never made statements
  that clearly indicate that he actually understands those standards.

I suggest that everyone with time to spare read ISO 11179 and judge for
themselves.

In a recent mail, Peter acknowledged the need to consider ISO 11179 and
explained that ISO 12620 was its equivalent. Maybe this is the
difference between an engineering and a literary approach ...

One may note that I have just proposed that the IETF initiate a WG in
the ISO 11179 area. The reason is that ISO 11179 has not yet engaged
with the networking aspects. The work we carry out on CRCs (common
reference centers) gives a vision of interest. I often compare ISO 11179
to X.500, and the work to be carried out to LDAP. Its importance to the
architectural development of the Internet should not be overlooked.

I am currently trying to gather the funding needed for a French AFNOR
budget on the matter.

- While Mr. Morfin refers to "an ISO 11179 conformant system",
  none of the ISO 11179 series of standards contains any statement
  of conformance requirements. Thus, no such notion of "ISO 11179
  conformant" is defined anywhere.

:-) :-)

This is the second Historic statement! 

Too bad there is Google ....

http://www.schemas-forum.org/registry/desire/activityreports.php3?field=filename&value=JTCI_SC32_D29D35(RDF).rtf

"WG 2 intends to recommend using XML for accessing and
interchanging information in 11179 conformant
data registries. They expect that specific XML tags and data
structure will be algorithmically derived form the normative UML data
model specified in 11179 part 3. The Object Management Group (OMG) has
already adopted a standard for XMI (XML Model Interchange), which they
expect to recommend as one mechanism for such algorithmic derivation of
XML representation from UML models. Work is also underway to foster
interoperation between ISO/IEC 11179 metadata registries, XML registries,
Universal Description Discovery and Integration (UDDI) registries,
database catalogs, ontology registries and CASE tool repositories. The Sc
32 work is positioned to meet deeper semantic management aspects of data
management and interchange. WG 2 has already initiated electronic Working
group meetings to progress its program as quickly and efficiently as
possible."

http://www.jtc1sc34.org/repository/0346.htm

"WG 2 intends to recommend using XML for accessing and interchanging
information in 11179 conformant data registries."

www.ncess.ac.uk/events/conference/programme/presentations/ncess2005_gillam.pdf

etc.

  All that can be said is that a system of metadata elements is
  maintained and administered using a certain amount of the conceptual
  model, practice and administrative infrastructure specified in the
  ISO 11179 standards. The draft uses some measure of these, though it
  does not make normative reference to ISO 11179.

This certainly explains the confusion with ISO 12620.

  In terms of ISO 11179 notions, each entry in the proposed registry
  includes the two essential components of a metadata element: a
  representation, and a data element concept. Each item in the
  registry indicates (i) the representation used in language tags,
  (ii) a designator that indicates the value meaning and that can
  also serve as the data identifier, (iii) the object class (its
  "type"), (iv) the administrative status (limited to deprecated or
  not deprecated), as well as other properties.

A simplified view, as noted in a previous mail, is that of C structures,
where a name can designate a value or another structure. What is
interesting in a network context is that one can add a "scheme" to the
structure, such as the URL/IP address of the registry (URI Tags). This
means that two people can build the same registry description and links,
and yet they are different registry systems.
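
A minimal sketch of that idea follows, written in Python rather than C
for brevity. The class name, the field names and the two example.org
URLs are hypothetical and serve only to show that an identical nested
description hosted under two different schemes gives two distinct
registries.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Registry:
    # Hypothetical "scheme": the URL or IP address where the registry lives.
    scheme: str
    # Nested description: each name designates a value or another structure.
    root: Dict[str, Any]


# The same description, built independently by two people.
description = {"fr": {"script": "Latn", "region": "FR"}}

registry_a = Registry("http://registry-a.example.org/", dict(description))
registry_b = Registry("http://registry-b.example.org/", dict(description))

# Same content ...
print(registry_a.root == registry_b.root)
# ... but not the same registry system, because the scheme differs.
print(registry_a == registry_b)
```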

One simple application (but this is general) is to consider a language
description registry root (lang root) using ISO 639-6. If, instead of
using entity names (for example "engl"), I use the entity ID (for
example an IPv6 interface ID), I can associate thousands of namelists
with the same base, one for each language. At no cost.

We can also keep the IPv6 ID grid and port it under another IPv6
address for another language. And we can replicate the documentation of
the language in various languages. Defaulting to other languages when
a piece of information is missing is easy, since we play only on IPv6
addresses with the same Identifier ID. Flexibility is total, and
filtering/equivalence rules can be stored as one of the data files.
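
The following is a hedged sketch of that layout using Python's standard
ipaddress module. The prefixes (taken from the 2001:db8::/32
documentation range), the interface ID values and the names stored are
all invented for illustration; nothing here is an actual ISO 639-6
assignment.

```python
import ipaddress
from typing import Dict, Optional, Sequence

# Hypothetical: one IPv6 prefix per documentation language.
PREFIXES: Dict[str, str] = {
    "en": "2001:db8:0:1::/64",
    "fr": "2001:db8:0:2::/64",
}

# Hypothetical interface IDs standing in for ISO 639-6 entities.
ENGL = 0x1
FRAN = 0x2


def address_for(lang: str, interface_id: int) -> ipaddress.IPv6Address:
    """Same interface ID, different prefix: one address per namelist."""
    return ipaddress.IPv6Network(PREFIXES[lang])[interface_id]


# Names stored per address; the French namelist is incomplete on purpose.
records = {
    address_for("en", ENGL): "English",
    address_for("en", FRAN): "French",
    address_for("fr", ENGL): "anglais",
}


def name_of(interface_id: int, preferred: Sequence[str]) -> Optional[str]:
    """Return the first available name, defaulting across languages."""
    for lang in preferred:
        name = records.get(address_for(lang, interface_id))
        if name is not None:
            return name
    return None


print(name_of(ENGL, ["fr", "en"]))
print(name_of(FRAN, ["fr", "en"]))
```

The second lookup falls back to the English namelist because no French
name is recorded for that interface ID.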

The interest is that the same Item ID can be used as a pointer into a
local database [referents] (either loaded via a CD, or cached). We can
build a local vision of a language and related information, according to
personal rules. This means that we can have billions of ISO 11179
conformant descriptions based on ISO 639-6 names/IDs [context], and
update them dynamically.

The initial database interest is that one can allocate IDs to the
subtags and to langtags. But this is not documented and the work is huge
(ISO 639-6 is to provide them).

Now, a langtag including the referent and the context will support
interintelligibility between people the way they want. Supported by an
OPES, people may even relate in a "language" they do not know. But what
is a language?

  Thus, while it cannot formally be said that the draft conforms
  to ISO 11179 (since no terms of conformance are defined), I think
  it *can* reasonably be said that the draft creates a registry and
  system of metadata elements that is consistent with the model
  presented in ISO 11179.

This is an ISO 12620 understanding, a confusion resulting from the
Warsaw meeting. ISO 639-6 can translate some terms (minor in terms of
importance) from ISO 12620. That's all.

- The primary reason that the LTRU WG chose not to reference ISO
  11179 in this draft had nothing to do with whether the WG
  considered ISO 11179 appropriate or valuable in general.

Thank you for confirming that we are not interested in the same area.

Then I can only say "keep clear". Play in your own field, and we will
help (I made sure your ABNF is quite foolproof). But leave others to
address their own concerns.

  Rather, it was that it was not deemed that reference to ISO 11179
  would add significant value in the context of an IETF language
  subtag registry. Taken together, the ISO 11179 standards are long
  and complex, and have not to our knowledge been referenced in any
  other IETF metadata registry

This is why we have to create a WG in that area. But maybe it is
premature?

  -- and certainly not in relation to RFC 1766 or RFC 3066, which
  specifications accomplish their purposes in spite of that absence
  of reference.

Thus, when I see Mr. Morfin citing ISO 11179 in the course of arguing
for some view that he holds, I consider that citation to have added
nothing of significance in support of his view.

see above.

> This means that this debate is only to lock a _final_ ABNF via an
> accepted RFC and a loaded operational IANA registry _before_ a simpler
> solution [ISO 639-6] is available three months from now....

This statement makes several assumptions of uncertain validity, not the
least of which is that use of alpha-4 symbols from ISO 639-6 for IETF
language tags would constitute a simpler solution.

You do not need to sell your solution. I explained again and again I
support it. 

But do not say that it addresses my and other people's needs. It cannot
be exclusive and exclude us all.

Given the widespread existing use of RFC 3066 tags, use of ISO 639-6
would have to go alongside use of multi-part tags of the form permitted
by RFC 3066, which is certainly not simpler than what is specified in
the draft.

Draft centric assumption. Peter, your Draft is not the center of the
world. The user is. Simplicity is not according to your ABNF. Simplicity
is according to the user with her needs. You think (and it may very well
be) that your solution is simpler for you. I even accept that in some
cases it may also be for me. But your solution does not scale.

You have the same langtag capacity to document the English language of
Pitcairn and of the USA.

You miss half the existing scripts, do not cover founts and do not
document anything of voice and signs.

Your proposal is not able to be multilingual.

You do not know what a language is in your context (and this is not
easy) ....

etc. etc.

> >Your statement doesn't contradict anything that Debbie has said,
> >provided the context is ISO 639-6 alone. If we were to talk about
> >incorporation of ISO 639-6 into a revision of RFC 3066, however, then
> >duplication would become an issue for consideration.

> 

> This is the WG-ltru Charter that all the ISO codes be included.

The charter makes reference to "the underlying ISO standards"; that is,
to the ISO standards referenced in RFC 3066 or those cited in the
charter to be incorporated into the update RFC. The charter does not
cite ISO 639-6, let alone state that "all the ISO codes be included".

Having considered the old failed Draft instead of the Charter did not
help ....

http://ietf.org/html.charters/ltru-charter.html

"It is also expected to provide mechanisms to support the evolution
of the underlying ISO standards, in particular ISO 639-3".

How do you read "evolution"? As far as I am concerned, we want
to use, help, benefit from, etc. that evolution and do not want you to
block us. "US" being all of us, and in particular my own
team.

> Nice to see that ISO 11179 is accepted now. Peter Constable and the
> WG-ltru have opposed the reference to ISO 11179 model. This model
> permits to conceptualise languages and to include in their
> description an unlimited number of additional elements.

This is in no way implied by ISO 11179. The model of that standard
assumes that metadata elements designate concepts within some conceptual
system, and that the system of metadata elements includes a meta-model
that reflects that conceptual system. This would have the effect of
*constraining* the concepts represented to entities within that
conceptual model. Those entities may be an infinite set, but the set of
entities that can be represented by the tags defined by this draft would
not increase in number if the draft were changed to reference ISO 11179.

You now seem to want to tag your langtags with "ISO 11179" (soon we will
learn they are "ISO 11179 inside"). Good!

But this means you will have to _change_ your draft, because the set of
entities that can be represented by the tags defined by this draft
will dramatically increase in number and in related information
.... Once you have done it, we will probably say the same.

> But ISO 11179 totally open the concept...

Clearly either Mr. Morfin does not understand ISO 11179 or, if he does,
he has totally failed to express a statement consistent with that
understanding.

At this stage the reader has probably formed his/her own opinion.

> I would then advise that the Draft is sent back to the WG-ltru, with
> the suggestion that a lexicon is provided which would define what is
> a "language", a "script", a "country", and the purpose (informative,
> descriptive, normative?) of a langtag. This might be a big step ahead.

Mr. Morfin submitted a request to the WG that these terms be defined.
The consensus of everyone else in the WG was that this was not necessary
since it would not significantly alter the ability of anyone to
implement or use the specification.

May I just quote your response in another mail ....

<quote>

> I agree that the broad question of "what is a language" is out of our
> scope. The more specific question "what is a taggable language
> distinction" is perhaps more germane.

Not an unreasonable suggestion.

</quote>

Cheers....

jfc

_______________________________________________

Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf