> Date: 2005-08-25 20:55
> From: "JFC (Jefsey) Morfin" <jefsey@xxxxxxxxxx>

> the privacy problem is the "what you read, who you are" intelligence
> leak.

That is to some extent true of any negotiation mechanism and negotiated value.

> Today langtags are not yet much used (say the W3C people in the
> WG-ltru) when compared with what they should in XML, HTML, etc.

XML, HTML, etc. are not IETF protocols and should not be the main consideration in IETF work on IETF documents, especially as language tags are heavily used by IETF protocols, notably MIME (RFCs 2045, 2047, 2231, 3282) and widely-deployed core IETF application protocols which use MIME (e.g. the Internet Message Format and its applications (email, news, voice messaging, EDI, etc.), and HTTP and applications using HTTP as a substrate).

> This
> is all what this proposition is about. This proposition is to give
> _one_shot_ in a _standardised_ way the language, the script and the
> country.

This was discussed during Last Call of the previous non-IETF (individual submission) attempt. IIRC David Singer brought up several examples of other pieces of information (e.g. legal/copyright variations) that could also be negotiated and which might affect the presentation of content (or choice among alternative content). Lumping all of these separate items into one tag is a poor design, as it impedes negotiation and tends toward lengthy tags which are incompatible with fixed-length mechanisms such as MIME encoded-words. While the document under discussion mentions this issue, it neither treats it adequately nor resolves the underlying issue in a manner that would minimize the problems.

> It uses for that ISO codes. ISO never wanted to propose such
> a code where:
>
> ar-arab-us are texts destined the people interested in US Arabic
> community issues.
> iw-hebr-ru are texts destined to people interested in Jewish Russian
> community,
> etc.
>
> When you browser accept that langtags and you pursue the relation,
> this structured information can be filtered by ISP (for police,
> censoring, intelligence gathering, etc.) to know about their users.
> It can be used for searches on a large scale in search engines to
> know the mail you responded, etc. I suppose that in most of the world
> countries this is subject to privacy laws. I think that in France it
> is subject to the anti-racist law (the one used against Yahoo a few years ago).

Let's separate three issues:
1. privacy
2. tagging
3. negotiation

The privacy issue exists whenever any information is conveyed; the user needs to balance privacy concerns with facilitation of communication. Mechanisms such as TLS can be used to limit the visibility of the information to the end points of communication; ultimately it boils down to a matter of trust in the end-point partner in the communication exchange. I believe that the issue is dealt with adequately in the security considerations section of the document under discussion (some mention of transport-level protection of privacy would be welcome).

Tagging identifies characteristics of a particular piece of content. For that purpose alone, it makes little difference (other than regarding the aforementioned compatibility issues with existing IETF mechanisms) whether the characteristics are lumped or separate. There are existing IETF mechanisms which permit handling of either lumped or individual characteristics (e.g. the extensible header field mechanism of RFC 2045 and the "feature/filter" mechanism of RFC 2533/2738/2912).

Tagging per se identifies characteristics of content. While that may be used to infer something about the content provider, such inferences may be unreliable, particularly for providers that support a wide variety of characteristics for the content in question.

Negotiation of characteristics is where several issues arise.
One such issue, as discussed here in December 2004/January 2005, relates to an algorithm for matching content characteristics (e.g. between a particular piece of content and a specified range of acceptance, as in an RFC 3282 Accept-Language field). RFC 3066 skirted that issue as it stopped short of specification of an algorithm, and as it specified a mere two particular characteristics (language per se, and country) which could be combined in a tag. That was not true of the individual submission, which combined at least 5 characteristics and specified an algorithm. As a result of issues with that approach, the LTRU WG was established with a charter to produce a BCP (for registration procedures) and a separate Standards Track document for topics such as algorithms which are unsuitable for BCP.

A related issue is the interaction of the established negotiation mechanism (viz. the RFC 3282 Accept-Language field) and potential use of the other (feature/filter) mechanism for negotiation. The Accept-Language field provides for specification of language ranges and for associating a preference value with specific languages (as defined in RFC 3066) or ranges. The proposed mechanism in the individual submission of late last year (essentially unchanged in the LTRU product (see discussion below)) does not address the language range issue, and that issue is greatly complicated by conflating separate characteristics into a single tag. Addressing the language range issue is not a WG work item and, unfortunately, the algorithm issue is scheduled to be a later work item than the registry issue. Added to that is the fact that the specification of the tag format is mixed with registration procedures.

Negotiation of separate characteristics is much simpler than that of a conflation of characteristics; each characteristic can be assigned separate preference values, and irrelevant characteristics (e.g. script w.r.t. spoken language) can be easily ignored.

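To make the range and preference-value point concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not anything specified by the WG or by RFC 3282: the function names are hypothetical, the parser handles only a simplified Accept-Language value (no comments or folding), the "longest matching range wins" rule is borrowed from the HTTP/1.1 description of Accept-Language, and the per-characteristic scoring in separate_quality() is invented purely for illustration (it is not the RFC 2533 feature/filter syntax). The range test itself is the RFC 3066 prefix rule: a range matches a tag if it equals the tag or is a prefix of it followed by "-".

```python
from typing import Dict, List, Tuple


def parse_accept_language(value: str) -> List[Tuple[str, float]]:
    """Parse a simplified Accept-Language value into (range, q) pairs."""
    ranges = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # default preference when no q parameter is given
        for param in parts[1:]:
            name, _, val = param.partition("=")
            if name.strip().lower() == "q":
                q = float(val)
        ranges.append((parts[0].lower(), q))
    return ranges


def range_matches(language_range: str, tag: str) -> bool:
    """RFC 3066 prefix rule: equal, or a prefix followed by '-'."""
    language_range, tag = language_range.lower(), tag.lower()
    if language_range == "*":
        return True
    return tag == language_range or tag.startswith(language_range + "-")


def quality(tag: str, ranges: List[Tuple[str, float]]) -> float:
    """q of the longest matching range (assumption: HTTP/1.1 rule), else 0.0."""
    best_len, best_q = -1, 0.0
    for language_range, q in ranges:
        if range_matches(language_range, tag):
            length = 0 if language_range == "*" else len(language_range)
            if length > best_len:
                best_len, best_q = length, q
    return best_q


def separate_quality(content: Dict[str, str],
                     prefs: Dict[str, Dict[str, float]]) -> float:
    """Hypothetical scoring with separate characteristics.

    Multiplies the q of each characteristic the recipient stated a
    preference for; characteristics absent from prefs are ignored.
    """
    score = 1.0
    for name, accepted in prefs.items():
        score *= accepted.get(content.get(name, ""), 0.0)
    return score


if __name__ == "__main__":
    ranges = parse_accept_language("ar-US, en;q=0.5")
    print(quality("ar-US", ranges))       # 1.0
    print(quality("en-GB", ranges))       # 0.5
    # A script subtag in the middle defeats the prefix rule:
    print(quality("ar-arab-us", ranges))  # 0.0
    content = {"language": "ar", "script": "arab", "country": "us"}
    prefs = {"language": {"ar": 1.0, "en": 0.5}, "country": {"us": 1.0}}
    print(separate_quality(content, prefs))  # 1.0, script simply ignored
```

In this sketch a recipient who asked for "ar-US" does not match content tagged "ar-arab-us" (the tag quoted earlier), because the range is no longer a prefix of the tag, whereas the separate-characteristic version matches it without the recipient having to say anything about script.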
As negotiation and related issues represent a critical technical issue for the design of language tags (viz. keeping separate characteristics out of *language* tags), it is essential that such negotiation issues be considered carefully before specifying the format of tags. Unfortunately, that has not been done, and considering the published WG milestones it appears that that issue has not been taken into consideration. It should be pointed out that such issues have been raised, both in the discussion during Last Call of the individual submission and as a result of subsequent work. However, it appears that the WG has not considered the issues, with the effect that the WG product lacks the "particular care" expected of BCP documents (RFC 2026). Note that it is not the registration procedural issues that are typical of BCP documents that are problematic; rather it is the conflation of separate characteristics into a single tag syntax, specified in the same document, which raises problems related to content negotiation.

Part of the problem is the scheduling of WG work items as noted above (viz. negotiation issues are critical to design of tag syntax, and should not have been deferred until after syntax specification). Another large part of the problem is WG management; in addition to the issues raised by John Klensin the last time that LTRU participation was discussed on the IETF discussion list -- and with which I wholeheartedly agree -- it appears that management of WG participant conduct has been rather lax; proponents of the individual submission effort who are participating in the WG tend to resort to ad-hominem attacks when a problem is identified or when an alternative approach is raised, with no visible intervention by the WG co-chairs. That has also (i.e. in addition to the factors which John identified) had the effect of limiting WG participation by individuals.

Specification of "language" tag syntax which conflates other content characteristics prior to open and professional discussion of negotiation issues and alternative approaches would be a premature lock-in of a design choice. As the document under discussion specifies a conflation of such characteristics without open discussion -- indeed hampered by unchecked unprofessional conduct -- it should not be approved as BCP in its current form. Separation of syntax specification to a separate document, to be specified after due consideration of negotiation issues, leaving purely procedural issues of registration, would be one approach to enable making a decision on BCP registration procedures independently of and in advance of a concrete specification of negotiation issues and tag syntax. However, as it stands, the document cannot be evaluated for soundness of the tag syntax design in the absence of a specification that addresses negotiation issues in a manner backwards-compatible with the existing negotiation mechanisms (viz. MIME Content- and Accept- fields and feature/filter negotiation).

Therefore, at minimum, I recommend that the IESG defer a decision on the subject document until such time as the full impact of the early design choice to conflate multiple characteristics into a single tag can be fully evaluated w.r.t. proposed matching algorithms and impact on existing IETF-approved negotiation mechanisms. Revision to move the syntax specification to a separate document, as mentioned above, would permit evaluation of the registration procedures per se independently of such concerns, and would be one way to move forward on those registration procedures quickly (i.e. independently of analysis of the syntax design) if that is deemed desirable.

Aside from that, the IESG (via the cognizant ADs) should address the issues of WG charter work items and milestones as they relate to consideration of negotiation issues prior to locking down a tag syntax specification, should emphasize the importance of backwards compatibility with established, approved, and widely deployed IETF protocols and mechanisms, and should discuss WG participant conduct (viz. ad-hominem attacks) and mailing list issues (as identified by JCK) with the WG co-chairs.

_______________________________________________
Ietf@xxxxxxxx
https://www1.ietf.org/mailman/listinfo/ietf