On 00:40 26/08/2005, David Hopwood said:
JFC (Jefsey) Morfin wrote:
[...] Today, the common practice of nearly one billion Internet
users is to be able to turn off cookies to protect their anonymous,
free use of the web. Once the Draft enters into action, a conflicting
privacy violation will be imposed on them: "tell me what you read,
I will tell you who you are": any OPES can monitor the exchange,
extract these unambiguous ASCII tags, and know (or block) what you
read. You can look these tags up in Google and learn a lot about
people. There is no proposed way to turn that personal tagging off,
nor to encode it.
I don't know which browser you use, but in Firefox, I can configure exactly
which language tags it sends. If it were sending other information using
language tags as a covert channel (which it *could* do regardless of the
draft under discussion), I'd expect that to be treated as at least a bug,
and if it were a deliberate privacy violation, I'd expect that to cause a
big scandal.
Dear David,
the privacy problem is the "what you read, who you are" intelligence
leak. Today langtags are not yet used much (say the W3C people in the
WG-ltru) compared with how they should be used in XML, HTML, etc. This
is all that this proposition is about. The proposition is to give the
language, the script and the country in _one_shot_ and in a
_standardised_ way. It uses ISO codes for that. ISO never wanted to
propose such a code, where:
ar-arab-us marks texts intended for people interested in US Arabic
community issues,
iw-hebr-ru marks texts intended for people interested in the Jewish
Russian community,
etc.
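To make the "one shot" point concrete, here is a minimal sketch in Python of how any intermediary could pull the three pieces out of such a tag. The function name `split_langtag` is hypothetical, and the rule it applies (a 4-letter subtag is a script, a 2-letter subtag is a country) is a simplifying assumption that ignores variants, extensions and numeric region codes; it is not the Draft's full ABNF.

```python
def split_langtag(tag):
    """Split a language-script-country tag such as 'ar-arab-us'.

    Simplified assumption: after the first subtag, a 4-letter subtag is
    taken as the script and a 2-letter subtag as the country.
    """
    parts = tag.lower().split("-")
    language = parts[0]
    script = None
    region = None
    for sub in parts[1:]:
        if len(sub) == 4 and sub.isalpha() and script is None:
            script = sub
        elif len(sub) == 2 and sub.isalpha() and region is None:
            region = sub
    return {"language": language, "script": script, "region": region}


if __name__ == "__main__":
    for tag in ("ar-arab-us", "iw-hebr-ru"):
        print(tag, split_langtag(tag))
```

Nothing more than a string split is needed, which is why the tags are so easy to filter on.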
When your browser accepts such langtags and you continue the exchange,
this structured information can be filtered by the ISP (for police,
censorship, intelligence gathering, etc.) to learn about its users.
It can be used for large-scale searches in search engines to find
out which mail you responded to, etc. I suppose that in most countries
of the world this is subject to privacy laws. I think that in France it
is subject to the anti-racist law (the one used against Yahoo a few years ago).
The problem is that there is no way for the _receiver_ to turn it
off. This is potentially dangerous spam: it is digital information
I never asked for, which discloses information about me.
Is that a reason to kill the Draft? I do not think so, but it
certainly shows the complexity of the issue - and the lack of
preparation of the Draft (I proposed the Security section to better
warn about the problem). The IETF proposes a solution: OPES. An
OPES on the host side can remove the langtags or encrypt them at
the request of the reader, without a change on the host. I tried to
make the WG-ltru understand that not considering or recalling OPES at
the same time as documenting langtags is criminal.
This is why I make the following proposition: the Draft's ABNF and
system are considered as a starting default, with hooks open to
IRI Tags adapted to each situation, at the decision of the
user or of services he trusts.
Let us take the case above. A service provider can propose an OPES
service that changes "he-hebr-us" into "x-abcf", plus an internal OPES
plug-in on the user's side that restores x-abcf to he-hebr-us, so his
libraries work. And many L9 organisations/Governments are satisfied.
He can even provide dynamically updated langtag aliases. However, a
good service should guarantee that the service is conflict-free. This
is no problem if the langtag alias is x-service.com:abcf (conforming
with the URI Tag RFC), but this is forbidden by the Draft. My
proposition is to use "0-" as a hook to a specific format, so the
Draft ABNF is fully respected.
In that case "0-service.com:abcf" will not raise an error, and will
not conflict with the people using the default format (the format
proposed by the Draft). The interest of "0-" is that it can be
multilingual, so the hook can work in ASCII but also in punycode, and
in any script. It can also be entirely numeric and possibly refer
directly to an IPv6 address, making the scheme DN independent.
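The alias idea can be sketched as a two-way table, shown here in Python. This is only an illustration under stated assumptions: the class `LangtagAliasService` and its methods are hypothetical names, and no real OPES interface is modelled. One side replaces the tag with its alias, the user-side plug-in restores it, and registration refuses any pair that would conflict with an existing one.

```python
class LangtagAliasService:
    """Hypothetical two-way alias table (not a real OPES API)."""

    def __init__(self):
        self._alias_for = {}  # langtag -> alias
        self._tag_for = {}    # alias -> langtag

    def register(self, tag, alias):
        # Keep the table conflict free: one alias per tag, one tag per alias.
        if self._tag_for.get(alias, tag) != tag:
            raise ValueError("alias %r already names another tag" % alias)
        if self._alias_for.get(tag, alias) != alias:
            raise ValueError("tag %r already has another alias" % tag)
        self._alias_for[tag] = alias
        self._tag_for[alias] = tag

    def obscure(self, tag):
        # Provider side: unknown tags pass through unchanged.
        return self._alias_for.get(tag, tag)

    def restore(self, alias):
        # User-side plug-in: unknown aliases pass through unchanged.
        return self._tag_for.get(alias, alias)


if __name__ == "__main__":
    service = LangtagAliasService()
    service.register("he-hebr-us", "x-abcf")
    print(service.obscure("he-hebr-us"))
    print(service.restore("x-abcf"))
```

A dynamically updated alias would amount to re-registering the pair on a schedule agreed between the service and the plug-in.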
I support it as a transition standard track RFC needed by some,
as long as it does not exclude more specific/advanced language
identification formats, processes or future IANA or ISO 11179
conformant registries.
The grammar defined in the draft is already flexible enough.
(I suppose you mean more than just grammar. Talking of the ABNF is
probably clearer?).
I am certainly eager to learn how I can support modal information
(type of voice, accent, signs, icons, feelings, fount, etc.),
medium information, language references (for example is it plain,
basic, popular English? used dictionary, used software publisher),
nor the context (style, relation, etc.), nor the nature of the text
(mono, multilingual, human or machine oriented - for example what
is the tag to use for a multilingual file [printed in a language of
choice]), the date of the langtag version being used, etc.
I mean that the grammar is flexible enough to encode any of the
above attributes (not that it would be useful or a good idea to encode most
of them).
Hmmm... you take responsibility for both declarations :-)
- that you _can_ encode it. But you do not provide examples.
- that it would not be useful or a good idea to encode basic content
object attributes.
The Draft has introduced the "script" subtag in addition to RFC
3066 (which is an obvious change). However, in order to stay
"compatible" with RFC 3066, the author says it cannot introduce
specific support for URI tags.
This objection seems to be correct: URI tags include characters not
allowed by RFC 3066.
Then? The purpose of this work is to address the limitations of RFC
3066. URI tags did not exist when RFC 3066 was written. Do you mean
for example that langtags are to be ASCII only because RFC 3066 was ASCII only?
But you could easily encode the equivalent information to an URI
tag, if you wanted to.
Please document how you do it, while respecting the hybrid format of
the proposed ABNF, where information is identified not only by fixed
position but also by relative position and size, with "-" as the sole
separator. And they want to keep labels between "-" at most 8
characters long. Tell me how you support IDNs.
Let us suppose that I have "lang-tags.org:" as a scheme,
or "xn--abcdef.com:". Tell me how you support them.
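For reference, the constraint at stake can be checked with a short Python sketch. It encodes only the RFC 3066 syntax (a primary subtag of 1 to 8 letters, then "-"-separated subtags of 1 to 8 letters or digits), not the Draft's richer ABNF, and the helper name `is_rfc3066_syntax` is hypothetical.

```python
import re

# RFC 3066 syntax only:
#   Language-Tag   = Primary-subtag *( "-" Subtag )
#   Primary-subtag = 1*8ALPHA
#   Subtag         = 1*8(ALPHA / DIGIT)
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def is_rfc3066_syntax(tag):
    """Return True if the whole string fits the RFC 3066 tag syntax."""
    return RFC3066_TAG.fullmatch(tag) is not None


if __name__ == "__main__":
    for tag in ("he-hebr-us", "x-abcf", "x-service.com:abcf",
                "lang-tags.org:", "xn--abcdef.com:"):
        print(tag, is_rfc3066_syntax(tag))
```

Under this check "he-hebr-us" and "x-abcf" pass, while the three scheme-style strings fail on the "." and ":" characters, on the empty label in "xn--", or on the 8-character limit.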
jfc