----- Start Original Message -----
Sent: Sat, 18 Oct 2008 19:02:17 +0200
From: André Warnier <aw@xxxxxxxxxx>
To: users@xxxxxxxxxxxxxxxx, Tomcat Users List <users@xxxxxxxxxxxxxxxxx>
Subject: I18N, HTTP 2.0 ?

> I am sending this to both the Apache httpd and Tomcat users lists, in
> the hope that because together these HTTP servers cover a good fraction
> of the market, there might be a chance to reach the right people.
>
> My hope is that someone who is aware of, and connected to, the process
> of RFC generation would pick this up, or else inform us if some process
> in the direction that I am indicating below is already under way.
>
> I apologise in advance if I am pushing at an open door. If so, I would
> be glad to be told what the state of affairs is.
> (A Google search on the terms "HTTP" and "RFC" and "UTF-8" does not seem
> to yield any relevant results.)
>
> Proposal :
>
> It is becoming urgent to create a new HTTP standard/version/revision,
> that would be organised around Unicode as a default character set, and
> UTF-8 as a default encoding.
>
> I believe that the spread and acceptance of Unicode and UTF-8 is now
> sufficient to warrant such an evolution.
>
> The current situation, where iso-8859-1 is the default in some areas,
> and some other areas are either unspecified or vague, creates a lot of
> confusion and inefficiencies, and creates barriers to the creation of
> truly international HTTP-based WWW applications.
>
> Here are some areas where these problems appear :
> - the encoding of URLs.
> - the encoding of HTTP headers.
> - the encoding of user credentials in browser-side Basic and Digest
>   authentication dialogs, and their transmission to the server.
> - the encoding of input elements from html forms, as transmitted by a
>   client to a server, and the interpretation of that data by the server
>
> I am quite sure that I am forgetting some aspects of the same issue.
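As a small illustration of the first and third points in the list above, the following Python sketch shows how one and the same string produces different bytes on the wire depending on which encoding each side assumes. The sample string, the credentials and the helper name basic_auth_header are made up for the example; only standard-library calls are used, and this is not meant as a description of what any particular browser or server does.

```python
import base64
from urllib.parse import quote, unquote

# Hypothetical sample value containing one non-ASCII character.
text = "café"

# The same path segment, percent-encoded under two different assumptions.
as_latin1 = quote(text, encoding="iso-8859-1")  # 'caf%E9'
as_utf8 = quote(text, encoding="utf-8")         # 'caf%C3%A9'

# A receiver that guesses the wrong encoding gets mojibake:
# the two UTF-8 bytes C3 A9 are read as two iso-8859-1 characters.
misread = unquote(as_utf8, encoding="iso-8859-1")  # 'cafÃ©'


def basic_auth_header(user: str, password: str, charset: str) -> str:
    """Build a Basic Authorization header value using the given charset.

    Illustrative helper only; the charset parameter stands for whatever
    encoding the client happens to pick.
    """
    raw = f"{user}:{password}".encode(charset)
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Hypothetical credentials with a non-ASCII character in the password.
header_latin1 = basic_auth_header("user", "pässword", "iso-8859-1")
header_utf8 = basic_auth_header("user", "pässword", "utf-8")

print(as_latin1, as_utf8, misread)
print(header_latin1 != header_utf8)  # the two header values differ
```

Unless client and server agree on the encoding by some other means, the server cannot tell from the received bytes alone which of the two forms was intended.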
>
> For each of the above, there are areas where there is no specification,
> or areas where there are vague specifications, or areas where there are
> multiple apparently-contradictory specifications.
> Consequently, there is a profusion of ad-hoc tricks and recipes, and
> various "parameters" and "flags" and "settings" are starting to appear
> at the client and server level, which may help resolve the issues in
> some cases, but which in the long term create even more confusion and
> problems of interoperability.
> (example of a setting : "use body encoding for URL").
>
> There might be some efforts under way to tackle one or the other aspect
> of the above (I have heard of a proposal regarding HTTP headers), but I
> honestly believe that this issue can only be resolved well "at the top",
> which seems to me the HTTP protocol itself.
----- End Original Message -----

I just want to say that I agree with you in recommending UTF-8 as the
default character encoding. There has been a natural evolution toward
richer character sets, but the HTTP (and other) standards have not
followed this evolution.

I doubt, however, that HTTP, one of the web's core protocols, will be
revised just to make room for internationalisation. Other needs will have
to be addressed at the same time for something to happen in this area.

Personally I would want to see the HTTP client-error status 402 (Payment
Required) specified in the upcoming specs. There are so many for-pay
web-sites/services around that this should have been specified a long
time ago.

-- 
Daniel Aleksandersen

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
To unsubscribe from the digest, e-mail: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx