Re: Re: How to configure Apache 2 to compress xml files on serving?

Bo Berglund wrote:
On Sat, 14 Jun 2008 09:32:26 +0200, André Warnier <aw@xxxxxxxxxx>
wrote:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 06:33:12 GMT
Server: Apache/2.0.53 (Fedora)
Last-Modified: Thu, 12 Jun 2008 14:10:29 GMT
Etag: "14fc-b9387f40"
Accept-Ranges: bytes
Content-Length: 5372
Cache-Control: no-transform
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/xml
Content-Encoding: gzip

------------------ my test server  ------------------------

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 06:34:38 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:20:20 GMT
Etag: "55084-14fc-91160693"
Accept-Ranges: bytes
Content-Length: 5372
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
Content-Encoding: gzip
------------------------------------------------------------------------------------


In the server responses I see these differences:

Cache-Control: no-transform  (not existing in test server)
Content-Type: application/xml

(test server has this instead:)
Content-Type: application/x-gzip

How is the "Content-Type" header set in Apache?

Exactly. Because in the second case the browser gets "application/x-gzip" as the content type, it thinks that what it has received is fine as is, and does not unzip it. In the first case, because it gets "application/xml", it "knows" that the content is really XML, and that it must unzip it first.

So now we must find what, in the first server, sets the content type that way. One more question: on the first server, is the original file on disk already gzipped, or is it stored as plain (unzipped) XML?

Since I don't have the configuration of the first server, I'm trying to guess what exactly it does before it sends out the response. It could be taking an XML file and gzipping it on the fly before it sends it in the response. Or else it could be "cheating": taking the already gzipped file from disk and sending it as is, but "falsifying the headers" to tell the browser to unzip it.
It may be as simple as adding (or replacing) a line such as:
AddType application/xml .xml.gz
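
A note on the first possibility: if the first server does compress on the fly, that would typically be the work of mod_deflate. A minimal sketch, assuming mod_deflate is installed and the files on disk are plain .xml (the module path and the directory are placeholders, not taken from either server):

```apache
# Hypothetical: compress XML responses on the fly with mod_deflate.
# The module path and the directory below are assumptions.
LoadModule deflate_module modules/mod_deflate.so

<Directory "/path/to/plain-xml">
    # Gzip responses of these media types when the client
    # sends Accept-Encoding: gzip
    AddOutputFilterByType DEFLATE text/xml application/xml
</Directory>
```

The second possibility needs no compression module at all, only the right type and encoding labels on files that are already gzipped, which is what the rest of this thread pursues.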


I changed httpd.conf like this:

<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    AddType application/xml .xml.gz
    AddEncoding gzip .gz
    AddType text/xml .xml
    AddType text/html .shtml
</Directory>


But Firefox still offers to save the file rather than decompressing it
and showing the XML, as it does with the original server:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:39:58 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:19:12 GMT
Etag: "5b091-13b0-8d04e669"
Accept-Ranges: bytes
Content-Length: 5040
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
Content-Encoding: gzip
----------------------------------------------------------

With this change:
<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    AddType application/xml .xml.gz
    AddType text/xml .xml
    AddType text/html .shtml
</Directory>


Add the following directive to the above section:
 AddEncoding x-gzip .gz

and try again


I get this instead:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:41:43 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:19:30 GMT
Etag: "5b225-1277-8e1f670e"
Accept-Ranges: bytes
Content-Length: 4727
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: application/x-gzip
----------------------------------------------------------

With this in place I started looking elsewhere in httpd.conf and found
this line, which I commented out:

AddType application/x-gzip .gz .tgz


Now Firefox displays an error message:

XML Parsing Error: not well-formed
Location: http://polaris/xmltv/svt1.svt.se_2008-06-15.xml.gz
Line Number 1, Column 1

and the headers now are:

HTTP/1.x 200 OK
Date: Sat, 14 Jun 2008 10:48:07 GMT
Server: Apache/2.0.54 (Win32) PHP/4.4.7
Last-Modified: Thu, 12 Jun 2008 22:18:36 GMT
Etag: "5ae5a-169d-8aea1e6f"
Accept-Ranges: bytes
Content-Length: 5789
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/xml
----------------------------------------------------------

Firefox probably no longer realizes that the data is gzipped
and tries to parse the binary compressed stream, which
obviously fails...

Yes. Because the server tells Firefox that the document is "text/xml" and Firefox believes it. That is the right thing for Firefox to do, according to the corresponding Internet RFCs. (Unfortunately, that's not what IE does, but that is a whole separate story, which I hope we don't have to get into.)

Have to re-enable this directive...

No. Leave this one commented out:
> AddType application/x-gzip .gz .tgz

But add what I indicated above to your Directory section:
 AddEncoding x-gzip .gz

Note: I am also "fishing" to find the right settings.
But you have to do this systematically, changing one thing at a time and keeping track of what you add or remove; otherwise we will no longer know which change had which effect. The important part is what the server sends as headers with the HTTP response.
We must get to a situation where it sends:
Content-Type: text/xml  (or application/xml ?)
Content-Encoding: gzip  (or x-gzip ?)

so that Firefox knows that it is XML, but that it is gzipped.
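
Putting the suggestions in this thread together, the Directory section might end up looking like the sketch below. It is untested and rests on an assumption: that mod_mime looks up each dot-separated extension of a name like svt1.svt.se_2008-06-15.xml.gz separately, so that .xml supplies the media type and .gz supplies only the encoding. If that holds, the compound ".xml.gz" line never matches and can be dropped, which would fit the "text/xml" seen in the last response.

```apache
# Elsewhere in httpd.conf: keep this line commented out. Otherwise .gz
# maps to a media type of its own, and the rightmost extension wins
# over the type taken from .xml.
#AddType application/x-gzip .gz .tgz

<Directory "C:/Engineering/Projects/XMLTV/XMLTVTestsite">
    Options Indexes MultiViews Includes
    AllowOverride None
    Order allow,deny
    Allow from all
    # .xml supplies the media type (Content-Type)
    AddType text/xml .xml
    # .gz supplies only the content encoding (Content-Encoding)
    AddEncoding x-gzip .gz
    AddType text/html .shtml
</Directory>
```

If the response then shows "x-gzip" rather than "gzip", that should make no difference to the browser; the mod_mime documentation treats the two forms as the same encoding.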

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
  "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx

