Re: Generating a gzip response from multiple pre-gzipped files on disk


 



Can you post the headers, from sending the request(s) up to and including the response(s)?
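
For reference, one way to dump both sides of the exchange from a script - a rough sketch using Python's requests module, where the URL is just a placeholder for your endpoint:

    import requests

    # Ask for the document the way a browser would, advertising gzip support.
    resp = requests.get(
        "http://example.com/export.xml",               # placeholder URL
        headers={"Accept-Encoding": "gzip, deflate"},
    )

    print("--- request headers ---")
    for name, value in resp.request.headers.items():
        print(name + ": " + value)

    print("--- response headers ---")
    for name, value in resp.headers.items():
        print(name + ": " + value)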

I think you might be hitting the same spot as I recently did in (1). In short, most (if not all) popular clients do not unpack responses if they think they shouldn't, even if the headers tell them to. For example, "Content-Encoding: gzip, deflate" will not make my Firefox run gunzip on a file like "data.gz". At this point I can only speculate, because I did not dig deeper into the client behaviour, but I *think* this is because they sniff the content, or at least the file extension.

(1) http://mail-archives.apache.org/mod_mbox/httpd-dev/201401.mbox/%3CCAPV0b06Z6Yey7Wa6gACCyrxui36WnB5gvJxQwCSWiZMahgnynQ%40mail.gmail.com%3E


On Thu, Feb 6, 2014 at 6:54 PM, Tom Evans <tevans.uk@xxxxxxxxxxxxxx> wrote:
Hi all

At $JOB we have a web app that generates XML for another web app to
use. Each complete XML document is a list of individual items, and
each item is stored on disk in gzip format to save space - the format
is overly verbose, compression is highly effective, and gzip is
nicely transparent to lots of utilities (mainly vim).

Currently, a django app assembles the document (it also generates
items if they are missing, but let's ignore that for now). It first
reads each file off disk, decompresses it, assembles one large
string (sometimes 100MB+ of XML), compresses it again (sigh) and
then hands it off to apache.
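
Roughly, the current flow looks like this - a sketch using Python's
gzip module, with placeholder names for the item paths and the
surrounding header/footer:

    import gzip

    def build_document(item_paths, header=b"<items>", footer=b"</items>"):
        # Read every pre-gzipped item off disk and decompress it.
        parts = [header]
        for path in item_paths:
            with gzip.open(path, "rb") as f:
                parts.append(f.read())
        parts.append(footer)
        # Assemble one large string (sometimes 100MB+ of XML)...
        body = b"".join(parts)
        # ...then compress the whole thing again before handing it to apache.
        return gzip.compress(body)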

As a naive attempt, I modified the django app to simply load the
pre-gzipped files straight from disk, prepend and append a compressed
header and footer, and then hand the result off to apache with the
appropriate content type.
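
The naive variant, sketched with placeholder names again - no
decompression at all, just splicing the pre-compressed pieces
together as consecutive gzip members:

    import gzip

    # Header and footer are compressed once up front; each item file
    # is already gzip on disk.
    HEADER_GZ = gzip.compress(b"<items>")
    FOOTER_GZ = gzip.compress(b"</items>")

    def build_document_gz(item_paths):
        parts = [HEADER_GZ]
        for path in item_paths:
            with open(path, "rb") as f:   # raw gzip bytes, no decoding
                parts.append(f.read())
        parts.append(FOOTER_GZ)
        # Back-to-back gzip members form a valid gzip stream (RFC 1952).
        return b"".join(parts)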

This "worked" in some respects - downloading the file to disk using
fetch, then gzcat+md5 confirmed that the uncompressed response was
bit-for-bit, but all "real" web clients I gave it to (firefox, chrome,
libcurl) would only see the first chunk - the header, where as gzcat
sees all the chunks.

So, my questions are two-fold:

1) Is there something in the gzip file header which makes this
approach a no-go? (A quick check is sketched below.)
2) Is there any approach in stock httpd that could assemble docs like
this (if it is even possible), or would I be looking at a custom
module?
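
On question 1, a quick check - at least Python's gzip module (like
gzcat) happily decodes back-to-back members as one stream, so the
format itself does not seem to be the problem:

    import gzip

    chunks = [b"<items>", b"<item>1</item>", b"<item>2</item>", b"</items>"]
    stream = b"".join(gzip.compress(c) for c in chunks)

    # Multi-member gzip data decompresses to the concatenated plaintext.
    assert gzip.decompress(stream) == b"".join(chunks)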

I appreciate that only the second one is really on topic here :)

Cheers

Tom



