Chunked + Gzip

Michael Caplan <michael@xxxxxxxxxxx> · Sat, 13 Sep 2008 11:02:11 -0300

Hi,

I have a question about how Transfer-Encoding: chunked works with
Content-Encoding: gzip.  Having read the HTTP 1.1 RFC,
http://en.wikipedia.org/wiki/Chunked_transfer_encoding and other
discussions on the net that touch on this subject, I'm a little confused
about how the web server and browser client handle preparing and reading
the data.

The RFC isn't clear on this point (or at least I'm not finding the right
information), but what I have gathered is that:

1. the gzip content encoding happens on the entire body before it is
   chunked.
2. the ungzipping happens on the entire body after it is dechunked.
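
The whole-body model in the two points above can be sketched as follows.
This is only an illustration in Python: chunk_encode and chunk_decode are
hypothetical helpers written for this example (not part of Apache, PHP or
any library), and chunk extensions and trailers are ignored.

```python
import gzip


def chunk_encode(payload: bytes, chunk_size: int) -> bytes:
    """Frame payload as HTTP/1.1 chunks: hex size, CRLF, data, CRLF."""
    out = bytearray()
    for start in range(0, len(payload), chunk_size):
        piece = payload[start:start + chunk_size]
        out += f"{len(piece):x}\r\n".encode("ascii") + piece + b"\r\n"
    out += b"0\r\n\r\n"  # zero-size chunk marks the end of the body
    return bytes(out)


def chunk_decode(framed: bytes) -> bytes:
    """Strip the chunk framing and return the reassembled body."""
    body = bytearray()
    pos = 0
    while True:
        eol = framed.index(b"\r\n", pos)
        size = int(framed[pos:eol], 16)
        if size == 0:
            return bytes(body)
        start = eol + 2
        body += framed[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


html = b"<html><head><title>Hi</title></head><body><h1>Hi</h1></body></html>"

# Sender side under this model: gzip the entire body, then chunk it.
wire = chunk_encode(gzip.compress(html), chunk_size=16)

# Receiver side under this model: dechunk everything, then ungzip.
restored = gzip.decompress(chunk_decode(wire))
```

Here the chunk boundaries are arbitrary 16-byte slices of the compressed
data; they carry no meaning for the gzip layer.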

If I have this right (which I don't think I do), the web server would
need to first dechunk data produced by a dynamic source (PHP) before
it can apply the gzip content encoding.  For example, mod_gzip would not
apply the content encoding until it had dechunked the data
(http://schroepl.net/projekte/mod_gzip/config.htm), and only then would
it deliver the result to the client.

Likewise, the client would only be able to begin interpreting the HTML
after receiving the entire chunked payload, dechunking it, and then
ungzipping it.

But that seems contrary to what Apache + mod_deflate actually does, as
well as my browser (Firefox).  For example, I can create a PHP script
that controls the chunks created by calling the flush() function:

<html>
   <head>
       <title>Hi</title>
       <link rel="stylesheet" href="style.css" type="text/css" media="all" />
   </head>
<?php flush(); sleep(5); ?>
   <body onload="loaded();">
       <h1>Hi</h1>
   </body>
</html>

The complete client-server exchange looks like this:

GET /samples/flush/test.php HTTP/1.1
Host: ***
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.16) Gecko/20080703 Mandriva/2.0.0.16-1.1mdv2008.1 (2008.1) Firefox/2.0.0.16 FirePHP/0.1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-ca,en;q=0.7,en-us;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

HTTP/1.1 200 OK
Date: Sat, 13 Sep 2008 13:03:26 GMT
Server: Apache
Vary: User-Agent,Accept-Encoding
Content-Encoding: gzip
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html

6c

..........D.. .0..{..<.Y 

I...<..A....AI$.../.g.5...g..a.&...x...T...h.b..d...../.\

.='..ns..X$.N.L.;].......
32
....S*...r..Sl.@xxxxxxxxxxx`..).n..;......F.L=....
0
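
For reference, the chunk-size lines in the response above (6c, 32 and the
terminating 0) are hexadecimal byte counts.  A small Python sketch to
convert them; chunk_sizes is a hypothetical helper name used only here:

```python
def chunk_sizes(size_lines):
    """Convert hexadecimal chunk-size lines into byte counts."""
    return [int(line.strip(), 16) for line in size_lines]


sizes = chunk_sizes(["6c", "32", "0"])
print(sizes, sum(sizes))  # [108, 50, 0] 158
```

So the first chunk carries 108 bytes of gzip data and the second carries
50 bytes.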

If I have Apache + mod_deflate configured to gzip the output, the
chunks created reflect where I flush in the PHP script -- in this case
two chunks -- one for the HTML head, and another for the body.  If I put
in an artificial delay that holds back the delivery of the second
chunk (as I did above with sleep(5)), I can also see two other
interesting things.

1. The first gzip-compressed chunk is delivered independently of the
   second chunk (which comes 5 seconds later).  This indicates the
   gzipping is happening chunk by chunk, not on the entire body at once.

2. The browser receives the first gzip chunk and is able to interpret
   it _before_ it gets the entire payload and dechunks it all.  I say
   this because while the browser is waiting on the second chunk, it
   downloads the referenced CSS file.

This seems to fly in the face of what I've read on how it is supposed to
work.  Instead, it appears to be working like this:

1. the gzip content encoding happens on the entire body, chunk by chunk.
2. the ungzipping happens on the entire body, chunk by chunk.
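
One way the chunk-by-chunk behaviour described above could arise is
sketched below, assuming the server keeps a single gzip stream for the
whole response and flushes its compressor at every flush point.  That is
an assumption made for illustration, not something confirmed from
mod_deflate's source.  The sketch uses Python's zlib module, and the
parts list is made-up sample data.

```python
import zlib

GZIP_WBITS = 16 + zlib.MAX_WBITS  # ask zlib for gzip framing

parts = [
    b"<html><head><title>Hi</title></head>",
    b"<body><h1>Hi</h1></body></html>",
]

# Sender: one compressor for the whole response, flushed after each part
# so that everything compressed so far can be sent immediately.
compressor = zlib.compressobj(wbits=GZIP_WBITS)
chunks = []
for part in parts:
    chunks.append(compressor.compress(part) + compressor.flush(zlib.Z_SYNC_FLUSH))
chunks.append(compressor.flush())  # finish the stream and emit the gzip trailer

# Receiver: one decompressor, fed chunk by chunk as the data arrives.
decompressor = zlib.decompressobj(wbits=GZIP_WBITS)
decoded = [decompressor.decompress(chunk) for chunk in chunks]
```

In this sketch each element of chunks would become the data of one HTTP
chunk.  The receiver recovers the first part of the HTML before the
second has even been produced, yet all the pieces together still form
one gzip stream covering the entire body.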

Does this behavior mean I am misinterpreting the HTTP RFC, or is the
implementation not compliant?  Can anyone shed some light on the
subject?

Thanks,

Mike

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.