Hi,
We use Apache's mod_proxy to reverse-proxy a web application that sends back large XML responses whose lengths are unknown in advance. IOW, the responses sent by the application (aka the backend) use the chunked transfer encoding. We configured mod_proxy to use chunked coding on the frontend as well, via "SetEnv proxy-sendchunked".
If the application runs into an error condition while it is in the middle of writing the response body, the HTTP status of 200 has already been sent. The only way to communicate that error downstream is to close the connection without sending the terminating zero-length chunk. This works well if we point the user agent, e.g. curl, directly at the web application. Curl properly detects the incomplete response and exits with a non-zero status code, despite the HTTP status code of 200. If we point the curl at Apache, this stops working and curl exits with 0, falsely indicating success. We believe that the premature closing of the connection by the backend goes unnoticed somewhere between ap_http_filter() and ap_proxy_http_process_response(). Consequently, ap_http_chunk_filter() terminates the frontend response with a zero-length chunk, even though the backend response wasn't terminated by one.
IOW, an incomplete response body on the backend is turned into a complete response body on the frontend. We are reasonably confident that this is in violation of RFC 2616, sections 3.4 and 3.6.1. I captured the backend and frontend HTTP exchanges in [4] and [5]. Note the terminating 0 chunk in the frontend exchange [5] that is missing from the backend exchange [4]. Interestingly, ap_http_chunk_filter() is already able to handle dropped backend connections but that particular code path isn't activated because no error bucket with HTTP_BAD_GATEWAY is ever add to the brigade.
I am unsure, at this point, how to fix this properly. Mostly because I don't know my way around Apache's inner workings. I do have a patch [1] that fixes this behavior in our particular case but I am not sure whether 1) it catches all such conditions or 2) it doesn't severely break other cases. I wrote a small Python HTTP server [2] that emulates our web application's behavior. It needs Tornado "pip install tornado" and can be run with "python server.py 8881". The relevant section of our httpd.conf is at [3].
This occurs in 2.2.15 and 2.2.25. The patch is against the latter. I didn't try 2.4.x. For the sake of completeness the curl invocation reads
or
to hit the backend directly.
--
Hannes Schmidt
Software Application Developer
Data Migration Engineer
Cancer Genomics Hub
University of California, Santa Cruz
(206) 696-2316 (cell)
hannes@xxxxxxxx