Hi,

It's been 10 years since my last message to this mailing list, and I'm
happy to join it again :-)

I've encountered a surprising phenomenon with Apache's mod_cgi, which
unnecessarily slows it down for huge outputs, and as a "bonus" also
makes Apache take up huge amounts of memory:

I have a CGI program which very quickly writes 512 MB of output. When I
use it in Apache, Apache itself (NOT the CGI process!) grows by 512 MB
(!). I was really surprised by this, because ideally Apache should
hardly grow at all: at most, it should be reading modest-sized buffers
from the CGI script and writing them back to the socket.

I looked at the httpd code and discovered (if I understand correctly)
that:

1. As I already guessed, Apache doesn't let the CGI write directly to
   the socket, but rather asks it to write to a pipe, which Apache then
   reads.

2. When Apache reads this data from the pipe, it doesn't write it to
   the socket directly, but rather just adds it to a "bucket brigade"
   which collects more and more data.

It appears there is no flow control in this process: if the CGI outputs
faster than we can send to the network, the bucket brigade becomes
longer and longer. With 512 MB of output generated quickly, up to 512 MB
of buffers are allocated, and much of that is only processed and freed
at the end. The peak memory usage, then, is 512 MB, and this is also the
process's memory usage when everything ends (because Apache doesn't
return this memory to the system).

I confirmed that this is indeed a flow-control problem by changing the
CGI to sleep for 1 second after outputting each 64 MB (i.e., 8 batches
of 64 MB output). Now the memory usage was around 64 MB, not 512 MB,
because Apache had the time to output each batch and free its memory
before the next batch came.
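
The throttled test described above can be sketched roughly as follows.
This is an illustrative Python sketch, not the original CGI program: the
function name emit_batches, the 64 KB chunk size, the filler byte and
the content type are all assumptions made for the example. Setting
pause_seconds to 0 gives the unthrottled variant that writes all 512 MB
as fast as it can.

```python
#!/usr/bin/env python3
"""Sketch of a CGI that emits a large body in batches, pausing between them."""
import os
import sys
import time

MB = 1024 * 1024


def emit_batches(out, total_bytes=512 * MB, batch_bytes=64 * MB,
                 pause_seconds=1.0, chunk_bytes=64 * 1024, sleep=time.sleep):
    """Write total_bytes of filler to `out`, sleeping after each batch.

    Returns the number of pauses taken. All parameter names and defaults
    are hypothetical; only the 512 MB / 64 MB / 1 s figures come from the
    experiment described in the text.
    """
    chunk = b"x" * chunk_bytes
    written = 0
    pauses = 0
    while written < total_bytes:
        batch_end = min(written + batch_bytes, total_bytes)
        # Write one batch in modest-sized chunks.
        while written < batch_end:
            n = min(chunk_bytes, batch_end - written)
            out.write(chunk[:n])
            written += n
        out.flush()
        # Give the server time to drain this batch before the next one.
        if pause_seconds > 0:
            sleep(pause_seconds)
            pauses += 1
    return pauses


def main():
    out = sys.stdout.buffer
    # CGI response: header lines, a blank line, then the body.
    out.write(b"Content-Type: application/octet-stream\r\n\r\n")
    emit_batches(out)


# Only produce output when actually invoked as a CGI; the web server
# sets the GATEWAY_INTERFACE variable in that case.
if __name__ == "__main__" and os.environ.get("GATEWAY_INTERFACE"):
    main()
```

Watching the resident size of the httpd process (for example with top)
while a client downloads slowly from this script should show whether the
server's memory tracks the batch size or the total output size.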

By the way, the growth of the Apache process by 512 MB is only the start
of the problem. Not only does every *process* grow by 512 MB; in the
worker MPM, every *thread* grows by 512 MB as well, because apparently
(?) Apache's memory pools are separate for different threads, so the
512 MB freed by one thread is not reused by a different thread. In my
default setup of 25 threads, all of the machine's memory and swap space
was consumed :(

So now I guess my questions are:

1. Has anyone ever thought of doing a "direct CGI" module, where the CGI
   script writes directly to the socket, not to Apache's pipe, forgoing
   any copying, buffering or filtering in Apache? Does something like
   this already exist? Is "NPH" relevant here?

2. Even if we do want Apache's output filtering capabilities, are there
   really no flow control capabilities? Can we tell Apache not to read
   more input (i.e., the CGI's output) if the bucket brigade is larger
   than some predefined size (e.g., 1 MB)?

3. Some of you may think that CGI is antiquated, and I shouldn't be
   using it - but I do have good reasons to use it ;-) But I wonder (I
   didn't test) - is this problem specific to CGI? What happens when we
   serve a huge disk *file*, and we can read it faster than we can send
   it - does the bucket brigade also grow indefinitely?

Thanks,
Nadav.

-- 
Nadav Har'El                        | Monday, Jul 16 2012, 26 Tammuz 5772
nyh@xxxxxxxxxxxxxxxxxxx             |-----------------------------------------
Phone +972-523-790466, ICQ 13349191 |Unix is user friendly - it's just picky
http://nadav.harel.org.il           |about its friends.