Re: Intermittent 500 responses in access.log without corresponding entries in error.log

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/22/2012 12:45 PM, Ben Johnson wrote:
> Hello,
> 
> I'm struggling to debug an issue that seems to affect two different and
> unrelated PHP scripts on a given website.
> 
> The essence of the issue is that the scripts in question (and only those
> scripts) return status code 500 responses, but only some small
> percentage of the time.
> 
> The effective php.ini is configured to report nearly all types of
> messages (error_reporting = 6143), yet nothing is logged when this
> occurs. I have confirmed that other types of PHP errors are indeed
> logged, which leads me to believe that PHP is not causing these 500
> responses, directly.
> 
> Here's a sample Apache log entry, from the vhost-specific log file,
> e.g., /var/www/example.com/log/access.log:
> 
> X.X.X.X - - [22/Oct/2012:07:37:29 -0700] "GET
> /auxiliary/css.php?file=global.css HTTP/1.1" 500 15501
> "http://example.com/product/"; "Mozilla/5.0 (compatible; MSIE 9.0;
> Windows NT 6.1; WOW64; Trident/5.0)"
> 
> More often than not, I see a subsequent log entry with the same
> timestamp and URL, but this time, the status code is 200 (instead of
> 500). Sometimes a couple of seconds elapse between the 500 response (and
> often there is more than one consecutive 500 response) and the 200
> response. Here is an example, given the above entry:
> 
> X.X.X.X - - [22/Oct/2012:07:37:29 -0700] "GET
> /auxiliary/css.php?file=global.css HTTP/1.1" 200 15481
> "http://example.com/product/"; "Mozilla/5.0 (compatible; MSIE 9.0;
> Windows NT 6.1; WOW64; Trident/5.0)"
> 
> There are two differences between these entries: the status code and the
> content length (that's the integer value immediately after the status
> code, right?). With a 500 response, the length is 15501. With a 200
> response, the length is 15481. Does this tell us anything useful? In
> php.ini, display_errors = off, so if a PHP error was the cause, I would
> expect the content length to be *smaller* when the error occurred (given
> that no error message is appended to the output), not larger.
> 
> Given that there is only one reference to the URL
> "/auxiliary/css.php?file=global.css" in my source HTML, why are there
> two entries in access.log?
> 
> Is Apache automatically re-requesting the file, internally, when a 500
> error occurs? This seems unlikely to me, but I don't know how else to
> interpret this. Also, it seems unlikely that the user-agent could make
> two separate requests in such rapid succession (within the same second),
> especially before it has any way to know that the first request resulted
> in a 500 response.
> 
> Another curious fact is that the corresponding entries in the
> server-wide log file (/var/log/apache2/other_vhosts_access.log) show
> zero for the content length:
> 
> example.com:80 X.X.X.X - - [22/Oct/2012:07:37:29 -0700] "GET
> /auxiliary/css.php?file=global.css HTTP/1.1" 500 0
> "http://example.com/product/"; "Mozilla/5.0 (compatible; MSIE 9.0;
> Windows NT 6.1; WOW64; Trident/5.0)"
> 
> Here are my log-related Apache directives:
> 
> -----------------------------
> LogLevel warn
> 
> #
> # The following directives define some format nicknames for use with
> # a CustomLog directive (see below).
> # If you are behind a reverse proxy, you might want to change %h into
> %{X-Forwarded-For}i
> #
> LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\"
> \"%{User-Agent}i\"" vhost_combined
> LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\""
> combined
> LogFormat "%h %l %u %t \"%r\" %>s %O" common
> LogFormat "%{Referer}i -> %U" referer
> LogFormat "%{User-agent}i" agent
> 
> #
> # Define an access log for VirtualHosts that don't define their own logfile
> CustomLog /var/log/apache2/other_vhosts_access.log vhost_combined
> -----------------------------
> 
> Is there any means by which to acquire more information about the root
> cause of a 500 response? Should I increase the LogLevel value? Or is
> that pointless here?
> 
> As stated above, there isn't a single entry in the vhost-specific Apache
> error log (to which all PHP errors are logged) for the same 24-hour
> period in question, so I don't expect PHP to be able to tell us anything
> more.
> 
> Maybe somebody else is able to derive additional meaning from the log
> entries and content lengths.
> 
> Apache/2.2.14 (Ubuntu)
> PHP 5.3.2-1ubuntu4.18
> Server API CGI/FastCGI
> 
> Thanks for any help!
> 
> -Ben

A follow-up to my previous message.

I found an explanation that seems to fit some of the symptoms at
https://httpd.apache.org/mod_fcgid/mod/mod_fcgid.html :

----------------------------
Special PHP considerations

By default, PHP FastCGI processes exit after handling 500 requests, and
they may exit after this module has already connected to the application
and sent the next request. When that occurs, an error will be logged and
500 Internal Server Error will be returned to the client. This PHP
behavior can be disabled by setting PHP_FCGI_MAX_REQUESTS to 0, but that
can be a problem if the PHP application leaks resources. Alternatively,
PHP_FCGI_MAX_REQUESTS can be set to a much higher value than the default
to reduce the frequency of this problem. FcgidMaxRequestsPerProcess can
be set to a value less than or equal to PHP_FCGI_MAX_REQUESTS to resolve
the problem.
----------------------------

PHP_FCGI_MAX_REQUESTS=5000 on the vhost in question. Per the above, does
this mean that I can expect some percentage of 5001st requests to
trigger a 500 error for which absolutely no additional information is
logged? That is somewhat disconcerting.

Also, this limitation does not explain the following in any obvious way:

1.) This occurs far too frequently for 5000 PHP requests to have been
serviced between each occurrence. This is a fairly low-traffic server
with a dozen or so vhosts.

2.) I would expect the intervals at which the error occurs to exhibit a
high degree of positive correlation with the traffic cycles on this
server, and while I lack sufficient data (as yet) to be able to make
that determination, I see no correlation at all.

3.) The intermittent 500 errors occur with two specific PHP files only.
Given the above explanation, even though these are two of the more
popular URLs on the site, I would expect to see more variation with
respect to the particular PHP file that was requested when the 500
occurred. For what it's worth, the referring URL (always another page on
the same site) varies widely, which seems expected.

Is it possible and perhaps even likely that this problem is observed
most commonly with concurrent requests from the same remote host?
Because the most commonly offending file is a CSS wrapper script that is
*always* requested almost immediately after whichever page includes it
in the HTML. However, this doesn't explain why the 500 response occurs
with the second file, which is a download wrapper script to serve-up
binary files that live outside Apache's document root.

I suppose the next step is to further increase PHP_FCGI_MAX_REQUESTS and
see if the frequency of these 500 responses decreases.

-Ben

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx



[Index of Archives]     [Open SSH Users]     [Linux ACPI]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Squid]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux