Fabio F. Gervasi Twitter twitter.com/centrometeo Facebook facebook.com/centrometeo | ||
Things you can do are trying to track down if there is any pattern - are these first request on a child, "nth" request on child etc... we add additional variables to access logs which include things such as PID/request no in PID, memory usage before and after etc. This allows us sometimes to see patterns in errors which aren't normally easy to see..
Saying that it may not help you here...
Another thing you can do with ab is check to see if it is OK under lower load?
Finally I find "siege" better than "ab" when it comes to testing as it is more configurable - you can send it a list of URLs - run test for "n-seconds" rather than "n-requests" which is more useful if you don't know how long things take to return!
James-- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
On 02/10/2016 09:02, Daniel wrote:
I would bet on your method of parsing php and the scripts being ran, in any case to be sure:Try the strace approach:
strace -o /tmp/outputfile -s 5000 httpd -X or strace -ff -F -s200 -o /tmp/strace.out -p PID
to try to find out what's really going on.
2016-09-30 20:19 GMT+02:00 Fabio F.Gervasi <fabio.gervasi70@xxxxxxxxx>:
Hi!
I have additional information:
/var/log/httpd/error_log, at "crash time": [Fri Sep 30 19:09:03.897325 2016] [mpm_event:trace4] [pid 30339:tid 139796798162688] event.c(930): socket reached timeout in lingering-close state
What do you think?
2016-09-30 14:47 GMT+02:00 Fabio F.Gervasi <fabio.gervasi70@xxxxxxxxx>:
Hi!
Thank you for your reply. I tried the following tests.
1) Using a little text file:# ab -k -c 100 -n 2000 localhost/test.txtThis is ApacheBench, Version 2.3 <$Revision: 1748469 $>Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)Completed 200 requestsCompleted 400 requestsCompleted 600 requestsCompleted 800 requestsCompleted 1000 requestsCompleted 1200 requestsCompleted 1400 requestsCompleted 1600 requestsCompleted 1800 requestsCompleted 2000 requestsFinished 2000 requests
Server Software: Apache/2.4.23Server Hostname: localhostServer Port: 80
Document Path: /test.txtDocument Length: 52 bytes
Concurrency Level: 100Time taken for tests: 2.521 secondsComplete requests: 2000Failed requests: 16(Connect: 0, Receive: 0, Length: 16, Exceptions: 0)Keep-Alive requests: 1984Total transferred: 950336 bytesHTML transferred: 103168 bytesRequests per second: 793.45 [#/sec] (mean)Time per request: 126.032 [ms] (mean)Time per request: 1.260 [ms] (mean, across all concurrent requests)Transfer rate: 368.18 [Kbytes/sec] received
Connection Times (ms)min mean[+/-sd] median maxConnect: 0 1 3.8 0 21Processing: 1 123 126.5 83 772Waiting: 0 123 126.7 83 772Total: 1 124 126.6 85 772
Percentage of the requests served within a certain time (ms)50% 8566% 15475% 20180% 22990% 30395% 37398% 45299% 547100% 772 (longest request)
I obtain 16 failed requests, but if I run more again I obtain a different number.
2) Using a big gif file:# ab -k -c 100 -n 2000 localhost/it-vis-animation.gifThis is ApacheBench, Version 2.3 <$Revision: 1748469 $>Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)Completed 200 requestsCompleted 400 requestsCompleted 600 requestsCompleted 800 requestsCompleted 1000 requestsCompleted 1200 requestsCompleted 1400 requestsCompleted 1600 requestsCompleted 1800 requestsCompleted 2000 requestsFinished 2000 requests
Server Software: Apache/2.4.23Server Hostname: localhostServer Port: 80
Document Path: /it-vis-animation.gifDocument Length: 8105309 bytes
Concurrency Level: 100Time taken for tests: 26.294 secondsComplete requests: 2000Failed requests: 0Keep-Alive requests: 2000Total transferred: 16211484000 bytesHTML transferred: 16210618000 bytesRequests per second: 76.06 [#/sec] (mean)Time per request: 1314.691 [ms] (mean)Time per request: 13.147 [ms] (mean, across all concurrent requests)Transfer rate: 602100.83 [Kbytes/sec] received
Connection Times (ms)min mean[+/-sd] median maxConnect: 0 1 4.0 0 22Processing: 404 1309 80.0 1298 1635Waiting: 5 22 32.4 10 207Total: 420 1310 80.6 1298 1635
Percentage of the requests served within a certain time (ms)50% 129866% 131575% 134280% 136090% 139795% 144098% 151399% 1530100% 1635 (longest request)
In this case I obtain always 0 failed!
3) Using a little png file:# ab -k -c 100 -n 2000 localhost/apple-touch-icon.pngThis is ApacheBench, Version 2.3 <$Revision: 1748469 $>Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)Completed 200 requestsCompleted 400 requestsCompleted 600 requestsCompleted 800 requestsCompleted 1000 requestsCompleted 1200 requestsCompleted 1400 requestsCompleted 1600 requestsCompleted 1800 requestsCompleted 2000 requestsFinished 2000 requests
Server Software: Apache/2.4.23Server Hostname: localhostServer Port: 80
Document Path: /apple-touch-icon.pngDocument Length: 12364 bytes
Concurrency Level: 100Time taken for tests: 2.576 secondsComplete requests: 2000Failed requests: 22(Connect: 0, Receive: 0, Length: 22, Exceptions: 0)Keep-Alive requests: 1978Total transferred: 25304554 bytesHTML transferred: 24455992 bytesRequests per second: 776.52 [#/sec] (mean)Time per request: 128.779 [ms] (mean)Time per request: 1.288 [ms] (mean, across all concurrent requests)Transfer rate: 9594.53 [Kbytes/sec] received
Connection Times (ms)min mean[+/-sd] median maxConnect: 0 1 3.9 0 20Processing: 0 125 112.0 101 563Waiting: 0 125 112.1 101 563Total: 0 126 112.6 102 563
Percentage of the requests served within a certain time (ms)50% 10266% 15975% 19680% 22090% 28295% 35498% 40199% 457100% 563 (longest request)
I get 22 failed requests, but if I run more again I obtain a different number.
What do you thing?
Thank you!
2016-09-30 13:50 GMT+02:00 Daniel <dferradal@xxxxxxxxx>:
hello,
the fact that all those 500 error reponses go with "X-Powered-By: PHP/5.6.26" tell me it is php giving those errors.
If you are still unsure, you could use the ab tool from Apache HTTPD, and request a specific static file (text, gif) hundreds of times to see if you get a 500 error from HTTPD.--
2016-09-30 8:59 GMT+02:00 Fabio F.Gervasi <fabio.gervasi70@xxxxxxxxx>:
Hi,
I'm not 100% sure that the real source of the problem is apache (could be php or other), but I'd like start from here as the only logs that provides me with information (with trace4 level) is apache log. No information appears from other logs.
I have:
- Linux 4.7.4-100.fc23.x86_64
- Server version: Apache/2.4.23 (Fedora)
- mysql Ver 15.1 Distrib 10.0.26-MariaDB, for Linux (x86_64) using readline 5.1
- PHP Version => 5.6.25
The problem: random internal server error 500. Random because, for example, if a page produces the error and do "reload" then it works. The same page may not work as well as work. Sometimes you can use the site for several minutes with no problems, then suddenly there is the error.
When error occurs, this is all that I get from the log file (with loglevel = trace4):
[...][Fri Sep 30 07:41:39.151052 2016] [rewrite:trace3] [pid 17957:tid 140241771534080] mod_rewrite.c(477): [client 151.25.206.200:38368] 151.25.206.200 - - [www.centrometeo.com/sid#55c3e3673400][rid#7f8c8000e9d0/subr ] [perdir /home/web/centrometeo.com/] applying pattern '.*' to uri 'index.php', referer: http://www.centrometeo.com/modeq elli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.151092 2016] [rewrite:trace1] [pid 17957:tid 140241771534080] mod_rewrite.c(477): [client 151.25.206.200:38368] 151.25.206.200 - - [www.centrometeo.com/sid#55c3e3673400][rid#7f8c8000e9d0/subr ] [perdir /home/web/centrometeo.com/] pass through /home/web/centrometeo.com/indeeq x.php , referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913003 2016] [core:trace4] [pid 17957:tid 140241771534080] util_script.c(567): [client 151.25.206.200:38368] Headers from script 'php.fcgi':, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913097 2016] [core:trace4] [pid 17957:tid 140241771534080] util_script.c(568): [client 151.25.206.200:38368] Status: 500 Internal Server Error, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913130 2016] [core:trace1] [pid 17957:tid 140241771534080] util_script.c(649): [client 151.25.206.200:38368] Status line from script 'php.fcgi': 500 Internal Server Error, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913162 2016] [core:trace4] [pid 17957:tid 140241771534080] util_script.c(568): [client 151.25.206.200:38368] X-Powered-By: PHP/5.6.26, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913191 2016] [core:trace4] [pid 17957:tid 140241771534080] util_script.c(568): [client 151.25.206.200:38368] P3P: CP="NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM", referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913218 2016] [core:trace4] [pid 17957:tid 140241771534080] util_script.c(568): [client 151.25.206.200:38368] Content-Encoding: gzip, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913246 2016] [core:trace4] [pid 17957:tid 140241771534080] util_script.c(568): [client 151.25.206.200:38368] Vary: Accept-Encoding, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913275 2016] [core:trace4] [pid 17957:tid 140241771534080] util_script.c(568): [client 151.25.206.200:38368] Content-type: text/html; charset=UTF-8, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913395 2016] [http:trace3] [pid 17957:tid 140241771534080] http_filters.c(1006): [client 151.25.206.200:38368] Response sent with status 500, headers:, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913432 2016] [http:trace4] [pid 17957:tid 140241771534080] http_filters.c(835): [client 151.25.206.200:38368] X-Powered-By: PHP/5.6.26, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913458 2016] [http:trace4] [pid 17957:tid 140241771534080] http_filters.c(835): [client 151.25.206.200:38368] P3P: CP=\\"NOI ADM DEV PSAi COM NAV OUR OTRo STP IND DEM\\", referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [Fri Sep 30 07:41:39.913481 2016] [http:trace4] [pid 17957:tid 140241771534080] http_filters.c(835): [client 151.25.206.200:38368] Content-Encoding: gzip, referer: http://www.centrometeo.com/modelli-numerici/modello-wrf-nmm/ 559-wrf-mslp-6hprec-ita [...]
Note the use of fcgi. Apache is indeed configured with event/php-fpm/FastCGI, but the same thing happens with "normal" prefork / php!
Until a few days ago it did not happen, it started after updating the system (kernel, apache, php, mariadb).I do not really know what to do.
Thank you very much.Fabio.
Daniel FerradalIT Specialist
email dferradal at gmail.comlinkedin es.linkedin.com/in/danielferradal
--
Daniel FerradalIT Specialist
email dferradal at gmail.comlinkedin es.linkedin.com/in/danielferradal