Re: [RadosGW] FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: read failed

Hi all, 


I believe my server already has the latest FastCGI module installed:
root@p01:/var/log# dpkg -l | grep cgi
ii  libapache2-mod-fastcgi            2.4.7~0910052141-2~bpo70+1.ceph   Apache 2 FastCGI module for long-running CGI scripts
ii  libfcgi0ldbl                      2.4.0-8.1                         Shared library of FastCGI
ii  python-scgi                       1.13-1ubuntu1                     Server-side implementation of the SCGI protocol

1) It happens only in the higher-concurrency tests (990+), where the failure ratio is about 10%. It never happened at concurrency below 960.
Concurrency: 990  
Count:  8974 ( 1026 error;     0 retries:  0.00%)  Average requests per second: 669.3
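
For reference, the "about 10%" figure can be checked from the summary line above. This is only a small sketch, and it assumes that "Count" reports successful requests, so that the total is the count plus the errors (the two do add up to a round 10000):

```python
# Figures copied from the ssbench summary line above.
succeeded = 8974
errors = 1026

# Assumption: "Count" is the number of successful requests,
# so the total number of requests is succeeded + errors.
total = succeeded + errors
failure_ratio = errors / total

print(f"total={total} failure_ratio={failure_ratio:.2%}")
```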


2) The client tool gets a 500 Internal Server Error for each failed request. There is no corresponding request entry in radosgw.log, so I think the external FastCGI server never received the request from Apache. Does that mean a single RadosGW process is limited to about 1000 concurrent connections? There is nothing interesting in either syslog or kern.log. CPU load is approximately 50%.
ClientException: Object PUT failed: http://192.168.2.51:80/swift/v1/ssbench_000072/1KB_002058 500 Internal Server Error  [first 60 chars of response] <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><he

access:192.168.2.40 - - [26/Dec/2013:02:26:09 -0800] "PUT /swift/v1/ssbench_000072/1KB_002058 HTTP/1.1" 500 745 "-" "-"
err:[Thu Dec 26 02:26:09 2013] [warn] FastCGI: 192.168.2.40 PUT http://192.168.2.51/swift/v1/ssbench_000072/1KB_002058 auth
err:[Thu Dec 26 02:26:09 2013] [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: read failed
err:[Thu Dec 26 02:26:09 2013] [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes) received from server "/var/www/s3gw.fcgi"
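
The correlation above was done by eye. A rough sketch of how it could be scripted is below. It assumes the log line shapes shown above (the "access:" and "err:" prefixes are tolerated but not required), matches only on client IP and timestamp to the second, and the function names are my own, not part of any Apache or Ceph tool:

```python
import re
from datetime import datetime

# Apache access log line, e.g.
# 192.168.2.40 - - [26/Dec/2013:02:26:09 -0800] "PUT /path HTTP/1.1" 500 745 ...
ACCESS_RE = re.compile(
    r'(?P<client>\d+\.\d+\.\d+\.\d+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)
# Apache error log line reporting an aborted FastCGI exchange.
ABORT_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\] \[error\] \[client (?P<client>[^\]]+)\] '
    r'.*FastCGI: comm with server "[^"]+" aborted'
)

def parse_access(line):
    m = ACCESS_RE.search(line)
    if not m:
        return None
    # Drop the UTC offset: the error log records local time without one.
    ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z").replace(tzinfo=None)
    return ts, m["client"], m["method"], m["path"], int(m["status"])

def parse_abort(line):
    m = ABORT_RE.search(line)
    if not m:
        return None
    ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
    return ts, m["client"]

def correlate(access_lines, error_lines):
    """For each 500 response, report whether a FastCGI abort was logged
    for the same client in the same second."""
    aborts = {a for a in map(parse_abort, error_lines) if a}
    result = []
    for rec in filter(None, map(parse_access, access_lines)):
        ts, client, method, path, status = rec
        if status == 500:
            result.append((method, path, (ts, client) in aborts))
    return result
```

Feeding it the lines of the access log and the error log returns one tuple per 500 response. With a single client IP and hundreds of requests per second the one-second match is coarse, so this shows only whether 500s and aborts occur together, not which abort belongs to which request.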



3) There are no waits in the OSD or RGW perf dump output.

4) Am I using the wrong FastCGI module?

The good news is that RadosGW can handle 500+ concurrent requests now, but I believe it can do better than 900+. The CPU load is still low, though.
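
On the question in point 2 about a limit near 1000 connections: one thing that could be checked is the open-file limit of the running radosgw and Apache processes, since a soft limit of 1024 is a common default. This is only a guess at a possible cause, not a diagnosis. The sketch below reads `/proc/<pid>/limits` on Linux; the helper names are made up for illustration:

```python
def _to_limit(value):
    # /proc reports either a number or the word "unlimited".
    return None if value == "unlimited" else int(value)

def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row, or None if absent."""
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            soft, hard = line[len(prefix):].split()[:2]
            return _to_limit(soft), _to_limit(hard)
    return None

def max_open_files_for_pid(pid):
    # Linux only: reads the limits of an already running process.
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())
```

Calling `max_open_files_for_pid()` with the PID of the radosgw process returns the limits in effect for that process, which can differ from what `ulimit -n` shows in a login shell.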

Any help is appreciated.


2013/12/26 Yehuda Sadeh <yehuda@xxxxxxxxxxx>
On Wed, Dec 25, 2013 at 9:12 AM, Kuo Hugo <tonytkdk@xxxxxxxxx> wrote:
> Hi folks,
>
> I'm in the process of tuning the performance of RadosGW on my server. After
> some kind help from you all, I worked out several areas to optimize so that
> RadosGW can handle more concurrent requests from users:
>
> Apache optimization #
> radosgw open file #
> rgw thread pools #
> rgw_ops throttle #
> objecter_inflight_op_bytes
> objecter_inflight_ops
> etc....
>
> It's a powerful server with 32 CPU threads + 62GB RAM. But I've run into a
> problem for which the admin sockets give no clue.
>
> What is the meaning of the following FastCGI error in Apache's error.log? It
> happens on both PUT and DELETE requests.
> There are no waiting ops in the OSDs or RadosGW. Is there any way to improve this?
>
> I'm not sure yet whether the connection reset was raised by Apache or by FastCGI.
>
>  [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000045/1KB_025787 auth
>  [error] [client 192.168.2.40]
>  [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm
> with server "/var/www/s3gw.fcgi" aborted: read failed
>  [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
> received from server "/var/www/s3gw.fcgi"
>  [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000040/1KB_025788 auth
>  [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000021/1KB_025685 auth
>  [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000047/1KB_025790 auth
>  [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm
> with server "/var/www/s3gw.fcgi" aborted: read failed
>  [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
> received from server "/var/www/s3gw.fcgi"
>
>  [warn] FastCGI: 192.168.2.40 DELETE
> http://192.168.2.51/swift/v1/ssbench_000006/1KB_012286 auth
>  [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm
> with server "/var/www/s3gw.fcgi" aborted: read failed
>  [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
> received from server "/var/www/s3gw.fcgi"
>  [warn] FastCGI: 192.168.2.40 DELETE
> http://192.168.2.51/swift/v1/ssbench_000061/1KB_012168 auth
>
>  [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm
> with server "/var/www/s3gw.fcgi" aborted: read failed
>
>
>


Can you correlate these with the apache access log and with the
radosgw log? (e.g., do you get 500 responses?). It could happen if
you're using the wrong fastcgi module, or if the requests are too slow
to respond and apache is timing out.

Yehuda
