Hi all,
I think the FastCGI module is the latest one on my server.
root@p01:/var/log# dpkg -l | grep cgi
ii libapache2-mod-fastcgi 2.4.7~0910052141-2~bpo70+1.ceph Apache 2 FastCGI module for long-running CGI scripts
ii libfcgi0ldbl 2.4.0-8.1 Shared library of FastCGI
ii python-scgi 1.13-1ubuntu1 Server-side implementation of the SCGI protocol
1) It happens only in the higher-concurrency (990+) tests. The failure ratio is about 10%. It never happened at concurrency below 960.
Concurrency: 990
Count: 8974 ( 1026 error; 0 retries: 0.00%) Average requests per second: 669.3
2) The client tool gets a 500 Internal Server Error for each failed request. There is no relevant request entry in radosgw.log, so I think the external FastCGI server did not get the request from Apache. Does that mean a single radosgw process is limited to about 1000 concurrent connections? There is nothing interesting in either syslog or kern.log. The CPU load is approximately 50%.
ClientException: Object PUT failed: http://192.168.2.51:80/swift/v1/ssbench_000072/1KB_002058 500 Internal Server Error [first 60 chars of response] <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><html><he
access: 192.168.2.40 - - [26/Dec/2013:02:26:09 -0800] "PUT /swift/v1/ssbench_000072/1KB_002058 HTTP/1.1" 500 745 "-" "-"
err: [Thu Dec 26 02:26:09 2013] [warn] FastCGI: 192.168.2.40 PUT http://192.168.2.51/swift/v1/ssbench_000072/1KB_002058 auth
err: [Thu Dec 26 02:26:09 2013] [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: read failed
err: [Thu Dec 26 02:26:09 2013] [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes) received from server "/var/www/s3gw.fcgi"
3) No waits show up in the OSD or RGW perf dumps.
4) Am I using the wrong FastCGI module?
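To see how the two FastCGI errors from item 2 are distributed over time, the Apache error log can be tallied per second and per error type. This is only a minimal sketch: the sample lines are the ones pasted in item 2 (without the "err:" prefix), and the category names and the helper functions `classify` and `count_errors` are my own, not part of Apache or mod_fastcgi.

```python
import re
from collections import Counter

# Sample lines copied from the error.log excerpt in item 2.
SAMPLE = """\
[Thu Dec 26 02:26:09 2013] [warn] FastCGI: 192.168.2.40 PUT http://192.168.2.51/swift/v1/ssbench_000072/1KB_002058 auth
[Thu Dec 26 02:26:09 2013] [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: read failed
[Thu Dec 26 02:26:09 2013] [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes) received from server "/var/www/s3gw.fcgi"
"""

# "[timestamp] [level] message" as seen in the excerpt above.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] (?P<msg>.*)$")


def classify(msg):
    """Map an error message to a coarse category (hypothetical labels)."""
    if "Connection reset by peer" in msg:
        return "connection reset"
    if "incomplete headers" in msg:
        return "incomplete headers"
    return "other"


def count_errors(text):
    """Count [error] lines per (timestamp, category); [warn] lines are skipped."""
    counts = Counter()
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m and m.group("level") == "error":
            counts[(m.group("ts"), classify(m.group("msg")))] += 1
    return counts


for (ts, kind), n in sorted(count_errors(SAMPLE).items()):
    print(ts, kind, n)
```

Running it over the full error.log instead of SAMPLE would show whether the resets come in bursts or are spread evenly through the test.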
The good news is that RadosGW can handle 500+ concurrency now. But I believe it can do better than 900+. The CPU load is still low, though.
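Since the failures start just below 1000 concurrent connections, one thing that could be checked is whether the radosgw process is close to its open-file limit while the test runs. The sketch below assumes a Linux host with /proc mounted; it is only a diagnostic idea, not a confirmed cause, and the helper names `parse_max_open_files` and `fd_usage` are hypothetical.

```python
import os


def parse_max_open_files(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the limit is 'unlimited' or the line is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Columns: "Max open files", soft limit, hard limit, units
            soft = line.split()[3]
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open_fd_count, soft_limit) for a process, read from /proc."""
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        soft_limit = parse_max_open_files(f.read())
    return open_fds, soft_limit


# Demo on the current process; for the real check, pass the radosgw PID
# instead (reading another user's /proc/<pid>/fd usually needs root).
if os.path.isdir("/proc/self/fd"):
    print(fd_usage(os.getpid()))
```

If the open-fd count stays far below the limit during a 990-concurrency run, this limit can be ruled out and the FastCGI module or Apache timeouts become the more likely suspects.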
Much appreciated.
2013/12/26 Yehuda Sadeh <yehuda@xxxxxxxxxxx>
Can you correlate these with the apache access log and with the
On Wed, Dec 25, 2013 at 9:12 AM, Kuo Hugo <tonytkdk@xxxxxxxxx> wrote:
> Hi folks,
>
> I'm in the process of tuning the performance of RadosGW on my server. With
> some kind help from you guys, I've figured out several things to optimize so
> that RadosGW can handle higher-concurrency requests from users.
>
> Apache optimization #
> radosgw open file #
> rgw thread pools #
> rgw_ops throttle #
> objecter_inflight_op_bytes
> objecter_inflight_ops
> etc....
>
> It's a powerful server with 32 CPU threads + 62GB RAM. But I've run into a
> problem for which the admin sockets give no clue.
>
> What's the meaning of the following FastCGI error in Apache's error.log? It
> happens on both PUT and DELETE requests.
> No op is waiting in the OSDs or RadosGW. Is there any way to improve it?
>
> I'm not sure yet whether the connection reset was raised by Apache or FastCGI.
>
> [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000045/1KB_025787 auth
> [error] [client 192.168.2.40]
> [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm
> with server "/var/www/s3gw.fcgi" aborted: read failed
> [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
> received from server "/var/www/s3gw.fcgi"
> [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000040/1KB_025788 auth
> [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000021/1KB_025685 auth
> [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000047/1KB_025790 auth
> [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm
> with server "/var/www/s3gw.fcgi" aborted: read failed
> [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
> received from server "/var/www/s3gw.fcgi"
>
> [warn] FastCGI: 192.168.2.40 DELETE
> http://192.168.2.51/swift/v1/ssbench_000006/1KB_012286 auth
> [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm
> with server "/var/www/s3gw.fcgi" aborted: read failed
> [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
> received from server "/var/www/s3gw.fcgi"
> [warn] FastCGI: 192.168.2.40 DELETE
> http://192.168.2.51/swift/v1/ssbench_000061/1KB_012168 auth
>
> [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI: comm
> with server "/var/www/s3gw.fcgi" aborted: read failed
>
>
>
radosgw log? (e.g., do you get 500 responses?). It could happen if
you're using the wrong fastcgi module, or if the requests are too slow
to respond and apache is timing out.
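A rough way to do that correlation is to pull every 5xx entry out of the Apache access log and check whether its request path shows up anywhere in the radosgw log. This is a minimal sketch with two assumptions: the access log uses the combined format seen in this thread, and the radosgw log contains the request path verbatim when the request reaches it. The function names are hypothetical.

```python
import re

# Combined-format access log line, as pasted earlier in the thread.
ACCESS_RE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def failed_requests(access_lines):
    """Yield (timestamp, method, path) for every 5xx access log entry."""
    for line in access_lines:
        m = ACCESS_RE.match(line)
        if m and m.group("status").startswith("5"):
            yield m.group("ts"), m.group("method"), m.group("path")


def missing_from_rgw(access_lines, rgw_log_text):
    """Return failed requests whose path never appears in the radosgw log."""
    return [req for req in failed_requests(access_lines)
            if req[2] not in rgw_log_text]


access = [
    '192.168.2.40 - - [26/Dec/2013:02:26:09 -0800] '
    '"PUT /swift/v1/ssbench_000072/1KB_002058 HTTP/1.1" 500 745 "-" "-"',
]
# An empty string stands in for a radosgw log with no matching entry.
for ts, method, path in missing_from_rgw(access, ""):
    print(ts, method, path)
```

Requests that fail in Apache but never appear in the radosgw log would point at the FastCGI layer; requests that do appear there but respond slowly would point at a timeout.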
Yehuda
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com