Re: [RadosGW] FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: read failed

On Thu, Dec 26, 2013 at 3:00 AM, Kuo Hugo <tonytkdk@xxxxxxxxx> wrote:
> Hi all,
>
>
> I think the FastCGI module is the latest one on my server.
>
> root@p01:/var/log# dpkg -l | grep cgi
> ii  libapache2-mod-fastcgi  2.4.7~0910052141-2~bpo70+1.ceph  Apache 2 FastCGI module for long-running CGI scripts
> ii  libfcgi0ldbl            2.4.0-8.1                        Shared library of FastCGI
> ii  python-scgi             1.13-1ubuntu1                    Server-side implementation of the SCGI protocol
>
>
> 1) It happens only in tests at higher concurrency (990+), where about 10% of
> requests fail. It never happened at concurrency below 960.
>
> Concurrency: 990
> Count:  8974 ( 1026 error;     0 retries:  0.00%)  Average requests per
> second: 669.3
>
>
>
> 2) The client tool gets a 500 Internal Server Error for each failed request.
> There is no corresponding request in radosgw.log, so I think the external
> FastCGI server never received the request from Apache. Does that mean a
> single radosgw process is limited to 1000 concurrent connections? There is
> nothing interesting in syslog or kern.log. CPU load is approximately 50%.

No, it doesn't. It means that you have some issue in your environment.
Could be some kind of limit (max fds, apache concurrent connections,
socket backlog). There's a good chance you're hitting a problem in
libfcgi, which used to use select() instead of poll() and broke once a
file descriptor number reached 1024 (the usual FD_SETSIZE). A newer
version that fixes it exists for Ubuntu (try 2.4.0-8.1ubuntu3).
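
As a rough illustration of the select() limit mentioned above (not taken
from libfcgi itself), the following Python sketch duplicates a pipe onto a
descriptor numbered 1024 or higher and then tries both calls on it. The
name probe_high_fd and the FD_SETSIZE value of 1024 are assumptions made
for this example. Python rejects the out-of-range descriptor with a
ValueError, whereas C code calling FD_SET on it has undefined behaviour.

```python
import fcntl
import os
import resource
import select

FD_SETSIZE = 1024  # assumed: the usual glibc value on Linux; Python does not expose it


def probe_high_fd(min_fd=FD_SETSIZE):
    """Try select() and poll() on a descriptor numbered >= min_fd.

    Returns a dict describing the outcome, or None when RLIMIT_NOFILE is too
    low for this process to hold a descriptor that high.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != resource.RLIM_INFINITY and soft <= min_fd:
        return None

    r, w = os.pipe()
    # F_DUPFD returns the lowest free descriptor that is >= min_fd.
    high = fcntl.fcntl(r, fcntl.F_DUPFD, min_fd)
    try:
        os.write(w, b"x")  # make the read end readable
        try:
            select.select([high], [], [], 0)
            select_ok = True
        except ValueError:
            # Python refuses descriptors that do not fit in an fd_set.
            select_ok = False

        poller = select.poll()
        poller.register(high, select.POLLIN)
        poll_ok = any(fd == high for fd, _event in poller.poll(0))
        return {"fd": high, "select_ok": select_ok, "poll_ok": poll_ok}
    finally:
        for fd in (high, r, w):
            os.close(fd)


if __name__ == "__main__":
    print(resource.getrlimit(resource.RLIMIT_NOFILE))
    print(probe_high_fd())
```

If the soft open-file limit is 1024 or lower, the function returns None.
That is itself worth checking for the radosgw and apache processes, since a
process under that limit cannot hold more than 1024 descriptors at once.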

>
> ClientException: Object PUT failed:
> http://192.168.2.51:80/swift/v1/ssbench_000072/1KB_002058 500 Internal
> Server Error  [first 60 chars of response] <!DOCTYPE HTML PUBLIC
> "-//IETF//DTD HTML 2.0//EN">
> <html><he
>
> access:192.168.2.40 - - [26/Dec/2013:02:26:09 -0800] "PUT
> /swift/v1/ssbench_000072/1KB_002058 HTTP/1.1" 500 745 "-" "-"
> err:[Thu Dec 26 02:26:09 2013] [warn] FastCGI: 192.168.2.40 PUT
> http://192.168.2.51/swift/v1/ssbench_000072/1KB_002058 auth
> err:[Thu Dec 26 02:26:09 2013] [error] [client 192.168.2.40] (104)Connection
> reset by peer: FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: read
> failed
> err:[Thu Dec 26 02:26:09 2013] [error] [client 192.168.2.40] FastCGI:
> incomplete headers (0 bytes) received from server "/var/www/s3gw.fcgi"
>
>
>
>
> 3) There are no waits in the OSD or RGW perf dumps.
>
> 4) Am I using the wrong FastCGI module?

Don't think so, otherwise all PUTs would have failed.

Yehuda

>
> The good news is that RadosGW can now handle 500+ concurrent requests. But I
> believe it can do better than 900+. The CPU load is still low, though.
>
> Thanks!
>
>
> 2013/12/26 Yehuda Sadeh <yehuda@xxxxxxxxxxx>
>>
>> On Wed, Dec 25, 2013 at 9:12 AM, Kuo Hugo <tonytkdk@xxxxxxxxx> wrote:
>> > Hi folks,
>> >
>> > I'm tuning the performance of RadosGW on my server. With some kind help
>> > from you all, I have identified several areas to tune so that RadosGW
>> > can handle more concurrent requests from users:
>> >
>> > Apache optimization #
>> > radosgw open file #
>> > rgw thread pools #
>> > rgw_ops throttle #
>> > objecter_inflight_op_bytes
>> > objecter_inflight_ops
>> > etc....
>> >
>> > It's a powerful server with 32 CPU threads and 62GB of RAM. But I've run
>> > into a problem for which the admin sockets give no clue.
>> >
>> > What is the meaning of the following FastCGI error in Apache's error.log?
>> > It happens on both PUT and DELETE requests.
>> > There are no op waits in the OSDs or RadosGW. Is there any way to improve
>> > this?
>> >
>> > I'm not yet sure whether the connection reset was raised by Apache or by
>> > FastCGI.
>> >
>> >  [warn] FastCGI: 192.168.2.40 PUT
>> > http://192.168.2.51/swift/v1/ssbench_000045/1KB_025787 auth
>> >  [error] [client 192.168.2.40]
>> >  [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI:
>> > comm
>> > with server "/var/www/s3gw.fcgi" aborted: read failed
>> >  [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
>> > received from server "/var/www/s3gw.fcgi"
>> >  [warn] FastCGI: 192.168.2.40 PUT
>> > http://192.168.2.51/swift/v1/ssbench_000040/1KB_025788 auth
>> >  [warn] FastCGI: 192.168.2.40 PUT
>> > http://192.168.2.51/swift/v1/ssbench_000021/1KB_025685 auth
>> >  [warn] FastCGI: 192.168.2.40 PUT
>> > http://192.168.2.51/swift/v1/ssbench_000047/1KB_025790 auth
>> >  [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI:
>> > comm
>> > with server "/var/www/s3gw.fcgi" aborted: read failed
>> >  [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
>> > received from server "/var/www/s3gw.fcgi"
>> >
>> >  [warn] FastCGI: 192.168.2.40 DELETE
>> > http://192.168.2.51/swift/v1/ssbench_000006/1KB_012286 auth
>> >  [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI:
>> > comm
>> > with server "/var/www/s3gw.fcgi" aborted: read failed
>> >  [error] [client 192.168.2.40] FastCGI: incomplete headers (0 bytes)
>> > received from server "/var/www/s3gw.fcgi"
>> >  [warn] FastCGI: 192.168.2.40 DELETE
>> > http://192.168.2.51/swift/v1/ssbench_000061/1KB_012168 auth
>> >
>> >  [error] [client 192.168.2.40] (104)Connection reset by peer: FastCGI:
>> > comm
>> > with server "/var/www/s3gw.fcgi" aborted: read failed
>> >
>> >
>> >
>>
>>
>> Can you correlate these with the apache access log and with the
>> radosgw log? (e.g., do you get 500 responses?). It could happen if
>> you're using the wrong fastcgi module, or if the requests are too slow
>> to respond and apache is timing out.
>>
>> Yehuda
>
>
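
The correlation asked for in the quoted message above can be started from
the access log alone. This is a minimal Python sketch, assuming Apache's
combined log format as in the access line quoted earlier in this thread.
The function name failed_requests and the second sample line (a 201) are
made up for illustration.

```python
import re

# Matches the start of an Apache combined-format access log line.
ACCESS_RE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def failed_requests(lines, status="500"):
    """Yield (timestamp, method, path) for each line with the given status."""
    for line in lines:
        m = ACCESS_RE.match(line)
        if m and m.group("status") == status:
            yield m.group("ts"), m.group("method"), m.group("path")


sample = [
    # Taken from the access log excerpt in this thread.
    '192.168.2.40 - - [26/Dec/2013:02:26:09 -0800] '
    '"PUT /swift/v1/ssbench_000072/1KB_002058 HTTP/1.1" 500 745 "-" "-"',
    # Hypothetical successful request, for contrast only.
    '192.168.2.40 - - [26/Dec/2013:02:26:10 -0800] '
    '"PUT /swift/v1/ssbench_000001/1KB_000001 HTTP/1.1" 201 0 "-" "-"',
]

if __name__ == "__main__":
    for ts, method, path in failed_requests(sample):
        print(ts, method, path)
```

The timestamps and paths it yields can then be searched for in error.log
and radosgw.log to see whether each failed request ever reached radosgw.
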
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



