Re: RADOS Gateway Issues

Graeme Lambert <glambert@xxxxxxxxxxx> · Wed, 22 Jan 2014 16:38:29 +0000

    Hi,

      Following discussions with people on IRC, I set debug_ms, and
      this is what is logged over and over when one of the uploads is
      stuck:

      http://pastebin.com/KVcpAeYT

      Regarding the modules, the Apache version is 2.2.22-2precise.ceph and
      the FastCGI module version is 2.4.7~0910052141-2~bpo70+1.ceph.

        Best regards
        Graeme  

      On 22/01/14 16:28, Yehuda Sadeh wrote:

      On Wed, Jan 22, 2014 at 8:05 AM, Graeme Lambert <glambert@xxxxxxxxxxx> wrote:

        Hi,

I'm using the aws-sdk-for-php classes with the Ceph RADOS gateway, but I'm
getting an intermittent issue when uploading files.

I'm attempting to upload an array of objects to Ceph one by one using the
create_object() function.  The upload appears to stop at random: it could
stop at the first object, somewhere in between, or at the last one, and
there is no pattern to it that I can see.

I'm not getting any PHP errors that indicate an issue and equally there are
no exceptions being caught.

In the radosgw log file, at the time it appears stuck I get:

2014-01-22 15:39:21.656763 7fac44fe1700  1 ====== starting new request req=0x2417c30 =====

And then sometimes I see:

2014-01-22 15:40:42.490485 7fac99ff9700  1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7fac51ffb700' had timed out after 600

repeated over and over again.

When those messages are appearing, Apache's error log shows:

[Wed Jan 22 15:43:11 2014] [error] [client 172.16.2.149] FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: idle timeout (30 sec), referer: https://[server]/[path]

equally over and over again.
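
To see how often each of the two messages above occurs, and for which thread or FastCGI server, the logs can be tallied with a short script. This is only a rough sketch: it assumes each log entry sits on a single line (mail wrapping undone) and that the message wording matches the excerpts quoted in this thread. The function name summarize_timeouts and both regular expressions are illustrative, not part of Ceph or Apache.

```python
import re
from collections import Counter

# radosgw heartbeat warning, as quoted in this thread:
#   ... heartbeat_map is_healthy '<thread name>' had timed out after <seconds>
HEARTBEAT_RE = re.compile(
    r"heartbeat_map is_healthy\s+'(?P<thread>[^']+)' had timed out after (?P<secs>\d+)"
)

# Apache mod_fastcgi idle timeout, as quoted in this thread:
#   FastCGI: comm with server "<path>" aborted: idle timeout (<seconds> sec)
FASTCGI_RE = re.compile(
    r'FastCGI: comm with server "(?P<server>[^"]+)" aborted: '
    r"idle timeout \((?P<secs>\d+) sec\)"
)


def summarize_timeouts(lines):
    """Count heartbeat timeouts per thread and FastCGI idle timeouts per server.

    Hypothetical helper: 'lines' is any iterable of log lines, for example
    the radosgw log and the Apache error log read one after the other.
    """
    heartbeat = Counter()
    fastcgi = Counter()
    for line in lines:
        match = HEARTBEAT_RE.search(line)
        if match:
            heartbeat[match.group("thread")] += 1
            continue
        match = FASTCGI_RE.search(line)
        if match:
            fastcgi[match.group("server")] += 1
    return {"heartbeat": heartbeat, "fastcgi": fastcgi}
```

Feeding both log files through summarize_timeouts() shows whether the stalls always involve the same radosgw worker thread, which may help narrow down where requests are hanging.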

I have restarted apache, radosgw, all Ceph OSDs and ceph-mon processes and
still no joy with this.

Can anyone advise on where I'm going wrong with this?

      Which FastCGI module are you using? Can you provide a log with 'debug
ms = 1' for a failing request? Usually that kind of message means that
it's waiting for the OSD to respond, which might point to an
unhealthy cluster.

Yehuda
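
For reference, the 'debug ms = 1' setting requested above is normally placed in ceph.conf under the section the gateway runs as. A minimal sketch, assuming the gateway instance is named client.radosgw.gateway (that section name is an assumption; use whichever name your radosgw actually runs under):

```ini
[client.radosgw.gateway]
    ; section name is an assumption, adjust to your gateway instance
    debug ms = 1
```

radosgw needs a restart to pick up the change. Since the reply also raises cluster health, running 'ceph health detail' at the time of a stall is a quick way to check whether any OSDs or placement groups are reporting problems.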

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com