On Wed, Nov 27, 2013 at 4:46 AM, Sebastian <webmaster@xxxxxxxx> wrote:
> Hi,
>
> we have a setup of 4 servers running ceph and radosgw. We use it as an
> internal S3 service for our files. The servers run Debian Squeeze with
> Ceph 0.67.4.
>
> The cluster has been running smoothly for quite a while, but we are
> currently experiencing issues with the radosgw. For some files the HTTP
> download just stalls at around 500kb.
>
> The Apache error log just says:
> [error] [client ] FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: idle timeout (30 sec)
> [error] [client ] Handler for fastcgi-script returned invalid result code 1
>
> radosgw logging:
> 7f00bc66a700 1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f00934bb700' had timed out after 600
> 7f00bc66a700 1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f00ab4eb700' had timed out after 600
>
> The interesting thing is that the cluster health is fine and only some
> files are not working properly. Most of them just work fine. A restart
> of radosgw fixes the issue. The other ceph logs are also clean.
>
> Any idea why this happens?
>

No, but you can turn on 'debug ms = 1' in your gateway's ceph.conf, and
that might give a better indication.

Yehuda
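
For reference, a minimal sketch of where that setting could go. The section name below is an assumption (it depends on how the gateway instance is named in your ceph.conf); only the 'debug ms = 1' line comes from the suggestion above.

```ini
# Section name is hypothetical; use the section that matches your
# radosgw instance name in ceph.conf.
[client.radosgw.gateway]
    # Enable messenger-level logging, as suggested above.
    debug ms = 1
```

The gateway would most likely need a restart to pick up the change, and the extra output should then show up in the radosgw log.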