Hi, we have a setup of 4 Servers running ceph and radosgw. We use it as an internal S3 service for our files. The Servers run Debian Squeeze with Ceph 0.67.4. The cluster has been running smoothly for quite a while, but we are currently experiencing issues with the radosgw. For some files the HTTP Download just stalls at around 500kb. The Apache error log just says: [error] [client ] FastCGI: comm with server "/var/www/s3gw.fcgi" aborted: idle timeout (30 sec) [error] [client ] Handler for fastcgi-script returned invalid result code 1 radosgw logging: 7f00bc66a700 1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f00934bb700' had timed out after 600 7f00bc66a700 1 heartbeat_map is_healthy 'RGWProcess::m_tp thread 0x7f00ab4eb700' had timed out after 600 The interesting thing is that the cluster health is fine an only some files are not working properly. Most of them just work fine. A restart of radosgw fixes the issue. The other ceph logs are also clean. Any idea why this happens? Sebastian _______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com