Yehuda Sadeh <yehuda@xxxxxxxxxxx> wrote:
On Wed, Jan 22, 2014 at 8:55 AM, Graeme Lambert <glambert@xxxxxxxxxxx> wrote:
Hi Yehuda,
With regards to the health status of the cluster, it isn't healthy, but I
haven't found any way of fixing the placement group errors. Looking at
ceph health detail, it's also showing blocked requests:
HEALTH_WARN 1 pgs down; 3 pgs incomplete; 3 pgs stuck inactive; 3 pgs stuck
unclean; 7 requests are blocked > 32 sec; 3 osds have slow requests; pool
cloudstack has too few pgs; pool .rgw.buckets has too few pgs
pg 14.0 is stuck inactive since forever, current state incomplete, last
acting [5,0]
pg 14.2 is stuck inactive since forever, current state incomplete, last
acting [0,5]
pg 14.6 is stuck inactive since forever, current state down+incomplete, last
acting [4,2]
pg 14.0 is stuck unclean since forever, current state incomplete, last
acting [5,0]
pg 14.2 is stuck unclean since forever, current state incomplete, last
acting [0,5]
pg 14.6 is stuck unclean since forever, current state down+incomplete, last
acting [4,2]
pg 14.0 is incomplete, acting [5,0]
pg 14.2 is incomplete, acting [0,5]
pg 14.6 is down+incomplete, acting [4,2]
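For reference, the stuck placement groups above can be examined further with the standard ceph CLI. A minimal sketch, assuming the commands are run from a node with an admin keyring:

    # Summarise which PGs are stuck and in what state
    ceph pg dump_stuck inactive
    ceph pg dump_stuck unclean

    # Query one of the incomplete PGs for its peering state and what it is waiting on
    ceph pg 14.6 query

    # Confirm that the OSDs in the acting sets ([5,0], [0,5], [4,2]) are up and in
    ceph osd tree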
You should figure these out first before trying to get the gateway
working. They may very well be your culprit.
Yehuda
3 ops are blocked > 8388.61 sec
3 ops are blocked > 4194.3 sec
1 ops are blocked > 2097.15 sec
1 ops are blocked > 8388.61 sec on osd.0
1 ops are blocked > 4194.3 sec on osd.0
2 ops are blocked > 8388.61 sec on osd.4
2 ops are blocked > 4194.3 sec on osd.5
1 ops are blocked > 2097.15 sec on osd.5
3 osds have slow requests
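The blocked requests on osd.0, osd.4 and osd.5 can also be inspected on the OSD hosts themselves through the admin sockets. A rough sketch, assuming the default socket path /var/run/ceph/ceph-osd.<id>.asok (adjust to your layout):

    # In-flight (currently blocked) operations on osd.0
    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_ops_in_flight

    # Recently completed slow operations, with per-step timings
    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops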
pool cloudstack objects per pg (37316) is more than 27.1587 times cluster
average (1374)
pool .rgw.buckets objects per pg (76219) is more than 55.4723 times cluster
average (1374)
Ignore the cloudstack pool; I was using CloudStack but I'm not anymore, so
it's an inactive pool.
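The "too few pgs" warnings are a separate issue and can be dealt with once the placement groups above are healthy again. A sketch of how the placement group count could be raised; the value 256 is purely illustrative, pg_num can only be increased, and raising it will trigger data movement:

    # Current PG count for the bucket data pool
    ceph osd pool get .rgw.buckets pg_num

    # Raise pg_num, then pgp_num to match (256 is only an example value)
    ceph osd pool set .rgw.buckets pg_num 256
    ceph osd pool set .rgw.buckets pgp_num 256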
Best regards
Graeme
On 22/01/14 16:38, Graeme Lambert wrote:
Hi,
Following discussions with people on IRC, I set debug_ms, and this is what
is looped over and over when one of the requests is stuck:
http://pastebin.com/KVcpAeYT
Regarding the modules, the Apache version is 2.2.22-2precise.ceph and the
fastcgi module version is 2.4.7~0910052141-2~bpo70+1.ceph.
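If it is useful, the loaded Apache modules and the installed package versions can be double-checked with something like the following:

    # Modules Apache has actually loaded; look for fastcgi_module and rewrite_module
    apache2ctl -M

    # Installed package versions for Apache and the fastcgi module
    dpkg -l | grep -E 'apache2|fastcgi'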
Best regards
Graeme
On 22/01/14 16:28, Yehuda Sadeh wrote:
On Wed, Jan 22, 2014 at 8:05 AM, Graeme Lambert <glambert@xxxxxxxxxxx>
wrote:
Hi,
I'm using the aws-sdk-for-php classes with the Ceph RADOS gateway, but I'm
getting an intermittent issue when uploading files.
I'm attempting to upload an array of objects to Ceph one by one using the
create_object() function. It appears to stop at random while working
through them: it could stop at the first one, somewhere in between, or at
the last one; there is no pattern to it that I can see.
I'm not getting any PHP errors that indicate an issue, and no exceptions
are being caught.
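One way to narrow this down is to take the PHP SDK out of the picture and push the same objects through a different client. A rough sketch using s3cmd, assuming ~/.s3cfg has host_base and host_bucket pointed at the gateway; the bucket and file names are examples only:

    # One-off setup of access key, secret and endpoint
    s3cmd --configure

    # Upload a test object and list the bucket ('mybucket' and test.bin are placeholders)
    s3cmd put test.bin s3://mybucket/test.bin
    s3cmd ls s3://mybucket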
In the radosgw log file, at the time it appears stuck I get:
2014-01-22 15:39:21.656763 7fac44fe1700 1 ====== starting new request
req=0x2417c30 =====
And then sometimes I see:
2014-01-22 15:40:42.490485 7fac99ff9700 1 heartbeat_map is_healthy
'RGWProcess::m_tp thread 0x7fac51ffb700' had timed out after 600
repeated over and over again.
When those messages are appearing, Apache's error log shows:
[Wed Jan 22 15:43:11 2014] [error] [client 172.16.2.149] FastCGI: comm with
server "/var/www/s3gw.fcgi" aborted: idle timeout (30 sec), referer:
https://[server]/[path]
likewise repeated over and over again.
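The 30 seconds in that error is mod_fastcgi's idle timeout. Where it is configured (and whether an -idle-timeout option has been set on the FastCgiExternalServer line) can be checked with a quick grep; the path shown is the usual Debian/Ubuntu one:

    # Find the FastCGI external server definition and any idle-timeout override
    grep -RniE 'fastcgiexternalserver|idle-timeout' /etc/apache2/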
I have restarted apache, radosgw, all Ceph OSDs and ceph-mon processes and
still no joy with this.
Can anyone advise on where I'm going wrong with this?
Which fastcgi module are you using? Can you provide a log with 'debug
ms = 1' for a failing request? Usually that kind of message means that
it's waiting for the OSD to respond, which might point to an
unhealthy cluster.
Yehuda
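For reference, 'debug ms = 1' can be enabled on the running gateway through its admin socket, or persistently via ceph.conf. A sketch, assuming the default socket name for a [client.radosgw.gateway] instance; both the socket path and the section name vary between setups:

    # Inject the debug settings into the running radosgw (socket name is an assumption)
    ceph --admin-daemon /var/run/ceph/ceph-client.radosgw.gateway.asok config set debug_ms 1
    ceph --admin-daemon /var/run/ceph/ceph-client.radosgw.gateway.asok config set debug_rgw 20

    # Alternatively, add 'debug ms = 1' under the gateway's [client.radosgw.*] section
    # in ceph.conf and restart radosgw, then reproduce a failing request.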
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com