I have a cluster of around 630 OSDs with 3 dedicated monitors and 2
dedicated gateways. The entire cluster is running hammer (0.94.5
(9764da52395923e0b32908d83a9f7304401fee43)).
Both of my gateways have stopped responding to curl right now (the 124
below is timeout's exit status, meaning curl never returned within the
5 seconds):
root@host:~# timeout 5 curl localhost ; echo $?
124
From here I checked and it looks like radosgw has over 1 million open
files:
root@host:~# grep -i rados whatisopen.files.list | wc -l
1151753
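(For a cross-check that can't double-count, the kernel's own fd table
can be read straight out of /proc, assuming a single radosgw process:)
root@host:~# ls /proc/$(pidof radosgw)/fd | wc -l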
And around 750 open connections:
root@host:~# netstat -planet | grep radosgw | wc -l
752
root@host:~# ss -tnlap | grep rados | wc -l
752
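Since I suspect a limit somewhere (more below), one bound worth
comparing that connection count against is the rgw frontend thread
pool; rgw_thread_pool_size is the option I believe applies here, and
the admin socket will show the configured value:
root@host:~# ceph daemon /var/run/ceph/ceph-client.rgw.kh11-9.asok config show | grep rgw_thread_pool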
Based on the following dump, I don't think the backend storage is
hanging:
root@host:~# ceph daemon /var/run/ceph/ceph-client.rgw.kh11-9.asok objecter_requests | grep -i mtime
"mtime": "0.000000",
"mtime": "0.000000",
"mtime": "0.000000",
"mtime": "0.000000",
"mtime": "0.000000",
"mtime": "0.000000",
[...]
"mtime": "0.000000",
The radosgw log is still showing lots of activity, and so does strace,
which makes me think this is a config issue or a limit of some kind
that doesn't get logged. Which one, I'm not sure: the log doesn't show
any open file limit being hit, and I don't see any big errors in the
logs.
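For reference, the limit actually applied to the running process can
be read from /proc, which should settle whether the nofile limit is in
play (again assuming a single radosgw process):
root@host:~# grep 'Max open files' /proc/$(pidof radosgw)/limits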
Last 500 lines of /var/log/radosgw/client.radosgw.log:
http://pastebin.com/jmM1GFSA
Perf dump of radosgw:
http://pastebin.com/rjfqkxzE
Radosgw objecter requests:
http://pastebin.com/skDJiyHb
After restarting the gateway with '/etc/init.d/radosgw restart', the
old process remains, no error is reported, and I then get connection
refused via curl or netcat:
root@kh11-9:~# curl localhost
curl: (7) Failed to connect to localhost port 80: Connection refused
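Connection refused means nothing is listening on port 80 at that
point, which the following pair of checks should confirm (old process
still alive, no listener):
root@kh11-9:~# ss -tnlp | grep ':80'
root@kh11-9:~# ps aux | grep [r]adosgw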
Once I kill the old radosgw with SIGKILL (kill -9 on the old PID), the
new radosgw instance starts automatically and begins responding:
root@kh11-9:~# curl localhost
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult
xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>anonymous</ID><DisplayName></DisplayName></Owner><Buckets></Buckets></ListAllMyB
What is going on here?