Hi,

We have a big problem with RGW. I don't know what the initial trigger is, but I have a theory. 2-3 OSDs out of the 78 in the cluster (6480 PGs on the RGW pool) have about 3x higher RAM usage, many more operations in the journal, and much higher latency. When we PUT some objects, in some cases so many operations land on the triple replication of one PG that the three OSDs can't handle the load. The drives behind those OSDs get hammered, with very high iowait and response times. RGW waits for this PG and eventually blocks all other operations once 1024 operations are blocked in its queue. Then the whole cluster has problems and we have an outage.

When RGW blocks operations, there is only one PG with >1000 operations in its queue:

ceph pg map 3.9447554d
osdmap e11404 pg 3.9447554d (3.54d) -> up [53,45,23] acting [53,45,23]

Now, after setting the 0.5 ratio, the PG has migrated, but before it was:

ceph pg map 3.9447554d
osdmap e11404 pg 3.9447554d (3.54d) -> up [71,45,23] acting [71,45,23]

and those three OSDs are the ones with the problems. Behind these OSDs there are only three drives, one drive per OSD, which is why the impact is so big.

What I have done so far: I set a 50% smaller ratio in CRUSH for these OSDs, but the data just moved to other OSDs, and these OSDs now hold only half of their possible capacity. I don't think this will help in the long term, and it's not a real solution.

I have a second cluster, used only for replication, and it shows the same behaviour. The attachments explain everything: every metric on the bad OSD is much higher than on the others. There are 2-3 OSDs with such high counters.

Is this a bug? Or maybe the problem doesn't exist in bobtail? I can't switch to bobtail quickly, so I need some guidance on which way to go.

Best regards,
Slawomir Skowron
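
P.S. For reference, a rough sketch of the commands involved (assuming a standard argonaut-era ceph CLI; osd 71 and the 0.5 value are just the numbers from my case, and the reweight line is only my approximation of the "50% smaller ratio in CRUSH" step above, it could also be done with "ceph osd crush reweight"):

# show which OSDs the slow PG maps to
ceph pg map 3.9447554d
# show details of the health warnings, including slow/blocked requests
ceph health detail
# lower the override weight of the hot OSD to 0.5 so data migrates away
ceph osd reweight 71 0.5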
Attachment: bad_osd.png (PNG image)
Attachment: good_osds.png (PNG image)