I fear I've hit a bug as well. I'm considering an upgrade to the latest
release of hammer, and I'm somewhat concerned that I may lose those PGs.
The exact OSD-juggling steps I've been running, and the client-message
throttles I plan to check, are at the bottom of this message, below the
quoted reply.

-H

> On May 25, 2016, at 07:42, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
>> On Tue, May 24, 2016 at 11:19 PM, Heath Albritton <halbritt@xxxxxxxx> wrote:
>> I'm not going to attempt threading; apologies for the two messages on
>> the same topic. Christian is right, though: 3 nodes per tier, with 8
>> SSDs per node in the cache tier and 12 spinning disks per node in the
>> cold tier. 10GbE client network with a separate 10GbE back-side
>> network. Each node in the cold tier has two Intel P3700 SSDs as
>> journals. This setup has yielded excellent performance over the past
>> year.
>>
>> The memory exhaustion comes purely from one errant OSD process. All
>> the remaining processes look fairly normal in terms of memory
>> consumption.
>>
>> These nodes aren't particularly busy. A random sampling shows a few
>> hundred kilobytes of data being written and very few reads.
>>
>> Thus far I've done quite a bit of juggling of OSDs: setting the noup
>> flag on the cluster, restarting the failed OSDs, letting them catch
>> up to the current map, then clearing the noup flag and letting them
>> rejoin. Eventually they fail again and a fairly intense recovery
>> follows.
>>
>> Here's ceph -s:
>>
>> https://dl.dropboxusercontent.com/u/90634073/ceph/ceph_dash_ess.txt
>>
>> The cluster has been in this state for a while. There are 3 PGs that
>> seem to be problematic:
>>
>> [root@t2-node01 ~]# pg dump | grep recovering
>> -bash: pg: command not found
>> [root@t2-node01 ~]# ceph pg dump | grep recovering
>> dumped all in format plain
>> 9.2f1  1353 1075 4578 1353 1075 9114357760 2611 2611 active+recovering+degraded+remapped 2016-05-24 21:49:26.766924 8577'2611 8642:84 [15,31] 15 [15,31,0] 15 5123'2483 2016-05-23 23:52:54.360710 5123'2483 2016-05-23 23:52:54.360710
>> 12.258 878 875 2628 0 0 4414509568 1534 1534 active+recovering+undersized+degraded 2016-05-24 21:47:48.085476 4261'1534 8587:17712 [4,20] 4 [4,20] 4 4261'1534 2016-05-23 07:22:44.819208 4261'1534 2016-05-23 07:22:44.819208
>> 11.58  376 0 1 2223 0 1593129984 4909 4909 active+recovering+degraded+remapped 2016-05-24 05:49:07.531198 8642'409248 8642:406269 [56,49,41] 56 [40,48,62] 40 4261'406995 2016-05-22 21:40:40.205540 4261'406450 2016-05-21 21:37:35.497307
>>
>> pg 9.2f1 query:
>> https://dl.dropboxusercontent.com/u/90634073/ceph/pg_9.21f.txt
>>
>> When I query 12.258 it just hangs.
>>
>> pg 11.58 query:
>> https://dl.dropboxusercontent.com/u/90634073/ceph/pg_11.58.txt
>
> Well, you've clearly had some things go very wrong. That "undersized"
> state means the PG doesn't have enough copies to be allowed to process
> writes; I'm a little confused that it's also marked active, but I
> don't quite remember the PG state diagrams involved. You should
> consider it down; it should be trying to recover itself, though. I'm
> not quite certain whether the query is considered an operation the PG
> isn't allowed to service (which the RADOS team will need to fix, if
> that's not done already in later releases) or whether the query
> hanging is indicative of yet another problem.
>
> The memory expansion is probably operations incoming on some of those
> missing objects, or on the PG which can't take writes (but is trying
> to recover itself to a state where it *can*).
> In general it shouldn't be enough to exhaust the memory in the
> system, but you might have mis-tuned things so that clients are
> allowed to use up a lot more memory than is appropriate, or there
> might be a bug in v0.94.5.
> -Greg
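
For completeness, here is roughly the per-OSD cycle I've been running
when one of them blows up. The OSD id (osd.12) is just a placeholder,
and the restart command depends on how the hammer packages were set up
(sysvinit on these nodes; newer packaging uses the systemd unit). If
the "status" admin-socket command isn't present in this build, the OSD
log shows the same map catch-up:

ceph osd set noup                    # keep restarted OSDs from being marked up while they replay maps
service ceph restart osd.12          # or: systemctl restart ceph-osd@12, depending on init system
ceph daemon osd.12 status            # watch "newest_map" until it reaches the current epoch
ceph osd stat                        # current osdmap epoch and flags, for comparison
ceph osd unset noup                  # let the OSD be marked up and peer again
ceph -w                              # watch the recovery that follows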
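
On the tuning point Greg raised: the only knobs I know of that bound
client message memory on an OSD are osd_client_message_size_cap and
osd_client_message_cap (names from memory, so worth double-checking
against the hammer docs; I believe the defaults are roughly 500 MB and
100 messages). This is what I plan to check on the runaway OSD, again
with osd.12 as a placeholder and the injectargs value only an example,
not a recommendation:

ceph daemon osd.12 config get osd_client_message_size_cap
ceph daemon osd.12 config get osd_client_message_cap
ceph daemon osd.12 dump_ops_in_flight   # is it sitting on a pile of client ops for the sick PGs?
ceph tell osd.* injectargs '--osd_client_message_size_cap 134217728'   # temporary; only affects running daemons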