What do the following show you?

    ceph pg 12.258 list_unfound    # this one may hang too
    ceph pg dump_stuck

Also enable debug logging on osd.4:

    debug osd = 20
    debug filestore = 20
    debug ms = 1

(A way to inject these at runtime is sketched at the bottom of this mail.)

But honestly, my best advice is to upgrade to the latest release. It
would save you a lot of grief.

- Shinobu

On Thu, May 26, 2016 at 5:25 AM, Heath Albritton <halbritt@xxxxxxxx> wrote:
> I fear I've hit a bug as well. I'm considering an upgrade to the
> latest release of hammer, but I'm somewhat concerned that I may lose
> those PGs.
>
>
> -H
>
>> On May 25, 2016, at 07:42, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>>
>>> On Tue, May 24, 2016 at 11:19 PM, Heath Albritton <halbritt@xxxxxxxx> wrote:
>>> Not going to attempt threading, and apologies for the two messages
>>> on the same topic. Christian is right, though: 3 nodes per tier,
>>> 8 SSDs per node in the cache tier, 12 spinning disks in the cold
>>> tier. 10GbE client network with a separate 10GbE back-side network.
>>> Each node in the cold tier has two Intel P3700 SSDs as journals.
>>> This setup has yielded excellent performance over the past year.
>>>
>>> The memory exhaustion comes purely from one errant OSD process. All
>>> the remaining processes look fairly normal in terms of memory
>>> consumption.
>>>
>>> These nodes aren't particularly busy. A random sampling shows a few
>>> hundred kilobytes of data being written and very few reads.
>>>
>>> Thus far, I've done quite a bit of juggling of OSDs: setting the
>>> cluster to noup, restarting the failed ones, letting them get to
>>> the current map, and then clearing the noup flag and letting them
>>> rejoin. Eventually they'll fail again, and then a fairly intense
>>> recovery happens.
>>>
>>> Here's ceph -s:
>>>
>>> https://dl.dropboxusercontent.com/u/90634073/ceph/ceph_dash_ess.txt
>>>
>>> The cluster has been in this state for a while. There are 3 PGs
>>> that seem to be problematic:
>>>
>>> [root@t2-node01 ~]# ceph pg dump | grep recovering
>>> dumped all in format plain
>>> 9.2f1   1353  1075  4578  1353  1075  9114357760  2611  2611
>>>         active+recovering+degraded+remapped
>>>         2016-05-24 21:49:26.766924  8577'2611  8642:84
>>>         [15,31] 15  [15,31,0] 15
>>>         5123'2483  2016-05-23 23:52:54.360710
>>>         5123'2483  2016-05-23 23:52:54.360710
>>> 12.258  878   875   2628  0     0     4414509568  1534  1534
>>>         active+recovering+undersized+degraded
>>>         2016-05-24 21:47:48.085476  4261'1534  8587:17712
>>>         [4,20] 4  [4,20] 4
>>>         4261'1534  2016-05-23 07:22:44.819208
>>>         4261'1534  2016-05-23 07:22:44.819208
>>> 11.58   376   0     1     2223  0     1593129984  4909  4909
>>>         active+recovering+degraded+remapped
>>>         2016-05-24 05:49:07.531198  8642'409248  8642:406269
>>>         [56,49,41] 56  [40,48,62] 40
>>>         4261'406995  2016-05-22 21:40:40.205540
>>>         4261'406450  2016-05-21 21:37:35.497307
>>>
>>> pg 9.2f1 query:
>>> https://dl.dropboxusercontent.com/u/90634073/ceph/pg_9.21f.txt
>>>
>>> When I query 12.258, it just hangs.
>>>
>>> pg 11.58 query:
>>> https://dl.dropboxusercontent.com/u/90634073/ceph/pg_11.58.txt
>>
>> Well, you've clearly had some things go very wrong. That "undersized"
>> means the PG doesn't have enough copies to be allowed to process
>> writes. I'm a little confused that it's also marked active, but I
>> don't quite remember the PG state diagrams involved. You should
>> consider it down; it should be trying to recover itself, though.
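A quick side note on the "undersized" state Greg describes: it's worth
double-checking the pool's replication settings, since min_size is what
gates whether a shrunken PG may serve I/O. A rough sketch -- the part of
the PG id before the dot is the pool id (12 for pg 12.258), and
<pool-name> below is a placeholder to fill in:

    ceph osd dump | grep "^pool 12 "        # shows the pool's name, size and min_size
    ceph osd pool get <pool-name> size      # replica count the pool wants
    ceph osd pool get <pool-name> min_size  # minimum replicas before the PG will serve I/O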
>> I'm not quite certain whether the query is considered an operation
>> the PG isn't allowed to service (which the RADOS team will need to
>> fix, if that isn't done already in later releases), or whether the
>> hanging query is indicative of yet another problem.
>>
>> The memory expansion is probably operations incoming on some of
>> those missing objects, or on the PG which can't take writes (but is
>> trying to recover itself to a state where it *can*). In general that
>> shouldn't be enough to exhaust the memory in the system, but you
>> might have mis-tuned things so that clients are allowed to use up a
>> lot more memory than is appropriate, or there might be a bug in
>> v0.94.5.
>> -Greg

--
Email:
shinobu@xxxxxxxxx
shinobu@xxxxxxxxxx
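PS -- on the client-memory point Greg raises: the knobs I would look at
first are osd_client_message_cap and osd_client_message_size_cap, which
bound how many client messages (and how many bytes of them) a single OSD
will hold in memory at once. A rough sketch, run on the node hosting the
suspect OSD (osd.4 here, adjust to taste; the 100 MB figure is only an
example):

    ceph daemon osd.4 config get osd_client_message_cap        # max in-flight client messages
    ceph daemon osd.4 config get osd_client_message_size_cap   # max bytes of in-flight client data
    # lower the byte cap at runtime if clients really are the source of the bloat
    ceph tell osd.4 injectargs '--osd_client_message_size_cap 104857600'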
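And the debug settings suggested at the top of this mail can be injected
at runtime, without restarting the daemon -- a sketch, again assuming
osd.4 is the suspect OSD:

    ceph tell osd.4 injectargs '--debug_osd 20 --debug_filestore 20 --debug_ms 1'
    # watch /var/log/ceph/ceph-osd.4.log on that node, then dial it back, e.g.:
    ceph tell osd.4 injectargs '--debug_osd 0/5 --debug_filestore 1/5 --debug_ms 0/5'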