Hi,

Upgraded to Emperor and restarted all nodes. I still have "31 active+remapped" pgs.
I compared the pg query output of remapped and healthy pgs: some remapped pgs hold
data, some do not; some have been scrubbed, some have not. I am now running a read
of the whole rbd - maybe that will trigger those stuck pgs.

The state of the remapped pgs looks like:

{ "state": "active+remapped",
  "epoch": 9420,
  "up": [
        9],
  "acting": [
        9,
        5],

Any help/hints on how to get those stuck pgs to an "up" set of 2 osds?

Ugis

2013/11/22 Ugis <ugis22@xxxxxxxxx>:
> Update: I noticed that I hadn't increased pgp_num for the default data
> pool, for which I had increased pg_num some time ago. So I did that now
> and some backfilling happened.
> Now I still have "31 active+remapped" pgs.
> The remapped pgs belong to all pools, even those holding no data.
> What looks suspicious to me is that host ceph8 has weight 10.88 (I had
> some osds there temporarily, but removed them due to low RAM).
> If it is of importance, ceph7 is also low on RAM (4GB) and is at times
> slower to respond than ceph5 (Sage mentioned a "lagging pg peering
> workqueue" in Bug #3747).
>
> Results follow:
> # ceph osd tree
> # id    weight  type name       up/down reweight
> -5      0       root slow
> -4      0               host ceph5-slow
> -1      32.46   root default
> -2      10.5            host ceph5
> 0       0.2                     osd.0   up      0
> 2       2.8                     osd.2   up      1
> 3       2.8                     osd.3   up      1
> 4       1.9                     osd.4   up      1
> 5       2.8                     osd.5   up      1
> -3      0.2             host ceph6
> 1       0.2                     osd.1   up      0
> -6      10.88           host ceph7
> 6       2.73                    osd.6   up      1
> 7       2.73                    osd.7   up      1
> 8       2.71                    osd.8   up      1
> 9       2.71                    osd.9   up      1
> -7      10.88           host ceph8
>
> # ceph osd crush dump
> { "devices": [
>         { "id": 0, "name": "osd.0"},
>         { "id": 1, "name": "osd.1"},
>         { "id": 2, "name": "osd.2"},
>         { "id": 3, "name": "osd.3"},
>         { "id": 4, "name": "osd.4"},
>         { "id": 5, "name": "osd.5"},
>         { "id": 6, "name": "osd.6"},
>         { "id": 7, "name": "osd.7"},
>         { "id": 8, "name": "osd.8"},
>         { "id": 9, "name": "osd.9"}],
>   "types": [
>         { "type_id": 0, "name": "osd"},
>         { "type_id": 1, "name": "host"},
>         { "type_id": 2, "name": "rack"},
>         { "type_id": 3, "name": "row"},
>         { "type_id": 4, "name": "room"},
>         { "type_id": 5, "name": "datacenter"},
>         { "type_id": 6, "name": "root"}],
>   "buckets": [
>         { "id": -1, "name": "default", "type_id": 6, "type_name": "root",
>           "weight": 2127297, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": -2, "weight": 688128, "pos": 0},
>                 { "id": -3, "weight": 13107, "pos": 1},
>                 { "id": -6, "weight": 713031, "pos": 2},
>                 { "id": -7, "weight": 713031, "pos": 3}]},
>         { "id": -2, "name": "ceph5", "type_id": 1, "type_name": "host",
>           "weight": 688125, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": 0, "weight": 13107, "pos": 0},
>                 { "id": 2, "weight": 183500, "pos": 1},
>                 { "id": 3, "weight": 183500, "pos": 2},
>                 { "id": 4, "weight": 124518, "pos": 3},
>                 { "id": 5, "weight": 183500, "pos": 4}]},
>         { "id": -3, "name": "ceph6", "type_id": 1, "type_name": "host",
>           "weight": 13107, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": 1, "weight": 13107, "pos": 0}]},
>         { "id": -4, "name": "ceph5-slow", "type_id": 1, "type_name": "host",
>           "weight": 0, "alg": "straw", "hash": "rjenkins1",
>           "items": []},
>         { "id": -5, "name": "slow", "type_id": 6, "type_name": "root",
>           "weight": 0, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": -4, "weight": 0, "pos": 0}]},
>         { "id": -6, "name": "ceph7", "type_id": 1, "type_name": "host",
>           "weight": 713030, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": 6, "weight": 178913, "pos": 0},
>                 { "id": 7, "weight": 178913, "pos": 1},
>                 { "id": 8, "weight": 177602, "pos": 2},
>                 { "id": 9, "weight": 177602, "pos": 3}]},
>         { "id": -7, "name": "ceph8", "type_id": 1, "type_name": "host",
>           "weight": 0, "alg": "straw", "hash": "rjenkins1",
>           "items": []}],
>   "rules": [
>         { "rule_id": 0, "rule_name": "data", "ruleset": 0, "type": 1,
>           "min_size": 1, "max_size": 10,
>           "steps": [
>                 { "op": "take", "item": -1},
>                 { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>                 { "op": "emit"}]},
>         { "rule_id": 1, "rule_name": "metadata", "ruleset": 1, "type": 1,
>           "min_size": 1, "max_size": 10,
>           "steps": [
>                 { "op": "take", "item": -1},
>                 { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>                 { "op": "emit"}]},
>         { "rule_id": 2, "rule_name": "rbd", "ruleset": 2, "type": 1,
>           "min_size": 1, "max_size": 10,
>           "steps": [
>                 { "op": "take", "item": -1},
>                 { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>                 { "op": "emit"}]},
>         { "rule_id": 3, "rule_name": "own1", "ruleset": 3, "type": 1,
>           "min_size": 1, "max_size": 20,
>           "steps": [
>                 { "op": "take", "item": -1},
>                 { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>                 { "op": "emit"}]}],
>   "tunables": { "choose_local_tries": 0,
>         "choose_local_fallback_tries": 0,
>         "choose_total_tries": 50,
>         "chooseleaf_descend_once": 1}}
>
> Ugis
>
> 2013/11/21 John Wilkins <john.wilkins@xxxxxxxxxxx>:
>> Ugis,
>>
>> Can you provide the results for:
>>
>> ceph osd tree
>> ceph osd crush dump
>>
>> On Thu, Nov 21, 2013 at 7:59 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>> On Thu, Nov 21, 2013 at 7:52 AM, Ugis <ugis22@xxxxxxxxx> wrote:
>>>> Thanks, I reread that section in the docs and found the tunables
>>>> profiles - nice to have, I hadn't noticed them before (the Ceph docs
>>>> develop so fast that you need RSS to follow all the changes :) ).
>>>>
>>>> Still, the problem persists in a different way.
>>>> I set the "optimal" profile and rebalancing started, but I had an
>>>> "rbd delete" running in the background, and in the end the cluster
>>>> ended up with a negative degradation %.
>>>> I think I have hit bug http://tracker.ceph.com/issues/3720, which is
>>>> still open.
>>>> I restarted the osds one by one and the negative degradation
>>>> disappeared.
>>>>
>>>> Afterwards I added an extra ~900GB of data; degradation grew to
>>>> 0.071% during the process.
>>>> This looks more like http://tracker.ceph.com/issues/3747, which is
>>>> closed but still seems to happen.
>>>> I ran "ceph osd out X; sleep 40; ceph osd in X" for all osds and the
>>>> degradation % went away.
>>>>
>>>> In the end I still have "55 active+remapped" pgs and no degradation %:
>>>> "pgmap v1853405: 2662 pgs: 2607 active+clean, 55 active+remapped; 5361
>>>> GB data, 10743 GB used, 10852 GB / 21595 GB avail; 25230KB/s rd,
>>>> 203op/s"
>>>>
>>>> I queried some of the remapped pgs but do not see why they do not
>>>> rebalance (the tunables are optimal now, I checked).
>>>>
>>>> Where should I look for the reason they are not rebalancing? Is there
>>>> something to look for in the osd logs if the debug level is increased?
>>>>
>>>> One of them:
>>>> # ceph pg 4.5e query
>>>> { "state": "active+remapped",
>>>>   "epoch": 9165,
>>>>   "up": [
>>>>         9],
>>>>   "acting": [
>>>>         9,
>>>>         5],
>>>
>>> For some reason CRUSH is still failing to map all the PGs to two hosts
>>> (notice how the "up" set is only one OSD, so it's adding another one
>>> in "acting") - what's your CRUSH map look like?
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>> --
>> John Wilkins
>> Senior Technical Writer
>> Inktank
>> john.wilkins@xxxxxxxxxxx
>> (415) 425-9599
>> http://inktank.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
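One detail in the crush dump above stands out: bucket -7 ("ceph8") contains no
OSDs and has weight 0, yet root "default" still lists item -7 with weight 713031
(the 10.88 visible in "ceph osd tree"). If CRUSH sometimes selects that empty
host and then cannot descend to an OSD, chooseleaf would come up one host short,
which would fit the single-entry "up" sets - that is an assumption, not a
conclusion drawn in the thread. A possible cleanup, sketched on the assumption
that ceph8 really is empty and nothing else references it:

# Drop the stray, empty ceph8 host bucket from the CRUSH map; the bucket
# must contain no OSDs for the removal to succeed.
ceph osd crush remove ceph8
# Confirm nothing in the map references ceph8 any more (expect 0).
ceph osd crush dump | grep -c ceph8
# Watch the remapped PGs re-peer; they should head toward active+clean.
ceph -w

Rechecking "ceph pg <pgid> query" on one of the previously remapped PGs
afterwards would show whether its "up" set now holds two OSDs.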