CRUSH is failing to map all the PGs to the right number of OSDs. You've got a completely empty host which has ~1/3 of the cluster's total weight, and that is probably why — remove it!
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Tue, Dec 3, 2013 at 3:13 AM, Ugis <ugis22@xxxxxxxxx> wrote:
> Hi,
> Upgraded to emperor, restarted all nodes.
>
> Still have "31 active+remapped" pgs.
>
> Compared the pg query output of remapped and healthy pgs - some remapped
> pgs hold data, some don't, some have been scrubbed, some haven't. Now
> running a read across the whole rbd - maybe that will trigger those
> stuck pgs.
>
> The state of the remapped pgs looks like:
> { "state": "active+remapped",
>   "epoch": 9420,
>   "up": [
>         9],
>   "acting": [
>         9,
>         5],
>
> Any help/hints on how to get those stuck pgs into an up state on 2 osds?
>
> Ugis
>
>
> 2013/11/22 Ugis <ugis22@xxxxxxxxx>:
>> Update: I noticed that I hadn't increased pgp_num for the default data
>> pool, for which I had increased pg_num some time ago. So I did that now
>> and some backfilling happened.
>> Now I still have "31 active+remapped" pgs.
>> Remapped pgs belong to all pools, even those that hold no data.
>> What looks suspicious to me is that host ceph8 has weight 10.88 (I had
>> some osds there temporarily, but removed them due to low RAM).
>> In case it matters, ceph7 is also low on RAM (4GB) and at times is
>> slower to respond than ceph5 (Sage mentioned a "lagging pg peering
>> workqueue" in Bug #3747).
>>
>> Results follow:
>> # ceph osd tree
>> # id    weight  type name               up/down reweight
>> -5      0       root slow
>> -4      0               host ceph5-slow
>> -1      32.46   root default
>> -2      10.5            host ceph5
>> 0       0.2                     osd.0   up      0
>> 2       2.8                     osd.2   up      1
>> 3       2.8                     osd.3   up      1
>> 4       1.9                     osd.4   up      1
>> 5       2.8                     osd.5   up      1
>> -3      0.2             host ceph6
>> 1       0.2                     osd.1   up      0
>> -6      10.88           host ceph7
>> 6       2.73                    osd.6   up      1
>> 7       2.73                    osd.7   up      1
>> 8       2.71                    osd.8   up      1
>> 9       2.71                    osd.9   up      1
>> -7      10.88           host ceph8
>>
>> # ceph osd crush dump
>> { "devices": [
>>     { "id": 0, "name": "osd.0"},
>>     { "id": 1, "name": "osd.1"},
>>     { "id": 2, "name": "osd.2"},
>>     { "id": 3, "name": "osd.3"},
>>     { "id": 4, "name": "osd.4"},
>>     { "id": 5, "name": "osd.5"},
>>     { "id": 6, "name": "osd.6"},
>>     { "id": 7, "name": "osd.7"},
>>     { "id": 8, "name": "osd.8"},
>>     { "id": 9, "name": "osd.9"}],
>>   "types": [
>>     { "type_id": 0, "name": "osd"},
>>     { "type_id": 1, "name": "host"},
>>     { "type_id": 2, "name": "rack"},
>>     { "type_id": 3, "name": "row"},
>>     { "type_id": 4, "name": "room"},
>>     { "type_id": 5, "name": "datacenter"},
>>     { "type_id": 6, "name": "root"}],
>>   "buckets": [
>>     { "id": -1,
>>       "name": "default",
>>       "type_id": 6,
>>       "type_name": "root",
>>       "weight": 2127297,
>>       "alg": "straw",
>>       "hash": "rjenkins1",
>>       "items": [
>>         { "id": -2, "weight": 688128, "pos": 0},
>>         { "id": -3, "weight": 13107, "pos": 1},
>>         { "id": -6, "weight": 713031, "pos": 2},
>>         { "id": -7, "weight": 713031, "pos": 3}]},
>>     { "id": -2,
>>       "name": "ceph5",
>>       "type_id": 1,
>>       "type_name": "host",
>>       "weight": 688125,
>>       "alg": "straw",
>>       "hash": "rjenkins1",
>>       "items": [
>>         { "id": 0, "weight": 13107, "pos": 0},
>>         { "id": 2, "weight": 183500, "pos": 1},
>>         { "id": 3, "weight": 183500, "pos": 2},
>>         { "id": 4, "weight": 124518, "pos": 3},
>>         { "id": 5, "weight": 183500, "pos": 4}]},
>>     { "id": -3,
>>       "name": "ceph6",
>>       "type_id": 1,
>>       "type_name": "host",
>>       "weight": 13107,
>>       "alg": "straw",
>>       "hash": "rjenkins1",
>>       "items": [
>>         { "id": 1, "weight": 13107, "pos": 0}]},
>>     { "id": -4,
>>       "name": "ceph5-slow",
>>       "type_id": 1,
>>       "type_name": "host",
>>       "weight": 0,
>>       "alg": "straw",
>>       "hash": "rjenkins1",
>>       "items": []},
>>     { "id": -5,
>>       "name": "slow",
>>       "type_id": 6,
>>       "type_name": "root",
>>       "weight": 0,
>>       "alg": "straw",
>>       "hash": "rjenkins1",
>>       "items": [
>>         { "id": -4, "weight": 0, "pos": 0}]},
>>     { "id": -6,
>>       "name": "ceph7",
>>       "type_id": 1,
>>       "type_name": "host",
>>       "weight": 713030,
>>       "alg": "straw",
>>       "hash": "rjenkins1",
>>       "items": [
>>         { "id": 6, "weight": 178913, "pos": 0},
>>         { "id": 7, "weight": 178913, "pos": 1},
>>         { "id": 8, "weight": 177602, "pos": 2},
>>         { "id": 9, "weight": 177602, "pos": 3}]},
>>     { "id": -7,
>>       "name": "ceph8",
>>       "type_id": 1,
>>       "type_name": "host",
>>       "weight": 0,
>>       "alg": "straw",
>>       "hash": "rjenkins1",
>>       "items": []}],
>>   "rules": [
>>     { "rule_id": 0,
>>       "rule_name": "data",
>>       "ruleset": 0,
>>       "type": 1,
>>       "min_size": 1,
>>       "max_size": 10,
>>       "steps": [
>>         { "op": "take", "item": -1},
>>         { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>>         { "op": "emit"}]},
>>     { "rule_id": 1,
>>       "rule_name": "metadata",
>>       "ruleset": 1,
>>       "type": 1,
>>       "min_size": 1,
>>       "max_size": 10,
>>       "steps": [
>>         { "op": "take", "item": -1},
>>         { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>>         { "op": "emit"}]},
>>     { "rule_id": 2,
>>       "rule_name": "rbd",
>>       "ruleset": 2,
>>       "type": 1,
>>       "min_size": 1,
>>       "max_size": 10,
>>       "steps": [
>>         { "op": "take", "item": -1},
>>         { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>>         { "op": "emit"}]},
>>     { "rule_id": 3,
>>       "rule_name": "own1",
>>       "ruleset": 3,
>>       "type": 1,
>>       "min_size": 1,
>>       "max_size": 20,
>>       "steps": [
>>         { "op": "take", "item": -1},
>>         { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>>         { "op": "emit"}]}],
>>   "tunables": { "choose_local_tries": 0,
>>     "choose_local_fallback_tries": 0,
>>     "choose_total_tries": 50,
>>     "chooseleaf_descend_once": 1}}
>>
>> Ugis
>>
>> 2013/11/21 John Wilkins <john.wilkins@xxxxxxxxxxx>:
>>> Ugis,
>>>
>>> Can you provide the results for:
>>>
>>> ceph osd tree
>>> ceph osd crush dump
>>>
>>> On Thu, Nov 21, 2013 at 7:59 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>> On Thu, Nov 21, 2013 at 7:52 AM, Ugis <ugis22@xxxxxxxxx> wrote:
>>>>> Thanks, I reread that section in the docs and found the tunables
>>>>> profiles - nice to have, hadn't noticed them before (ceph docs develop
>>>>> so fast that you need RSS to follow all the changes :) )
>>>>>
>>>>> Still, the problem persists in a different way.
>>>>> I set the profile to "optimal" and rebalancing started, but I had an
>>>>> "rbd delete" running in the background, and in the end the cluster
>>>>> ended up with a negative degradation %.
>>>>> I think I have hit bug http://tracker.ceph.com/issues/3720, which is
>>>>> still open.
>>>>> I restarted the osds one by one and the negative degradation disappeared.
>>>>>
>>>>> Afterwards I added an extra ~900GB of data; degradation grew during
>>>>> the process to 0.071%.
>>>>> This is rather http://tracker.ceph.com/issues/3747, which is closed
>>>>> but still seems to happen.
>>>>> I did "ceph osd out X; sleep 40; ceph osd in X" for all osds, and the
>>>>> degradation % went away.
>>>>>
>>>>> In the end I still have "55 active+remapped" pgs and no degradation %.
>>>>> "pgmap v1853405: 2662 pgs: 2607 active+clean, 55 active+remapped; 5361
>>>>> GB data, 10743 GB used, 10852 GB / 21595 GB avail; 25230KB/s rd,
>>>>> 203op/s"
>>>>>
>>>>> I queried some of the remapped pgs but do not see why they do not
>>>>> rebalance (tunables are optimal now, checked).
>>>>>
>>>>> Where should I look for the reason they are not rebalancing? Is there
>>>>> something to look for in the osd logs if the debug level is increased?
>>>>>
>>>>> one of those:
>>>>> # ceph pg 4.5e query
>>>>> { "state": "active+remapped",
>>>>>   "epoch": 9165,
>>>>>   "up": [
>>>>>         9],
>>>>>   "acting": [
>>>>>         9,
>>>>>         5],
>>>>
>>>> For some reason CRUSH is still failing to map all the PGs to two hosts
>>>> (notice how the "up" set is only one OSD, so it's adding another one
>>>> in "acting") — what's your CRUSH map look like?
>>>> -Greg
>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>
>>> --
>>> John Wilkins
>>> Senior Technical Writer
>>> Inktank
>>> john.wilkins@xxxxxxxxxxx
>>> (415) 425-9599
>>> http://inktank.com
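As a general note on the "where to look" question above, the usual starting points are the stuck-pg listing and a temporarily raised debug level on the acting primary. A sketch only - osd.9 is just the primary taken from the pg query above, so substitute the OSD of whichever PG you are chasing:

# ceph pg dump_stuck unclean                                   # list stuck PGs with their up/acting sets
# ceph tell osd.9 injectargs '--debug-osd 20 --debug-ms 1'     # raise logging on the acting primary
# ceph tell osd.9 injectargs '--debug-osd 0/5 --debug-ms 0/5'  # drop it back down afterwards

With the higher level, the peering and backfill decisions for the stuck PGs show up in that OSD's log (typically /var/log/ceph/ceph-osd.9.log).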