Hi,

Upgraded to Emperor and restarted all nodes. I still have "31 active+remapped" pgs.
I compared the pg query output of remapped and healthy pgs: some remapped pgs hold
data, some do not; some have been scrubbed, some have not. I am now running a read
of the whole rbd - maybe that will trigger those stuck pgs.

The state of the remapped pgs looks like:

{ "state": "active+remapped",
  "epoch": 9420,
  "up": [
        9],
  "acting": [
        9,
        5],

Any help/hints on how to get those stuck pgs to an "up" set of 2 osds?

Ugis

2013/11/22 Ugis <ugis22@xxxxxxxxx>:
> Update: I noticed that I hadn't increased pgp_num for the default data
> pool, for which I had increased pg_num some time ago. So I did that now
> and some backfilling happened.
> Now I still have "31 active+remapped" pgs.
> The remapped pgs belong to all pools, even those holding no data.
> What looks suspicious to me is that host ceph8 has weight 10.88 (I had
> some osds there temporarily, but removed them due to low RAM).
> If it is of importance, ceph7 is also low on RAM (4GB) and is at times
> slower to respond than ceph5 (Sage mentioned a "lagging pg peering
> workqueue" in Bug #3747).
>
> Results follow:
> # ceph osd tree
> # id    weight  type name       up/down reweight
> -5      0       root slow
> -4      0               host ceph5-slow
> -1      32.46   root default
> -2      10.5            host ceph5
> 0       0.2                     osd.0   up      0
> 2       2.8                     osd.2   up      1
> 3       2.8                     osd.3   up      1
> 4       1.9                     osd.4   up      1
> 5       2.8                     osd.5   up      1
> -3      0.2             host ceph6
> 1       0.2                     osd.1   up      0
> -6      10.88           host ceph7
> 6       2.73                    osd.6   up      1
> 7       2.73                    osd.7   up      1
> 8       2.71                    osd.8   up      1
> 9       2.71                    osd.9   up      1
> -7      10.88           host ceph8
>
> # ceph osd crush dump
> { "devices": [
>         { "id": 0, "name": "osd.0"},
>         { "id": 1, "name": "osd.1"},
>         { "id": 2, "name": "osd.2"},
>         { "id": 3, "name": "osd.3"},
>         { "id": 4, "name": "osd.4"},
>         { "id": 5, "name": "osd.5"},
>         { "id": 6, "name": "osd.6"},
>         { "id": 7, "name": "osd.7"},
>         { "id": 8, "name": "osd.8"},
>         { "id": 9, "name": "osd.9"}],
>   "types": [
>         { "type_id": 0, "name": "osd"},
>         { "type_id": 1, "name": "host"},
>         { "type_id": 2, "name": "rack"},
>         { "type_id": 3, "name": "row"},
>         { "type_id": 4, "name": "room"},
>         { "type_id": 5, "name": "datacenter"},
>         { "type_id": 6, "name": "root"}],
>   "buckets": [
>         { "id": -1, "name": "default", "type_id": 6, "type_name": "root",
>           "weight": 2127297, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": -2, "weight": 688128, "pos": 0},
>                 { "id": -3, "weight": 13107, "pos": 1},
>                 { "id": -6, "weight": 713031, "pos": 2},
>                 { "id": -7, "weight": 713031, "pos": 3}]},
>         { "id": -2, "name": "ceph5", "type_id": 1, "type_name": "host",
>           "weight": 688125, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": 0, "weight": 13107, "pos": 0},
>                 { "id": 2, "weight": 183500, "pos": 1},
>                 { "id": 3, "weight": 183500, "pos": 2},
>                 { "id": 4, "weight": 124518, "pos": 3},
>                 { "id": 5, "weight": 183500, "pos": 4}]},
>         { "id": -3, "name": "ceph6", "type_id": 1, "type_name": "host",
>           "weight": 13107, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": 1, "weight": 13107, "pos": 0}]},
>         { "id": -4, "name": "ceph5-slow", "type_id": 1, "type_name": "host",
>           "weight": 0, "alg": "straw", "hash": "rjenkins1",
>           "items": []},
>         { "id": -5, "name": "slow", "type_id": 6, "type_name": "root",
>           "weight": 0, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": -4, "weight": 0, "pos": 0}]},
>         { "id": -6, "name": "ceph7", "type_id": 1, "type_name": "host",
>           "weight": 713030, "alg": "straw", "hash": "rjenkins1",
>           "items": [
>                 { "id": 6, "weight": 178913, "pos": 0},
>                 { "id": 7, "weight": 178913, "pos": 1},
>                 { "id": 8, "weight": 177602, "pos": 2},
>                 { "id": 9, "weight": 177602, "pos": 3}]},
>         { "id": -7, "name": "ceph8", "type_id": 1, "type_name": "host",
>           "weight": 0, "alg": "straw", "hash": "rjenkins1",
>           "items": []}],
>   "rules": [
>         { "rule_id": 0, "rule_name": "data", "ruleset": 0, "type": 1,
>           "min_size": 1, "max_size": 10,
>           "steps": [
>                 { "op": "take", "item": -1},
>                 { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>                 { "op": "emit"}]},
>         { "rule_id": 1, "rule_name": "metadata", "ruleset": 1, "type": 1,
>           "min_size": 1, "max_size": 10,
>           "steps": [
>                 { "op": "take", "item": -1},
>                 { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>                 { "op": "emit"}]},
>         { "rule_id": 2, "rule_name": "rbd", "ruleset": 2, "type": 1,
>           "min_size": 1, "max_size": 10,
>           "steps": [
>                 { "op": "take", "item": -1},
>                 { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>                 { "op": "emit"}]},
>         { "rule_id": 3, "rule_name": "own1", "ruleset": 3, "type": 1,
>           "min_size": 1, "max_size": 20,
>           "steps": [
>                 { "op": "take", "item": -1},
>                 { "op": "chooseleaf_firstn", "num": 0, "type": "host"},
>                 { "op": "emit"}]}],
>   "tunables": { "choose_local_tries": 0,
>         "choose_local_fallback_tries": 0,
>         "choose_total_tries": 50,
>         "chooseleaf_descend_once": 1}}
>
> Ugis
>
> 2013/11/21 John Wilkins <john.wilkins@xxxxxxxxxxx>:
>> Ugis,
>>
>> Can you provide the results for:
>>
>> ceph osd tree
>> ceph osd crush dump
>>
>> On Thu, Nov 21, 2013 at 7:59 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>> On Thu, Nov 21, 2013 at 7:52 AM, Ugis <ugis22@xxxxxxxxx> wrote:
>>>> Thanks, I reread that section in the docs and found the tunables
>>>> profiles - nice to have, I hadn't noticed them before (the Ceph docs
>>>> develop so fast that you need RSS to follow all the changes :) ).
>>>>
>>>> Still, the problem persists in a different way.
>>>> I set the "optimal" profile and rebalancing started, but I had an
>>>> "rbd delete" running in the background, and in the end the cluster
>>>> ended up with a negative degradation %.
>>>> I think I have hit bug http://tracker.ceph.com/issues/3720, which is
>>>> still open.
>>>> I restarted the osds one by one and the negative degradation
>>>> disappeared.
>>>>
>>>> Afterwards I added an extra ~900GB of data; degradation grew to
>>>> 0.071% during the process.
>>>> This looks more like http://tracker.ceph.com/issues/3747, which is
>>>> closed but still seems to happen.
>>>> I ran "ceph osd out X; sleep 40; ceph osd in X" for all osds and the
>>>> degradation % went away.
>>>>
>>>> In the end I still have "55 active+remapped" pgs and no degradation %:
>>>> "pgmap v1853405: 2662 pgs: 2607 active+clean, 55 active+remapped; 5361
>>>> GB data, 10743 GB used, 10852 GB / 21595 GB avail; 25230KB/s rd,
>>>> 203op/s"
>>>>
>>>> I queried some of the remapped pgs but do not see why they do not
>>>> rebalance (the tunables are optimal now, I checked).
>>>>
>>>> Where should I look for the reason they are not rebalancing? Is there
>>>> something to look for in the osd logs if the debug level is increased?
>>>>
>>>> One of them:
>>>> # ceph pg 4.5e query
>>>> { "state": "active+remapped",
>>>>   "epoch": 9165,
>>>>   "up": [
>>>>         9],
>>>>   "acting": [
>>>>         9,
>>>>         5],
>>>
>>> For some reason CRUSH is still failing to map all the PGs to two hosts
>>> (notice how the "up" set is only one OSD, so it's adding another one
>>> in "acting") - what's your CRUSH map look like?
>>> -Greg
>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>> --
>> John Wilkins
>> Senior Technical Writer
>> Inktank
>> john.wilkins@xxxxxxxxxxx
>> (415) 425-9599
>> http://inktank.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
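One detail in the crush dump above stands out: bucket -7 ("ceph8") contains no
OSDs and has weight 0, yet root "default" still lists item -7 with weight 713031
(the 10.88 visible in "ceph osd tree"). If CRUSH sometimes selects that empty
host and then cannot descend to an OSD, chooseleaf would come up one host short,
which would fit the single-entry "up" sets - that is an assumption, not a
conclusion drawn in the thread. A possible cleanup, sketched on the assumption
that ceph8 really is empty and nothing else references it:

# Drop the stray, empty ceph8 host bucket from the CRUSH map; the bucket
# must contain no OSDs for the removal to succeed.
ceph osd crush remove ceph8
# Confirm nothing in the map references ceph8 any more (expect 0).
ceph osd crush dump | grep -c ceph8
# Watch the remapped PGs re-peer; they should head toward active+clean.
ceph -w

Rechecking "ceph pg <pgid> query" on one of the previously remapped PGs
afterwards would show whether its "up" set now holds two OSDs.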