Re: Acting sets sometimes may violate crush rule ?

Hi,

One way this can happen is if you change the crush rule of a pool after the balancer has been running for a while.
This is because the balancer's upmaps are only validated against the crush rule when they are initially created.

ceph osd dump | grep upmap
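
If a pg_upmap_items entry shows up there for the affected PG, a rough sketch of how to clear it (using pg 1.7 from your output below) would be:

ceph osd dump | grep 'pg_upmap_items 1.7'   # confirm the stale upmap exception
ceph osd rm-pg-upmap-items 1.7              # remove it so CRUSH remaps the PG normally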

Does that explain your issue?

.. Dan


On Tue, 14 Jan 2020, 04:17 Yi-Cian Pu, <yician1000ceph@xxxxxxxxx> wrote:
Hi all,

We can sometimes observe that the acting set of a PG seems to violate the crush rule. For example, we previously had an environment like this:

[root@Ann-per-R7-3 /]# ceph -s
  cluster:
    id:     248ce880-f57b-4a4c-a53a-3fc2b3eb142a
    health: HEALTH_WARN
            34/8019 objects misplaced (0.424%)

  services:
    mon: 3 daemons, quorum Ann-per-R7-3,Ann-per-R7-7,Ann-per-R7-1
    mgr: Ann-per-R7-3(active), standbys: Ann-per-R7-7, Ann-per-R7-1
    mds: cephfs-1/1/1 up  {0=qceph-mds-Ann-per-R7-1=up:active}, 2 up:standby
    osd: 7 osds: 7 up, 7 in; 1 remapped pgs

  data:
    pools:   7 pools, 128 pgs
    objects: 2.67 k objects, 10 GiB
    usage:   107 GiB used, 3.1 TiB / 3.2 TiB avail
    pgs:     34/8019 objects misplaced (0.424%)
             127 active+clean
             1   active+clean+remapped

[root@Ann-per-R7-3 /]# ceph pg ls remapped
PG  OBJECTS DEGRADED MISPLACED UNFOUND BYTES     LOG STATE                 STATE_STAMP                VERSION REPORTED UP      ACTING    SCRUB_STAMP                DEEP_SCRUB_STAMP
1.7      34        0        34       0 134217728  42 active+clean+remapped 2019-11-05 10:39:58.639533  144'42  229:407 [6,1]p6 [6,1,2]p6 2019-11-04 10:36:19.519820 2019-11-04 10:36:19.519820


[root@Ann-per-R7-3 /]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME             STATUS REWEIGHT PRI-AFF
-2             0 root perf_osd
-1       3.10864 root default
-7       0.44409     host Ann-per-R7-1
 5   hdd 0.44409         osd.5             up  1.00000 1.00000
-3       1.33228     host Ann-per-R7-3
 0   hdd 0.44409         osd.0             up  1.00000 1.00000
 1   hdd 0.44409         osd.1             up  1.00000 1.00000
 2   hdd 0.44409         osd.2             up  1.00000 1.00000
-9       1.33228     host Ann-per-R7-7
 6   hdd 0.44409         osd.6             up  1.00000 1.00000
 7   hdd 0.44409         osd.7             up  1.00000 1.00000
 8   hdd 0.44409         osd.8             up  1.00000 1.00000


[root@Ann-per-R7-3 /]# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE VAR  PGS
 5   hdd 0.44409  1.00000 465 GiB  21 GiB 444 GiB 4.49 1.36 127
 0   hdd 0.44409  1.00000 465 GiB  15 GiB 450 GiB 3.16 0.96  44
 1   hdd 0.44409  1.00000 465 GiB  15 GiB 450 GiB 3.14 0.95  52
 2   hdd 0.44409  1.00000 465 GiB  14 GiB 451 GiB 2.98 0.91  33
 6   hdd 0.44409  1.00000 465 GiB  14 GiB 451 GiB 2.97 0.90  43
 7   hdd 0.44409  1.00000 465 GiB  15 GiB 450 GiB 3.19 0.97  41
 8   hdd 0.44409  1.00000 465 GiB  14 GiB 450 GiB 3.09 0.94  44
                    TOTAL 3.2 TiB 107 GiB 3.1 TiB 3.29
MIN/MAX VAR: 0.90/1.36  STDDEV: 0.49

Based on our crush map, the crush rule should select one OSD from each host. However, from the output above, we can see that the acting set of PG 1.7 is [6,1,2], and osd.1 and osd.2 are on the same host, which seems to violate the crush rule. So my question is: how does this happen? Any enlightenment is much appreciated.
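
For completeness, a sketch of the commands that could be used to cross-check the mapping against the rule (the rule name replicated_rule is only an assumption; substitute the pool's actual crush rule):

ceph pg map 1.7                            # up and acting sets for the PG
ceph osd dump | grep upmap                 # any pg_upmap / pg_upmap_items exceptions
ceph osd crush rule dump replicated_rule   # the rule the pool is supposed to follow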

Best
Cian
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
