Re: Misplaced objects greater than 100%

Johan Hattne <johan@xxxxxxxxx> · Fri, 31 Mar 2023 16:01:39 -0700

Here goes:

# ceph -s
  cluster:
    id:     e1327a10-8b8c-11ed-88b9-3cecef0e3946
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum bcgonen-a,bcgonen-b,bcgonen-c,bcgonen-r0h0,bcgonen-r0h1 (age 16h)
    mgr: bcgonen-b.furndm(active, since 8d), standbys: bcgonen-a.qmmqxj
    mds: 1/1 daemons up, 2 standby
    osd: 36 osds: 36 up (since 16h), 36 in (since 3d); 1041 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 1041 pgs
    objects: 5.42M objects, 6.5 TiB
    usage:   19 TiB used, 428 TiB / 447 TiB avail
    pgs:     27087125/16252275 objects misplaced (166.667%)
             1039 active+clean+remapped
             2    active+clean+remapped+scrubbing+deep

# ceph osd tree
ID   CLASS  WEIGHT     TYPE NAME              STATUS  REWEIGHT  PRI-AFF
-14         149.02008  rack rack-1
 -7         149.02008      host bcgonen-r1h0
 20    hdd   14.55269          osd.20             up   1.00000  1.00000
 21    hdd   14.55269          osd.21             up   1.00000  1.00000
 22    hdd   14.55269          osd.22             up   1.00000  1.00000
 23    hdd   14.55269          osd.23             up   1.00000  1.00000
 24    hdd   14.55269          osd.24             up   1.00000  1.00000
 25    hdd   14.55269          osd.25             up   1.00000  1.00000
 26    hdd   14.55269          osd.26             up   1.00000  1.00000
 27    hdd   14.55269          osd.27             up   1.00000  1.00000
 28    hdd   14.55269          osd.28             up   1.00000  1.00000
 29    hdd   14.55269          osd.29             up   1.00000  1.00000
 34    ssd    1.74660          osd.34             up   1.00000  1.00000
 35    ssd    1.74660          osd.35             up   1.00000  1.00000
-13         298.04016  rack rack-0
 -3         149.02008      host bcgonen-r0h0
  0    hdd   14.55269          osd.0              up   1.00000  1.00000
  1    hdd   14.55269          osd.1              up   1.00000  1.00000
  2    hdd   14.55269          osd.2              up   1.00000  1.00000
  3    hdd   14.55269          osd.3              up   1.00000  1.00000
  4    hdd   14.55269          osd.4              up   1.00000  1.00000
  5    hdd   14.55269          osd.5              up   1.00000  1.00000
  6    hdd   14.55269          osd.6              up   1.00000  1.00000
  7    hdd   14.55269          osd.7              up   1.00000  1.00000
  8    hdd   14.55269          osd.8              up   1.00000  1.00000
  9    hdd   14.55269          osd.9              up   1.00000  1.00000
 30    ssd    1.74660          osd.30             up   1.00000  1.00000
 31    ssd    1.74660          osd.31             up   1.00000  1.00000
 -5         149.02008      host bcgonen-r0h1
 10    hdd   14.55269          osd.10             up   1.00000  1.00000
 11    hdd   14.55269          osd.11             up   1.00000  1.00000
 12    hdd   14.55269          osd.12             up   1.00000  1.00000
 13    hdd   14.55269          osd.13             up   1.00000  1.00000
 14    hdd   14.55269          osd.14             up   1.00000  1.00000
 15    hdd   14.55269          osd.15             up   1.00000  1.00000
 16    hdd   14.55269          osd.16             up   1.00000  1.00000
 17    hdd   14.55269          osd.17             up   1.00000  1.00000
 18    hdd   14.55269          osd.18             up   1.00000  1.00000
 19    hdd   14.55269          osd.19             up   1.00000  1.00000
 32    ssd    1.74660          osd.32             up   1.00000  1.00000
 33    ssd    1.74660          osd.33             up   1.00000  1.00000
 -1                 0  root default

# ceph osd pool ls detail
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 31 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'cephfs.cephfs.meta' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 9833 lfor 0/0/584 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 3 'cephfs.cephfs.data' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode on last_change 7630 lfor 0/1831/6544 flags hashpspool,bulk stripe_width 0 application cephfs

crush_rules 1 and 2 are just used to assign the data and meta pool to 
HDD and SSD, respectively (failure domain: host).

// J

On 2023-03-31 15:37, ceph@xxxxxxxxxx wrote:
Need to know some more about your cluster...

ceph -s
ceph osd df tree
Replica or EC?
...

Perhaps this can give us some insight
Mehmet

On 31 March 2023 18:08:38 CEST, Johan Hattne <johan@xxxxxxxxx> wrote:

    Dear all;

    Up until a few hours ago, I had a seemingly normally behaving cluster (Quincy, 17.2.5) with 36 OSDs, evenly distributed across 3 of its 6 nodes.  The cluster is only used for CephFS, and the only non-standard configuration I can think of is that I had 2 active MDSs but only 1 standby.  I had also doubled mds_cache_memory_limit to 8 GB (all OSD hosts have 256 GB of RAM) at some point in the past.

    Then I rebooted one of the OSD nodes.  The rebooted node held one of the active MDSs.  Now the node is back up: ceph -s says the cluster is healthy, but all PGs are in an active+clean+remapped state and 166.67% of the objects are misplaced (dashboard: -66.66% healthy).

    The data pool is a threefold replica with 5.4M objects; the number of misplaced objects is reported as 27087410/16252446.  The denominator in the ratio makes sense to me (16.2M / 3 = 5.4M), but the numerator does not.  I also note that the ratio is *exactly* 5 / 3.  The filesystem is still mounted and appears to be usable, but df reports it as 100% full; I suspect it would say 167% but that is capped somewhere.
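
The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative and not Ceph terminology, and the numbers are the ones quoted in the message. It confirms that the ratio reduces to exactly 5/3, which means five misplaced copies are counted for every logical object. It does not explain why.

```python
from fractions import Fraction

# Figures quoted in the message; the names are illustrative, not Ceph terms.
misplaced = 27_087_410      # numerator of the "objects misplaced" ratio
total_copies = 16_252_446   # denominator of the same ratio
replicas = 3                # the data pool is a threefold replica

# Exact reduced fraction, avoiding floating-point rounding.
ratio = Fraction(misplaced, total_copies)

# Logical object count implied by the denominator (about 5.4M).
objects = total_copies // replicas

print(ratio)                # reduced form of misplaced / total_copies
print(objects)              # logical objects
print(misplaced / objects)  # misplaced copies counted per logical object
```

Because the fraction is exact rather than approximately 1.667, the numerator looks like an integer multiple of the object count and not like a rounding artifact.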

    Any ideas about what is going on?  Any suggestions for recovery?

    // Best wishes; Johan
    ------------------------------------------------------------------------
    ceph-users mailing list -- ceph-users@xxxxxxx
    To unsubscribe send an email to ceph-users-leave@xxxxxxx
