Hi all,

our test cluster (octopus 15.2.16) ended up in a weird state:

  cluster:
    id:     bf1f51f5-b381-4cf7-b3db-88d044c1960c
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum tceph-01,tceph-03,tceph-02 (age 4w)
    mgr: tceph-01(active, since 4w), standbys: tceph-02, tceph-03
    mds: fs:1 {0=tceph-02=up:active} 2 up:standby
    osd: 9 osds: 8 up (since 29h), 8 in (since 28h); 1 remapped pgs

  data:
    pools:   4 pools, 321 pgs
    objects: 10.40M objects, 348 GiB
    usage:   1.7 TiB used, 442 GiB / 2.2 TiB avail
    pgs:     39434/46694661 objects misplaced (0.084%)
             205 active+clean+snaptrim_wait
             99  active+clean
             16  active+clean+snaptrim
             1   active+clean+remapped+snaptrim_wait

  io:
    client:   19 KiB/s rd, 22 MiB/s wr, 2 op/s rd, 174 op/s wr

As part of the testing, we failed an OSD to benchmark client IO under recovery. Strangely enough, after the cluster recovered, one PG remains in the remapped state. Despite that, health is OK. This seems problematic, because the PG will probably keep accumulating PG_LOG entries until the remapped state is cleared. The history versions already look wildly different.

Here is the full PG state:

PG_STAT  OBJECTS  MISSING_ON_PRIMARY  DEGRADED  MISPLACED  UNFOUND  BYTES       OMAP_BYTES*  OMAP_KEYS*  LOG   DISK_LOG  STATE                                STATE_STAMP                      VERSION       REPORTED      UP                UP_PRIMARY  ACTING         ACTING_PRIMARY  LAST_SCRUB    SCRUB_STAMP                      LAST_DEEP_SCRUB  DEEP_SCRUB_STAMP                 SNAPTRIMQ_LEN
4.1c     39438    0                   0         39438      0        2825704691  0            0           1933  1933      active+clean+remapped+snaptrim_wait  2022-08-27T19:05:15.144083+0200  4170'3108053  4170:3022415  [6,1,4,5,3,NONE]  6           [6,1,4,5,3,1]  6               3312'2843531  2022-08-24T22:40:42.482024+0200  2832'2067159     2022-08-21T02:13:17.023702+0200  49

Any ideas why this PG is stuck in remapped and does not rebalance objects? Is there a way to convince it to start rebalancing?

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
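
P.S. In case more data would help, here is a sketch of the commands I could pull output from (all standard Octopus CLI). The comments name possible causes (CRUSH failing to fill the up set, or a leftover upmap entry); these are guesses on my part, not something I have confirmed:

  # Peering and mapping details for the PG; a NONE in the up set
  # usually means CRUSH could not choose a full set of OSDs:
  ceph pg 4.1c query
  ceph pg map 4.1c

  # Which pool and CRUSH rule does 4.1c use, and can the rule still be
  # satisfied (replica count / EC width vs. remaining hosts and OSDs)?
  ceph osd pool ls detail
  ceph osd crush rule dump
  ceph osd df tree

  # Check for a stale pg_upmap_items entry pinning the PG to the failed OSD;
  # if one exists, removing it would let the PG remap:
  ceph osd dump | grep 4.1c
  # ceph osd rm-pg-upmap-items 4.1c

  # As a gentler nudge, forcing the PG to re-peer sometimes clears a stuck mapping:
  ceph pg repeer 4.1c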