Dear Ceph users,

I'm upgrading some disks of my cluster (Squid 19.2.0 managed by cephadm; it basically contains only a 6+2 EC pool over 12 hosts). To speed up the operation I issued a ceph orch osd rm --replace for two OSDs on two different hosts. The drain started for both: for one OSD it finished smoothly and that OSD is now in the destroyed state, but for the second OSD it stopped with a single PG remaining to be moved away before the OSD is completely drained:
# ceph orch osd rm status
OSD  HOST     STATE     PGS  REPLACE  FORCE  ZAP    DRAIN STARTED AT
31   rokanan  draining  1    True     False  False  2024-12-19 08:57:36.458704+00:00
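For reference, the two removals were started with commands of this form (31 is the still-draining OSD shown above; the id of the already-drained one is left as a placeholder):

# ceph orch osd rm 31 --replace
# ceph orch osd rm <other-osd-id> --replace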
and there is no backfill activity going on, even if the PG is labeled as backfilling:
# ceph -s
  cluster:
    id:     b1029256-7bb3-11ec-a8ce-ac1f6b627b45
    health: HEALTH_WARN
            52 pgs not deep-scrubbed in time
            (muted: OSD_SLOW_PING_TIME_BACK OSD_SLOW_PING_TIME_FRONT)

  services:
    mon: 5 daemons, quorum bofur,fili,aka,bifur,romolo (age 7d)
    mgr: fili.olevnm(active, since 18h), standbys: bofur.tklnrn, bifur.htimkf
    mds: 2/2 daemons up, 1 standby
    osd: 124 osds: 123 up (since 4h), 122 in (since 22h); 1 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 529 pgs
    objects: 27.11M objects, 78 TiB
    usage:   104 TiB used, 162 TiB / 266 TiB avail
    pgs:     53120/216457202 objects misplaced (0.025%)
             302 active+clean
             178 active+clean+scrubbing
             48  active+clean+scrubbing+deep
             1   active+remapped+backfilling

Is all of the above normal? I guessed that maybe only one destroyed OSD at a time can exist in the cluster, and that after replacing its disk and recreating the OSD the drain for the second one would resume and finish. Is this plausible?
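In case it helps with the diagnosis, I suppose the next step would be to look more closely at the stuck PG and check whether backfill is being throttled, with something like the following (the pgid being the one reported as active+remapped+backfilling above):

# ceph pg ls backfilling
# ceph pg <pgid> query
# ceph config get osd osd_max_backfills

I can post that output if it is useful.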
Thanks, Nicola