Good morning David,
Assuming you don't need/want to see the data about the other 31 OSDs, osd.14 is showing:
ID  CLASS  WEIGHT   REWEIGHT  SIZE  RAW USE  DATA  OMAP  META  AVAIL  %USE  VAR  PGS  STATUS
14  hdd    2.72899         0   0 B      0 B   0 B   0 B   0 B    0 B     0    0    1  up
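(If it helps to grab a single OSD's row without pasting the whole table, something along these lines should work; the awk/jq filtering is just an illustration, and the JSON variant assumes jq is installed:)

    # header plus the row for osd.14 only
    ceph osd df | awk 'NR==1 || $1 == 14'

    # machine-readable variant; recent releases put the per-OSD rows in a "nodes" array
    ceph osd df --format json | jq '.nodes[] | select(.id == 14)'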
What's "ceph osd df" show?
On Wed, Dec 1, 2021 at 2:20 PM Zach Heise (SSCC) <heise@xxxxxxxxxxxx> wrote:
I wanted to swap out an existing OSD, preserve its number, remove the HDD that had it (osd.14 in this case), and give the ID of 14 to a new SSD that would be taking its place in the same node. This is my first time ever doing this, so I'm not sure what to expect.
I followed the instructions here, using the --replace flag.
However, I'm a bit concerned that the operation is taking so long in my test cluster. Out of 70TB in the cluster, only 40GB were in use. This is a relatively large OSD in comparison to others in the cluster (2.7TB versus 300GB for most other OSDs) and yet it's been 36 hours with the following status:
ceph04.ssc.wisc.edu> ceph orch osd rm status

OSD_ID  HOST                 STATE     PG_COUNT  REPLACE  FORCE  DRAIN_STARTED_AT
14      ceph04.ssc.wisc.edu  draining  1         True     True   2021-11-30 15:22:23.469150+00:00

Another note: I don't know why it has "force = true" set; the command that I ran was just "ceph orch osd rm 14 --replace", without specifying --force. Hopefully not a big deal, but still strange.
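(A few read-only checks that might show whether that one remaining PG is actually moving or whether the drain is stuck; these are standard Ceph/cephadm commands, and the OSD id 14 is taken from the output above:)

    # which PG is still mapped to osd.14, and what state it is in
    ceph pg ls-by-osd 14

    # whether Ceph considers the OSD safe to destroy yet
    ceph osd safe-to-destroy 14

    # recent cephadm log lines, in case the orchestrator module itself is wedged
    ceph log last cephadm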
At this point, is there any way to tell whether it's still actually doing something, or whether it is hung? If it is hung, what would be the 'recommended' way to proceed? I know that I could just manually eject the HDD from the chassis, run "ceph osd crush remove osd.14", and then manually delete the auth keys, etc., but the documentation seems to state that this shouldn't be necessary if a Ceph OSD replacement goes properly.
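(For reference, the classic manual removal sequence is roughly the following; note that fully removing the OSD this way gives up the ID reservation that --replace / "ceph osd destroy" is meant to keep, so it's more of a last resort than the documented replacement flow:)

    ceph osd out 14               # make sure nothing maps data to it
    ceph osd crush remove osd.14  # drop it from the CRUSH map
    ceph auth del osd.14          # remove its authentication key
    ceph osd rm 14                # remove it from the OSD map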
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx