Hi, you seem to be walking through the manual OSD management steps;
check out the cephadm OSD handling instead:
https://docs.ceph.com/en/quincy/cephadm/services/osd.html
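As a rough sketch (assuming the failing OSD is osd.35, as in the steps quoted below), the cephadm flow boils down to:

# drain the OSD and mark it 'destroyed', keeping the ID free for the replacement
ceph orch osd rm 35 --replace

# watch the drain progress
ceph orch osd rm status

Once the disk has been swapped, cephadm should redeploy an OSD on it automatically if an OSD service spec covers that host; otherwise it can be added back by hand with `ceph orch daemon add osd {host}:{device}` (placeholders, fill in your own).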
Quoting Matt Larson <larsonmattr@xxxxxxxxx>:
I have an OSD that is causing slow ops and appears to be backed by a
failing drive according to smartctl output. I am using cephadm, and I am
wondering: what is the best way to remove this drive from the cluster, and
what are the proper steps to replace the disk?
Mark osd.35 as out.
`sudo ceph osd out osd.35`
Then mark osd.35 as down.
`sudo ceph osd down osd.35`
The OSD is marked as out, but it comes back up after a couple of
seconds. I do not know if that is a problem, or whether I should just let
the drive stay online for as long as it lasts while it is being removed
from the cluster.
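For reference, a quick way to watch the data drain off an out-marked OSD (a sketch, assuming osd.35 as above):

# overall recovery/backfill progress
ceph -s

# per-OSD usage; osd.35's PG count and usage should trend toward zero
ceph osd df tree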
After the recovery completes, I would then `destroy` the OSD:
`ceph osd destroy {id} --yes-i-really-mean-it`
(https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/)
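Before the destroy step, it may be worth asking the cluster whether the OSD is still needed; I believe the same page documents this check (a sketch, again assuming id 35):

# succeeds only once all PGs on the OSD can be recovered from elsewhere
ceph osd safe-to-destroy osd.35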
Besides checking the steps above, my question now is: if the drive is acting
very slow and causing slow ops, should I be trying to shut down its OSD
and keep it down? There is an example of stopping the OSD on the server
using systemctl, outside of cephadm:
ssh {osd-host}
sudo systemctl stop ceph-osd@{osd-num}
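Note that under cephadm the OSDs run in containers, so the systemd unit is named ceph-<fsid>@osd.<id> rather than ceph-osd@<id>, and the line above will not match as written. A sketch of the orchestrator-native way to keep the daemon down (assuming osd.35 from above):

# stop the containerized daemon through the orchestrator
ceph orch daemon stop osd.35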
Thanks,
Matt
--
Matt Larson, PhD
Madison, WI 53705 U.S.A.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx