Thank you, Eugen. It was actually very straightforward. I'm happy to report
back that there were no issues with removing and zapping the OSDs whose data
devices were unavailable. I had to manually remove stale device-mapper (dm)
entries, but that was it.

/Z

On Tue, 2 Apr 2024 at 11:00, Eugen Block <eblock@xxxxxx> wrote:

> Hi,
>
> here's the link to the docs [1] on how to replace OSDs:
>
> ceph orch osd rm <OSD_ID> --replace --zap [--force]
>
> This should zap both the data drive and the DB LV (yes, its data is
> useless without the data drive); I'm not sure how it will handle the case
> where the data drive isn't accessible, though.
> One thing I'm not sure about is how your spec file will be handled.
> Since drive letters can change, I recommend using a more generic
> approach, for example the rotational flags and drive sizes instead of
> paths. But if the drive letters don't change for the replaced drives,
> it should work. I also don't expect an impact on the rest of the OSDs
> (except for backfilling, of course).
>
> Regards,
> Eugen
>
> [1] https://docs.ceph.com/en/latest/cephadm/services/osd/#replacing-an-osd
>
> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>
> > Hi,
> >
> > Unfortunately, some of our HDDs failed and we need to replace these
> > drives, which are part of "combined" OSDs (DB/WAL on NVMe, block storage
> > on HDD). All OSDs are defined with a service definition similar to this
> > one:
> >
> > ```
> > service_type: osd
> > service_id: ceph02_combined_osd
> > service_name: osd.ceph02_combined_osd
> > placement:
> >   hosts:
> >   - ceph02
> > spec:
> >   data_devices:
> >     paths:
> >     - /dev/sda
> >     - /dev/sdb
> >     - /dev/sdc
> >     - /dev/sdd
> >     - /dev/sde
> >     - /dev/sdf
> >     - /dev/sdg
> >     - /dev/sdh
> >     - /dev/sdi
> >   db_devices:
> >     paths:
> >     - /dev/nvme0n1
> >     - /dev/nvme1n1
> >   filter_logic: AND
> >   objectstore: bluestore
> > ```
> >
> > In the above example, HDDs `sda` and `sdb` are not readable and their
> > data cannot be copied over to the new HDDs. The NVMe partitions of
> > `nvme0n1` with DB/WAL data are intact, but I guess that data is useless.
> > I think the best approach is to replace the dead drives and completely
> > rebuild each affected OSD. How should we go about this, preferably in a
> > way that the other OSDs on the node remain unaffected and operational?
> >
> > I would appreciate any advice or pointers to the relevant documentation.
> >
> > Best regards,
> > Zakhar

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
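
For reference, a minimal sketch of the replacement workflow discussed above, assuming the failed HDDs back OSDs 12 and 13 (hypothetical IDs) on ceph02; the device-mapper name in the cleanup step is a placeholder and will differ per host:

```
# Mark the OSDs backed by the failed HDDs for replacement. --replace keeps
# the OSD IDs reserved (marked "destroyed") instead of purging them from the
# CRUSH map; --zap also cleans up the DB/WAL LVs on the NVMe devices.
ceph orch osd rm 12 --replace --zap
ceph orch osd rm 13 --replace --zap

# Follow the removal progress.
ceph orch osd rm status

# With an unreadable data device, stale device-mapper entries for the old
# LVs may be left behind on the host; list them and remove the stale ones.
# (The ceph--...-osd--block--... name below is a placeholder.)
dmsetup ls
dmsetup remove ceph--<vg-uuid>-osd--block--<lv-uuid>

# Once the new HDDs are in place, cephadm recreates the OSDs from the
# existing spec and reuses the reserved IDs; verify with:
ceph osd tree
```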
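
And a sketch of the more generic, path-less spec Eugen suggests, selecting data devices by rotational flag rather than /dev/sdX letters; the size bound on the DB devices and the file path are assumptions, and --dry-run previews which disks would match before anything is changed:

```
# Hypothetical path-less OSD spec: pick HDDs by rotational flag instead of
# device paths, so a replaced drive is matched even if its letter changes.
cat > /root/osd_spec.yml <<'EOF'
service_type: osd
service_id: ceph02_combined_osd
placement:
  hosts:
  - ceph02
spec:
  data_devices:
    rotational: 1        # spinning HDDs, whatever /dev/sdX they appear as
  db_devices:
    rotational: 0        # NVMe devices for DB/WAL
    size: '1T:'          # assumed lower bound to exclude small boot SSDs
  filter_logic: AND
  objectstore: bluestore
EOF

# Preview which devices the spec would claim, then apply it for real.
ceph orch apply -i /root/osd_spec.yml --dry-run
ceph orch apply -i /root/osd_spec.yml
```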