Hi All,
What would be the proper way to preventively replace a DB/WAL SSD (when it is nearing its DWPD/TBW limit but has not failed yet)?
It hosts the DB partitions for 5 OSDs.
Maybe something like:
1) reweight the 5 OSDs to 0 (ceph osd reweight <id> 0)
2) let backfilling complete
3) destroy/remove the 5 OSDs
4) replace SSD
5) create 5 new OSDs with separate DB partitions on the new SSD
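A rough sketch of the five steps above as a shell script, assuming a Luminous-or-later cluster with systemd-managed OSDs and ceph-volume available. The OSD ids (10 to 14) and device names (/dev/sdb, /dev/sdg1) are hypothetical placeholders, and the script only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch of the drain-and-recreate approach. Dry-run by default: commands are
# printed, not executed. OSD ids and device names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"
OSD_IDS="${OSD_IDS:-10 11 12 13 14}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Manual step: print a note in dry-run mode, otherwise wait for Enter.
pause() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "# (manual) $*"
    else
        printf '%s, then press Enter: ' "$*"
        read -r _
    fi
}

# Step 1: reweight every OSD on the SSD to 0 so its data is backfilled elsewhere.
drain_osds() {
    for id in $OSD_IDS; do
        run ceph osd reweight "$id" 0
    done
}

# Steps 2-3: wait until each OSD is reported safe to destroy, then stop and destroy it.
destroy_osds() {
    for id in $OSD_IDS; do
        until run ceph osd safe-to-destroy "osd.$id"; do sleep 60; done
        run systemctl stop "ceph-osd@$id"
        run ceph osd destroy "$id" --yes-i-really-mean-it
    done
}

# Step 5: recreate one OSD, reusing its id. Arguments: id, data HDD, DB partition.
recreate_osd() {
    run ceph-volume lvm zap "$2"
    run ceph-volume lvm create --bluestore --osd-id "$1" --data "$2" --block.db "$3"
}

drain_osds
destroy_osds
pause "Step 4: replace the SSD and partition the new one"
recreate_osd 10 /dev/sdb /dev/sdg1   # repeat for the other four OSDs
```

Running it as-is just lists the commands in order, which makes it easy to review the plan before touching the cluster.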
When these 5 OSDs are big HDDs (8TB), a LOT of data has to be moved, so I thought maybe the following would work:
1) ceph osd set noout
2) stop the 5 OSDs (systemctl stop)
3) 'dd' the old SSD to a new SSD of the same or bigger size
4) remove the old SSD
5) start the 5 OSDs (systemctl start)
6) let backfilling/recovery complete (only the delta data between OSD stop and now)
7) ceph osd unset noout
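The same idea as a shell sketch, again assuming systemd-managed OSDs. The OSD ids and the device names /dev/sdg (old SSD) and /dev/sdh (new SSD) are hypothetical, conv=fsync assumes GNU dd, and nothing is executed unless DRY_RUN=0 is set. Whether the OSDs actually come back up on the cloned disk is exactly the open question, so this only illustrates the order of operations:

```shell
#!/bin/sh
# Sketch of the clone-the-SSD approach. Dry-run by default: commands are
# printed, not executed. OSD ids and device names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"
OSD_IDS="${OSD_IDS:-10 11 12 13 14}"
OLD_SSD="${OLD_SSD:-/dev/sdg}"
NEW_SSD="${NEW_SSD:-/dev/sdh}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Manual step: print a note in dry-run mode, otherwise wait for Enter.
pause() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "# (manual) $*"
    else
        printf '%s, then press Enter: ' "$*"
        read -r _
    fi
}

# Steps 1-2: keep the cluster from marking the OSDs out, then stop them.
stop_osds() {
    run ceph osd set noout
    for id in $OSD_IDS; do
        run systemctl stop "ceph-osd@$id"
    done
}

# Step 3: raw copy of the whole old SSD, partition table included.
clone_ssd() {
    run dd if="$OLD_SSD" of="$NEW_SSD" bs=4M conv=fsync
}

# Step 5: bring the OSDs back up.
start_osds() {
    for id in $OSD_IDS; do
        run systemctl start "ceph-osd@$id"
    done
}

stop_osds
clone_ssd
# dd also clones the partition table and its UUIDs, so the old SSD should be
# gone before the OSDs start again.
pause "Step 4: remove the old SSD"
start_osds
pause "Step 6: wait for recovery to finish (check with 'ceph -s')"
run ceph osd unset noout
```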
Would this be a viable method to replace a DB SSD? Is there any udev/serial number/UUID stuff that would prevent this from working?
Or is there another 'less hacky' way to replace a DB SSD without moving too much data?
Kind regards,
Caspar