Hi Everyone,
What is the best way to replace a failing OSD hard disk (SMART Health
Status: HARDWARE IMPENDING FAILURE)?
Normally I would:
1. set the OSD as out
2. wait for rebalancing
3. stop the OSD on the osd-server (unmount if needed)
4. purge the OSD from Ceph
5. physically replace the disk with the new one
6. with ceph-deploy:
   6a. zap the new disk (just in case)
   6b. create the new OSD
7. add the new OSD to the CRUSH map
8. wait for rebalancing
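In commands, the steps above would look roughly like the following. This is only a dry-run sketch in Python that builds and prints the commands rather than running them. The OSD id 12, the host name osd-server1 and the device sdb are placeholders; the systemctl unit name assumes a systemd host reachable over ssh; the HOST:DISK arguments assume the older ceph-deploy 1.5.x syntax (newer releases take different arguments); and the optional CRUSH step assumes the new OSD reuses the old id and that the host bucket is named after the host.

```python
import shlex


def replacement_plan(osd_id, host, device, crush_weight=None):
    """Return the commands for the steps above, in order, without running them."""
    plan = [
        # 1. mark the OSD out so data starts migrating off it
        ["ceph", "osd", "out", str(osd_id)],
        # 2. check cluster status; repeat by hand until rebalancing is done
        ["ceph", "-s"],
        # 3. stop the OSD daemon on the OSD server (assumes systemd + ssh)
        ["ssh", host, "systemctl", "stop", f"ceph-osd@{osd_id}"],
        # 4. purge the OSD from the cluster
        ["ceph", "osd", "purge", str(osd_id), "--yes-i-really-mean-it"],
        # 5. physical disk swap happens here (manual, no command)
        # 6a. zap the new disk (assumes ceph-deploy 1.5.x HOST:DISK syntax)
        ["ceph-deploy", "disk", "zap", f"{host}:{device}"],
        # 6b. create the new OSD
        ["ceph-deploy", "osd", "create", f"{host}:{device}"],
    ]
    if crush_weight is not None:
        # 7. explicit CRUSH placement (assumes the new OSD got the same id
        #    and that the host bucket carries the host's name)
        plan.append(["ceph", "osd", "crush", "add", f"osd.{osd_id}",
                     str(crush_weight), f"host={host}"])
    # 8. check cluster status again while the new OSD fills up
    plan.append(["ceph", "-s"])
    return plan


if __name__ == "__main__":
    for cmd in replacement_plan(12, "osd-server1", "sdb"):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the plan first makes it easy to review the order before anything touches the cluster; nothing here waits for rebalancing automatically, so steps 2 and 8 still need a human (or a loop) watching the status.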
My questions are:
- Is my procedure reasonable?
- What if I skip step 2 and, instead of waiting for rebalancing, purge
the OSD directly?
- Is it better to reweight the OSD before taking it out?
I'm running a Luminous (12.2.2) cluster with 332 OSDs; the failure
domain is host.
Thanks,
Iztok
--
Iztok Gregori
ICT Systems and Services
Elettra - Sincrotrone Trieste S.C.p.A.
Telephone: +39 040 3758948
http://www.elettra.eu