On 13.12.2018 18:19, Alex Gorbachev wrote:
On Thu, Dec 13, 2018 at 10:48 AM Dietmar Rieder
<dietmar.rieder@xxxxxxxxxxx> wrote:
Hi Cephers,
one of our OSD nodes is experiencing a Disk controller problem/failure
(frequent resetting), so the OSDs on this controller are flapping
(up/down in/out).
I will hopefully get the replacement part soon.
I have some simple questions: what are the best steps to take now, before
and after replacing the controller?
- marking down and shutting down all osds on that node?
- waiting until rebalancing has finished
- replace the controller
- just restart the osds? Or redeploy them, since they still hold data?
We are running:
ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous
(stable)
CentOS 7.5
Sorry for my naive questions.
I usually run "ceph osd set noout" first to prevent any recovery from starting.
Then I replace the hardware and make sure all OSDs come back online.
Then I run "ceph osd unset noout".
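
As a rough sketch, the three steps above could be scripted like this. The
DRY_RUN switch and the function names are my own additions for illustration,
not part of Ceph; by default the script only prints the commands it would run.
Only "ceph osd set noout", "ceph osd unset noout", "ceph osd tree" and
"ceph -s" are real Ceph commands.

```shell
#!/usr/bin/env bash
# Sketch of a noout maintenance window. Assumption: DRY_RUN is a
# hypothetical safety switch; set DRY_RUN=0 to really run the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # only print what would be executed
    else
        "$@"
    fi
}

before_maintenance() {
    # With noout set, OSDs that go down are not marked out,
    # so no rebalancing starts while the node is offline.
    run ceph osd set noout
}

after_maintenance() {
    run ceph osd tree        # check that every OSD on the node is "up" again
    run ceph -s              # the noout flag itself still shows as a warning here
    run ceph osd unset noout
}

before_maintenance
# ... replace the controller and boot the node here ...
after_maintenance
```

Checking "ceph osd tree" before unsetting the flag matters: if an OSD did not
come back, unsetting noout lets the cluster mark it out and start recovery.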
Best regards,
Alex
Setting noout keeps down OSDs from being marked out, so the cluster does not
start re-balancing. That is useful when you do a short fix and know the
data will be available again shortly, e.g. a reboot or similar.
If OSDs are flapping, you normally want them out of the cluster, so they
do not impact performance any more.
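
A minimal sketch of taking flapping OSDs out could look like the following.
The OSD ids 12, 13 and 14 are made-up placeholders, and the DRY_RUN switch and
the take_out function are hypothetical helpers; by default the script only
prints the commands. It assumes systemd-managed OSDs (ceph-osd@<id> units) and
that the systemctl part is run on the affected OSD node itself.

```shell
#!/usr/bin/env bash
# Sketch: mark flapping OSDs out and stop their daemons.
# Assumption: DRY_RUN is a hypothetical safety switch; DRY_RUN=0 runs for real.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"          # only print what would be executed
    else
        "$@"
    fi
}

take_out() {
    local id
    for id in "$@"; do
        # Mark the OSD out so its data is remapped to other OSDs.
        run ceph osd out "$id"
        # Stop the daemon so it cannot keep flapping up and down.
        run systemctl stop "ceph-osd@${id}"
    done
}

# Placeholder ids; replace with the OSDs behind the failing controller.
take_out 12 13 14
```

Unlike the noout approach, this does trigger re-balancing, so it fits the case
where the hardware will not be fixed quickly.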
kind regards
Ronny Aasen
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com