Re: Remove an OSD with hardware issue caused rgw 503

Hi,

If you remove the OSD this way, it will be drained first. That means the cluster will try to recover PGs from this OSD, and with failing hardware that can lead to slow requests. It may make more sense to forcefully remove the OSD without draining:

- stop the osd daemon
- mark it as out
- ceph osd purge <id|osd.id> [--force] [--yes-i-really-mean-it]
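
The three steps above can be sketched as a small shell script. This is only an illustration under some assumptions: the cluster is managed by cephadm (so the daemon is stopped with `ceph orch daemon stop`), and `OSD_ID`, `DRY_RUN`, `run` and `remove_failed_osd` are hypothetical names introduced here, not anything from the cluster itself. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: forcefully remove a failed OSD without draining it first.
# OSD_ID and DRY_RUN are illustrative variables, not part of Ceph.
OSD_ID="${OSD_ID:-12}"     # hypothetical OSD id, replace with the real one
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

remove_failed_osd() {
    id="$1"
    # 1. Stop the OSD daemon (assumes a cephadm-managed cluster; on other
    #    deployments this would typically be: systemctl stop ceph-osd@$id).
    run ceph orch daemon stop "osd.$id"
    # 2. Mark the OSD out so no data is mapped to it anymore.
    run ceph osd out "$id"
    # 3. Remove the OSD from the CRUSH map, auth keys and OSD map.
    run ceph osd purge "$id" --yes-i-really-mean-it
}

remove_failed_osd "$OSD_ID"
```

With `DRY_RUN=0` the commands are actually executed. Because the OSD is stopped before it is removed, recovery then has to come from the remaining replicas, so it is probably worth checking with `ceph osd ok-to-stop <id>` beforehand that no PG depends solely on this OSD.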

Regards,
Eugen

Quoting Mary Zhang <maryzhang0920@xxxxxxxxx>:

Hi,

We recently removed an OSD from our Ceph cluster. Its underlying disk has
a hardware issue.

We used the command: ceph orch osd rm <osd_id> --zap

During the process, the Ceph cluster sometimes entered a warning state with
slow ops on this OSD. Our RGW also failed to respond to requests and
returned 503.

We restarted the RGW daemon to make it work again, but the same failure
occurred from time to time. Eventually we noticed that the RGW 503 errors
were a result of the OSD slow ops.

Our cluster has 18 hosts and 210 OSDs. We expected that removing an OSD
with a hardware issue would not impact cluster performance or RGW
availability. Is our expectation reasonable? What is the best way to handle
OSDs with hardware failures?

Thank you in advance for any comments or suggestions.

Best Regards,
Mary Zhang
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

