The OSD processes/daemons kept running, so Ceph did not mark those OSDs down or out.
But the battery failure led to high temperature, which increased CPU utilization and
OSD response times, so that other OSDs also failed to respond in time,
causing extremely slow or no I/O.
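
A rough sketch of why a slow but running OSD might never be marked down. It is an illustrative model only, not Ceph code: the function name and the sample latencies are made up, and the 20-second value assumes the osd_heartbeat_grace default.

```python
# Illustrative model only; this is not how Ceph is implemented.
# Assumption: peers report an OSD as failed only when a heartbeat reply
# is missing for longer than the grace period (osd_heartbeat_grace,
# assumed here to be its 20-second default).
HEARTBEAT_GRACE_S = 20.0


def reported_down(ping_reply_latencies_s, grace_s=HEARTBEAT_GRACE_S):
    """Return True if any heartbeat reply took longer than the grace period."""
    return any(latency > grace_s for latency in ping_reply_latencies_s)


# Hypothetical latencies: an overheated node answers slowly but in time.
slow_but_alive = [4.0, 9.5, 12.0, 7.5]
# Hypothetical latencies: a node that stops answering for 25 seconds.
hung = [4.0, 25.0]

print(reported_down(slow_but_alive))  # False: stays up, client I/O just gets slow
print(reported_down(hung))            # True: peers would report it as failed
```

In this model the first node keeps its place in the cluster, so client requests keep being sent to it even though each one takes a long time.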
On Tue, Apr 16, 2019 at 12:23 PM Eugen Block <eblock@xxxxxx> wrote:
Good morning,
OSDs that are down are usually marked out after 10 minutes; that's when
rebalancing starts. But the I/O should not drop during that time; this
could be related to your pool configuration. If you have a replicated
pool of size 3 and also set min_size to 3, the I/O would pause if a
node or OSD fails. So more information about the cluster would help;
can you share the output of the following?
ceph osd tree
ceph osd pool ls detail
Were all pools affected or just specific pools?
Regards,
Eugen
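
The size/min_size point above can be sketched as a small model. This is a simplified illustration, not Ceph code: pg_serves_io is a hypothetical helper, and it only assumes that a placement group in a replicated pool serves I/O while at least min_size replicas are available.

```python
def pg_serves_io(size, min_size, failed_replicas):
    """Simplified model: a PG serves I/O while at least min_size
    of its `size` replicas are still available."""
    if not 1 <= min_size <= size:
        raise ValueError("min_size must be between 1 and size")
    available = max(size - failed_replicas, 0)
    return available >= min_size


# size=3, min_size=3: losing a single replica pauses I/O.
print(pg_serves_io(size=3, min_size=3, failed_replicas=1))  # False

# size=3, min_size=2: I/O continues with one replica down.
print(pg_serves_io(size=3, min_size=2, failed_replicas=1))  # True

# size=3, min_size=2: I/O pauses once two replicas are down.
print(pg_serves_io(size=3, min_size=2, failed_replicas=2))  # False
```

The size and min_size values of each pool appear in the output of "ceph osd pool ls detail".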
Quoting M Ranga Swami Reddy <swamireddy@xxxxxxxxx>:
> Hello - Recently we had an issue with a storage node's battery failure,
> which caused Ceph client I/O to drop to 0 bytes. That is, the Ceph cluster
> couldn't perform I/O operations until the node was taken out. This
> is not expected from Ceph: when some hardware fails, the respective OSDs should
> be marked out/down and I/O should continue as is.
>
> Please let me know if anyone has seen similar behavior, and whether this issue
> has been resolved.
>
> Thanks
> Swami
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com