OSDs marked down, restarts continuously failing

Radhakrishnan2 S <radhakrishnan2.s@xxxxxxx> · Thu, 9 Jan 2020 05:34:50 -0800

Hello Everyone, 

One of our 16 OSD nodes has 12 OSDs with bcache on NVMe. Locally, those OSD daemons appear to be up and running, but ceph osd tree shows them as down. The logs show that the OSDs have had I/O stuck for over 4096 sec.

I have checked iostat, netstat and ceph -w along with the logs. Is there a way to identify why this is happening? In addition, when I restart the OSD daemons on the affected OSD node, the restart fails. Any quick help would be appreciated.

Regards

Radha Krishnan S

TCS Enterprise Cloud Practice

Tata Consultancy Services

Cell:- +1 848 466 4870

Mailto: radhakrishnan2.s@xxxxxxx

Website: http://www.tcs.com

-----"ceph-users" <ceph-users-bounces@xxxxxxxxxxxxxx> wrote: -----
To: d.aberger@xxxxxxxxxxxx, "Janne Johansson" <icepic.dz@xxxxxxxxx>
From: "Wido den Hollander" 
Sent by: "ceph-users" 
Date: 01/09/2020 08:19AM
Cc: "Ceph Users" <ceph-users@xxxxxxxxxxxxxx>, a.brandt@xxxxxxxxxxxx, "p.kramme@xxxxxxxxxxxx" <p.kramme@xxxxxxxxxxxx>, j.kruse@xxxxxxxxxxxx
Subject: Re:  Looking for experience

On 1/9/20 2:07 PM, Daniel Aberger - Profihost AG wrote:
> 
> Am 09.01.20 um 13:39 schrieb Janne Johansson:
>>
>>     I'm currently trying to work out a concept for a ceph cluster which can
>>     be used as a target for backups which satisfies the following
>>     requirements:
>>
>>     - approx. write speed of 40.000 IOP/s and 2500 Mbyte/s
>>
>>
>> You might need to have a large (at least non-1) number of writers to get
>> to that sum of operations, as opposed to trying to reach it with one
>> single stream written from one single client. 
> 
> 
> We are aiming for about 100 writers.

So if I read it correctly the writes will be 64k each.
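
As a rough check of that figure, here is a minimal sketch of the arithmetic. It assumes the 40.000 IOP/s and 2500 Mbyte/s targets are meant to hold at the same time and that the load is spread evenly over the 100 writers; the function and variable names are only illustrative.

```python
# Back-of-the-envelope check of the figures quoted in this thread.
# Assumption: both targets apply simultaneously and writers share the load evenly.

TARGET_IOPS = 40_000        # "40.000 IOP/s" (European thousands separator)
TARGET_MBYTE_PER_S = 2_500  # "2500 Mbyte/s"
WRITERS = 100               # "about 100 writers"


def avg_write_size_k(mbyte_per_s: float, iops: float, k_per_mbyte: int) -> float:
    """Average size of one write, in k, implied by throughput divided by IOPS."""
    return mbyte_per_s * k_per_mbyte / iops


size_binary = avg_write_size_k(TARGET_MBYTE_PER_S, TARGET_IOPS, 1024)
size_decimal = avg_write_size_k(TARGET_MBYTE_PER_S, TARGET_IOPS, 1000)
iops_per_writer = TARGET_IOPS / WRITERS
mbyte_per_writer = TARGET_MBYTE_PER_S / WRITERS

print(f"average write size: {size_binary:g}k (binary) / {size_decimal:g}k (decimal)")
print(f"per writer: {iops_per_writer:g} IOP/s, {mbyte_per_writer:g} Mbyte/s")
```

With binary units this comes out at 64k per write, with decimal units at 62.5k, and each of the 100 writers would need to sustain roughly 400 IOP/s and 25 Mbyte/s.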

That should be doable, but you probably want something like NVMe for DB+WAL.

You might want to tune it so that larger writes also go into the WAL to speed
up the ingress writes. But you mainly want more spindles rather than fewer.

Wido

> 
> Cheers
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 