Re: OSD stuck in booting state


 



Hello folks,

 

Can anybody give me a hint?

 

Communication and authentication with the mon are OK:

 

2019-03-25 14:16:25.342 7fa3af260700 1 -- 10.8.33.158:6789/0 <== osd.0 10.8.33.183:6800/293177 184 ==== auth(proto 2 2 bytes epoch 0) v1 ==== 32+0+0 (2260890001 0 0) 0x559759ffd680 con 0x559755487000

2019-03-25 14:16:25.342 7fa3af260700 10 mon.2@1(peon).auth v146 preprocess_query auth(proto 2 2 bytes epoch 0) v1 from osd.0 10.8.33.183:6800/293177

2019-03-25 14:16:25.342 7fa3af260700 10 mon.2@1(peon).auth v146 prep_auth() blob_size=2

2019-03-25 14:16:25.342 7fa3af260700 2 mon.2@1(peon) e1 send_reply 0x55976b3bf320 0x559754bb1200 auth_reply(proto 2 0 (0) Success) v1

2019-03-25 14:16:25.342 7fa3af260700 1 -- 10.8.33.158:6789/0 --> 10.8.33.183:6800/293177 -- auth_reply(proto 2 0 (0) Success) v1 -- 0x559754bb1200 con 0

 

But the OSD is still in the booting state.

 

The FSID seems correct, so I'm lost here.

There is nothing in the OSD logs (even with debug set to 20) except some complaints about the mgr, which rejects the OSD report because the OSD metadata is not complete (I guess due to the OSD's booting state).

 

One thing to note: I got into this state after redeploying the VMs hosting the Ceph cluster, so the IP addresses have changed.

 

Can somebody help?

 

# ceph osd dump

epoch 15

fsid 5267611a-48f7-4979-823e-84531e104d63

created 2019-03-20 18:14:24.296267

modified 2019-03-22 14:38:45.816422

flags sortbitwise,recovery_deletes,purged_snapdirs

crush_version 5

full_ratio 0.95

backfillfull_ratio 0.9

nearfull_ratio 0.85

require_min_compat_client jewel

min_compat_client jewel

require_osd_release mimic

max_osd 3

osd.0 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists 32d92b43-6333-4c5c-8153-af373ce12e62

osd.1 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists 07b03870-1bd9-42f9-ac61-9e9be3b30e73

osd.2 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists b77f8ae8-82cf-4e31-9e36-f510698abf8e
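
For anyone who wants to check these entries mechanically, here is a minimal Python sketch that splits the three osd lines above into fields. The field positions are inferred only from the sample lines in this message, and reading the four "-" columns as empty address slots is an assumption rather than something confirmed in this thread. The names `parse_osd_line` and `never_booted` are hypothetical helpers.

```python
# Sample lines copied verbatim from the `ceph osd dump` output above.
OSD_DUMP = """\
osd.0 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists 32d92b43-6333-4c5c-8153-af373ce12e62
osd.1 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists 07b03870-1bd9-42f9-ac61-9e9be3b30e73
osd.2 down in weight 1 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists b77f8ae8-82cf-4e31-9e36-f510698abf8e
"""


def parse_osd_line(line):
    """Split one osd line into named fields (positions assumed from the sample)."""
    tokens = line.split()
    return {
        "name": tokens[0],
        "up": tokens[1] == "up",
        "in": tokens[2] == "in",
        "weight": float(tokens[4]),
        "up_from": int(tokens[6]),
        "addrs": tokens[13:17],          # assumed: four address slots
        "state": tokens[17].split(","),  # e.g. ["exists"] or ["exists", "new"]
        "uuid": tokens[18],
    }


def never_booted(osd):
    """True if the entry looks like an OSD the monitors have never marked up."""
    return (not osd["up"]) and osd["up_from"] == 0 and all(a == "-" for a in osd["addrs"])


osds = [parse_osd_line(l) for l in OSD_DUMP.splitlines() if l.startswith("osd.")]
for osd in osds:
    print(osd["name"], "never booted" if never_booted(osd) else "has booted before")
```

On the sample above, all three entries come out as never marked up, which matches the `up_from 0` values in the dump.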

 

Thank you !!

Vincent

 

From: PHARABOT Vincent
Sent: Friday, 22 March 2019 10:45
To: 'ceph-users@xxxxxxxxxxxxxx' <ceph-users@xxxxxxxxxxxxxx>
Subject: OSD stuck in booting state

 

Hello cephers

 

I need your help once again (still a Ceph beginner, sorry).

 

In a cluster I have 3 OSDs that are never seen as up; they stay stuck in the down state. The OSD processes are running, of course.

 

On the OSD side, the OSD has been stuck in the booting state for a long time.

It doesn't look like a network or communication issue between the OSD and the mon.

 

I guess something is wrong on the OSD side, but I have not been able to figure out what so far.

 

Thanks a lot for your help !

 

# ceph -s

  cluster:
    id:     5267611a-48f7-4979-823e-84531e104d63
    health: HEALTH_WARN
            3 slow ops, oldest one blocked for 134780 sec, daemons [mon.1,mon.2] have slow ops.

  services:
    mon: 3 daemons, quorum 1,2,0
    mgr: mgr.2(active), standbys: mgr.0, mgr.1
    osd: 3 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:

 

# ceph health detail

HEALTH_WARN 3 slow ops, oldest one blocked for 134795 sec, daemons [mon.1,mon.2] have slow ops.

SLOW_OPS 3 slow ops, oldest one blocked for 134795 sec, daemons [mon.1,mon.2] have slow ops.
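
To put the "blocked for" figure in perspective, a small Python sketch can convert it into days and hours. The helper name `humanize` is hypothetical; the input is the value from the health output above.

```python
def humanize(seconds):
    """Return a 'Nd HHh MMm SSs' string for a duration given in whole seconds."""
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours:02d}h {minutes:02d}m {secs:02d}s"


print(humanize(134795))  # 1d 13h 26m 35s
```

So the oldest slow op had been blocked for a little over a day and a half when the command was run.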

 

# ceph osd tree

ID CLASS WEIGHT  TYPE NAME               STATUS REWEIGHT PRI-AFF
-1       2.92978 root default
-4       0.97659     host ip-10-8-33-183
 0   hdd 0.97659         osd.0             down        0 1.00000
-3       0.97659     host ip-10-8-64-158
 2       0.97659         osd.2             down        0 1.00000
-2       0.97659     host ip-10-8-85-231

 

# ceph osd dump

epoch 7

fsid 5267611a-48f7-4979-823e-84531e104d63

created 2019-03-20 18:14:24.296267

modified 2019-03-21 09:26:58.920300

flags sortbitwise,recovery_deletes,purged_snapdirs

crush_version 5

full_ratio 0.95

backfillfull_ratio 0.9

nearfull_ratio 0.85

require_min_compat_client jewel

min_compat_client jewel

require_osd_release mimic

max_osd 3

osd.0 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists,new 32d92b43-6333-4c5c-8153-af373ce12e62

osd.1 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists,new 07b03870-1bd9-42f9-ac61-9e9be3b30e73

osd.2 down out weight 0 up_from 0 up_thru 0 down_at 0 last_clean_interval [0,0) - - - - exists,new b77f8ae8-82cf-4e31-9e36-f510698abf8e

 

"ops": [

{

"description": "osd_boot(osd.0 booted 0 features 4611087854031142907 v17)",

"initiated_at": "2019-03-22 08:47:20.243710",

"age": 405.638170,

"duration": 405.638185,

"type_data": {

"events": [

{

"time": "2019-03-22 08:47:20.243710",

"event": "initiated"

},

{

"time": "2019-03-22 08:47:20.243710",

"event": "header_read"

},

{

"time": "2019-03-22 08:47:20.243713",

"event": "throttled"

},

{

"time": "2019-03-22 08:47:20.243766",

"event": "all_read"

},

{

"time": "2019-03-22 08:47:20.243821",

"event": "dispatched"

},

{

"time": "2019-03-22 08:47:20.243826",

"event": "mon:_ms_dispatch"

},

{

"time": "2019-03-22 08:47:20.243827",

"event": "mon:dispatch_op"

},

{

"time": "2019-03-22 08:47:20.243827",

"event": "psvc:dispatch"

},

{

"time": "2019-03-22 08:47:20.243828",

"event": "osdmap:wait_for_readable"

},

{

"time": "2019-03-22 08:47:20.243829",

"event": "osdmap:wait_for_finished_proposal"

},

{

"time": "2019-03-22 08:47:21.064088",

"event": "callback retry"

},

{

"time": "2019-03-22 08:47:21.064090",

"event": "psvc:dispatch"

},

{

"time": "2019-03-22 08:47:21.064091",

"event": "osdmap:wait_for_readable"

},

{
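
The event list above is cut off, but the part that was pasted can still be examined programmatically. The following Python sketch is only an illustration: it uses a trimmed copy of the op (the last five events shown above, with the "age" and "duration" fields left out), measures the time between consecutive events, and counts repeated event names. The function name `event_gaps` is hypothetical.

```python
import json
from collections import Counter
from datetime import datetime

# Trimmed copy of the op pasted above: only the last five listed events.
OP = json.loads("""
{
  "description": "osd_boot(osd.0 booted 0 features 4611087854031142907 v17)",
  "initiated_at": "2019-03-22 08:47:20.243710",
  "type_data": {
    "events": [
      {"time": "2019-03-22 08:47:20.243828", "event": "osdmap:wait_for_readable"},
      {"time": "2019-03-22 08:47:20.243829", "event": "osdmap:wait_for_finished_proposal"},
      {"time": "2019-03-22 08:47:21.064088", "event": "callback retry"},
      {"time": "2019-03-22 08:47:21.064090", "event": "psvc:dispatch"},
      {"time": "2019-03-22 08:47:21.064091", "event": "osdmap:wait_for_readable"}
    ]
  }
}
""")

TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def event_gaps(op):
    """Return (previous_event, next_event, seconds_between) for consecutive events."""
    events = op["type_data"]["events"]
    stamps = [datetime.strptime(e["time"], TIME_FORMAT) for e in events]
    return [
        (events[i]["event"], events[i + 1]["event"],
         (stamps[i + 1] - stamps[i]).total_seconds())
        for i in range(len(events) - 1)
    ]


gaps = event_gaps(OP)
longest = max(gaps, key=lambda g: g[2])
repeats = Counter(e["event"] for e in OP["type_data"]["events"])

print("longest gap:", longest)
print("events seen more than once:", [name for name, n in repeats.items() if n > 1])
```

In this excerpt the longest pause is between "osdmap:wait_for_finished_proposal" and "callback retry", and "osdmap:wait_for_readable" shows up again after the retry. The excerpt is too short to say whether that pattern continues for the rest of the op.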

 

 

 

OSD side:

[root@ip-10-8-33-183 ~]# ceph daemon osd.0 status

{

"cluster_fsid": "5267611a-48f7-4979-823e-84531e104d63",

"osd_fsid": "32d92b43-6333-4c5c-8153-af373ce12e62",

"whoami": 0,

"state": "booting",

"oldest_map": 1,

"newest_map": 17,

"num_pgs": 200

}
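
As a cross-check, the values the OSD reports here can be compared with what the monitors reported earlier in this message. This Python sketch is an illustration, not a diagnosis: the `MON_VIEW` dictionary is a hand-built summary of the `ceph -s` and `ceph osd dump` output above, and the function name `compare` is hypothetical.

```python
import json

# Copied from the `ceph daemon osd.0 status` output above.
OSD_STATUS = json.loads("""
{
  "cluster_fsid": "5267611a-48f7-4979-823e-84531e104d63",
  "osd_fsid": "32d92b43-6333-4c5c-8153-af373ce12e62",
  "whoami": 0,
  "state": "booting",
  "oldest_map": 1,
  "newest_map": 17,
  "num_pgs": 200
}
""")

# Hand-built summary of the monitor-side output in this message (assumed layout).
MON_VIEW = {
    "fsid": "5267611a-48f7-4979-823e-84531e104d63",  # from `ceph osd dump`
    "osdmap_epoch": 7,                               # "epoch 7" in `ceph osd dump`
    "pgs": 0,                                        # "0 pools, 0 pgs" in `ceph -s`
}


def compare(osd_status, mon_view):
    """List differences between the OSD's own view and the monitor-side view."""
    findings = []
    if osd_status["cluster_fsid"] != mon_view["fsid"]:
        findings.append("cluster fsid differs")
    if osd_status["newest_map"] > mon_view["osdmap_epoch"]:
        findings.append(
            f"OSD newest_map {osd_status['newest_map']} is ahead of "
            f"monitor osdmap epoch {mon_view['osdmap_epoch']}"
        )
    if osd_status["num_pgs"] > mon_view["pgs"]:
        findings.append(
            f"OSD reports {osd_status['num_pgs']} PGs, cluster reports {mon_view['pgs']}"
        )
    return findings


findings = compare(OSD_STATUS, MON_VIEW)
for finding in findings:
    print(finding)
```

With the numbers posted here, the fsid matches, but the OSD's newest_map (17) is ahead of the monitors' osdmap epoch (7), and the OSD reports 200 PGs while the cluster reports none. Whether either difference is related to the stuck boot is not established in this thread.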

 

Vincent


