New OSD node failing quickly after startup.

Hi all,

A freshly built Ceph OSD node is having an issue with its OSD daemons failing quickly after startup.

All nodes are on Ceph 14.2.22 and CentOS 7. Any suggestions on how to troubleshoot the issue?

--------
# systemctl status ceph-osd@199 -l
● ceph-osd@199.service - Ceph object storage daemon osd.199
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled-runtime; vendor preset: disabled)
   Active: inactive (dead) since Sat 2021-08-21 20:34:37 PDT; 1min 3s ago
  Process: 40533 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
  Process: 40526 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
Main PID: 40533 (code=exited, status=0/SUCCESS)

Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f744a338700 -1 osd.199 122252 *** Immediate shutdown (osd_fast_shutdown=true) ***
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f7437b13700 10 osd.199 122252 heartbeat_reset closing (old) failed hb con 0x5560cbe7e400
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f7437b13700 10 osd.199 122252 heartbeat_reset closing (old) failed hb con 0x5560cbe7e800
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f742eb01700 10 osd.199 pg_epoch: 122252 pg[12.2ecs5( v 114361'44811 (114345'41811,114361'44811] lb MIN (bitwise) local-lis/les=114582/114584 n=3399 ec=16432/16432 lis/c 118411/118406 les/c/f 118412/118407/0 122021/122021/113529) [141,452,415,433,42,470,427,104,126,251,437]p141(0) r=-1 lpr=122251 pi=[114323,122021)/5 crt=114361'44811 lcod 0'0 unknown NOTIFY mbc={}] take_waiters
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f742eb01700 20 osd.199 pg_epoch: 122252 pg[12.2ecs5( v 114361'44811 (114345'41811,114361'44811] lb MIN (bitwise) local-lis/les=114582/114584 n=3399 ec=16432/16432 lis/c 118411/118406 les/c/f 118412/118407/0 122021/122021/113529) [141,452,415,433,42,470,427,104,126,251,437]p141(0) r=-1 lpr=122251 pi=[114323,122021)/5 crt=114361'44811 lcod 0'0 unknown NOTIFY mbc={}] handle_activate_map: Not dirtying info: last_persisted is 122220 while current is 122252
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f742eb01700 10 log is not dirty
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f742eb01700 10 osd.199 pg_epoch: 122252 pg[12.2ecs5( v 114361'44811 (114345'41811,114361'44811] lb MIN (bitwise) local-lis/les=114582/114584 n=3399 ec=16432/16432 lis/c 118411/118406 les/c/f 118412/118407/0 122021/122021/113529) [141,452,415,433,42,470,427,104,126,251,437]p141(0) r=-1 lpr=122251 pi=[114323,122021)/5 crt=114361'44811 lcod 0'0 unknown NOTIFY mbc={}] do_peering_event: epoch_sent: 122252 epoch_requested: 122252 NullEvt
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f742eb01700 10 log is not dirty
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f742eb01700 20 osd.199 122252 dispatch_context not up in osdmap
Aug 21 20:34:37 ceph-host14 ceph-osd[40533]: 2021-08-21 20:34:37.397 7f742eb01700 20 osd.199 op_wq(3) _process empty q, waiting

--------
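A rough sketch of what I intend to check next, in case anyone has better suggestions. Note that osd.199 is just the example OSD above, the log path assumes the default cluster name "ceph", and the debug values are guesses on my part rather than settings I know to be right:

--------
# Full journal for the failing OSD around the time of the crash
journalctl -u ceph-osd@199 --since "2021-08-21 20:30" --no-pager

# The OSD's own log file (assuming the default cluster name "ceph")
less /var/log/ceph/ceph-osd.199.log

# Raise OSD debug logging before the next start attempt by adding to the
# [osd] section of /etc/ceph/ceph.conf (debug osd = 20 is very verbose):
#   debug osd = 20
#   debug ms = 1

# How the rest of the cluster currently sees this OSD / host
ceph osd tree
ceph osd dump | grep osd.199
ceph health detail

# Confirm the LVs/devices on this fresh node were activated as expected
ceph-volume lvm list
--------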

Thanks in advance!

Philip