Re: OSD booting gets stuck after log_to_monitors step

Dear experts,
Sorry, I forgot to mention that the initial symptom was that those OSDs reported "wait_auth_rotating timed out" and "unable to obtain rotating service keys; retrying". I then increased rotating_keys_bootstrap_timeout, but it didn't really help.


Best regards,
Felix Lee ~

On 12/1/22 11:43, Felix Lee wrote:
Dear experts,
Recently, we suffered a network problem caused by a switch hardware failure, which took a large number of OSDs offline. After the network recovered, several OSDs were unable to rejoin, leaving some PGs down or unknown. Restarting the OSDs doesn't help: they get stuck during the boot process, specifically after the "log_to_monitors {default=true}" step. I have no idea how to debug this issue. Setting debug_osd to 20 only shows the following:


2022-12-01T03:32:13.065+0000 7f37ed19e700  5 osd.13 1045417 heartbeat osd_stat(store_statfs(0x198fa330000/0xd13420000/0x1b4a39191000, data 0x1941ad7e6764/0x19a42ba30000, compress 0x0/0x0/0x0, omap 0x5e4f268f, meta 0xcb4f2d971), peers [] op hist [])
2022-12-01T03:32:13.066+0000 7f37ed19e700 20 osd.13 1045417 check_full_status cur ratio 0.941459, physical ratio 0.941459, new state nearfull
2022-12-01T03:32:13.068+0000 7f3808a53700 10 osd.13 1045417 tick
2022-12-01T03:32:13.068+0000 7f3808a53700 10 osd.13 1045417 do_waiters -- start
2022-12-01T03:32:13.068+0000 7f3808a53700 10 osd.13 1045417 do_waiters -- finish
2022-12-01T03:32:13.068+0000 7f3808a53700 20 osd.13 1045417 tick last_purged_snaps_scrub 2022-11-30T10:11:21.878157+0000 next 2022-12-01T10:11:21.878157+0000
2022-12-01T03:32:13.342+0000 7f38071d2700 10 osd.13 1045417 tick_without_osd_lock
2022-12-01T03:32:14.065+0000 7f3808a53700 10 osd.13 1045417 tick
2022-12-01T03:32:14.065+0000 7f3808a53700 10 osd.13 1045417 do_waiters -- start
2022-12-01T03:32:14.065+0000 7f3808a53700 10 osd.13 1045417 do_waiters -- finish
2022-12-01T03:32:14.065+0000 7f3808a53700 20 osd.13 1045417 tick last_purged_snaps_scrub 2022-11-30T10:11:21.878157+0000 next 2022-12-01T10:11:21.878157+0000
2022-12-01T03:32:14.353+0000 7f38071d2700 10 osd.13 1045417 tick_without_osd_lock
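
In case it helps anyone reading the excerpt: each entry starts with a timestamp, a thread id, and a debug level, and when pasted into mail several entries can end up run together on one line. Below is a minimal Python sketch (not a Ceph tool) that splits such text back into entries and pulls out the check_full_status ratio. The function names and regular expressions are my own assumptions based only on the line format shown above; SAMPLE is two of the entries above joined by a space.

```python
import re

# An entry is assumed to start with a millisecond timestamp, a hex thread id
# and a numeric debug level, e.g. "2022-12-01T03:32:13.066+0000 7f37ed19e700 20 ".
# Timestamps with microseconds inside a message do not match this pattern.
ENTRY_START = re.compile(
    r"(?=\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}\+\d{4} [0-9a-f]+ +\d+ )"
)
ENTRY = re.compile(
    r"(?P<ts>\S+) (?P<thread>[0-9a-f]+) +(?P<level>\d+) (?P<msg>.*)", re.S
)
FULL = re.compile(
    r"check_full_status cur ratio (?P<cur>[\d.]+), "
    r"physical ratio (?P<phys>[\d.]+), new state (?P<state>\w+)"
)


def split_entries(text):
    """Split run-together log text into dicts with ts, thread, level, msg."""
    entries = []
    for part in ENTRY_START.split(text):
        part = part.strip()
        match = ENTRY.match(part) if part else None
        if match:  # skip empty pieces and anything that is not an entry
            entries.append(match.groupdict())
    return entries


def full_status(entries):
    """Return (cur_ratio, state) from the last check_full_status entry, or None."""
    found = None
    for entry in entries:
        match = FULL.search(entry["msg"])
        if match:
            found = (float(match.group("cur")), match.group("state"))
    return found


# Two entries from the excerpt above, joined by a space.
SAMPLE = (
    "2022-12-01T03:32:13.066+0000 7f37ed19e700 20 osd.13 1045417 "
    "check_full_status cur ratio 0.941459, physical ratio 0.941459, "
    "new state nearfull "
    "2022-12-01T03:32:13.068+0000 7f3808a53700 10 osd.13 1045417 tick"
)

if __name__ == "__main__":
    for entry in split_entries(SAMPLE):
        print(entry["ts"], entry["level"], entry["msg"])
    print(full_status(split_entries(SAMPLE)))
```

Read this way, the excerpt shows only periodic tick and heartbeat activity (with an empty peers list) and a nearfull state at a ratio of about 0.94.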


The ceph version is Octopus: 15.2.17.
OSD storage backend: bluestore
OS: CentOS7 64bit.

Any idea?


Thanks
&
Best regards,
Felix Lee ~

--
Felix H.T Lee                           Academia Sinica Grid & Cloud.
Tel: +886-2-27898308
Office: Room P111, Institute of Physics, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan

