Thanks Hu Weiwen,

These hosts and drives are perhaps 2 months old or so, and this is the first cluster we have built on them, so I was not anticipating a drive issue already.

The smartmontools output shows:

root@<HOST>:~# smartctl -H /dev/sdag
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Health Status: FIRMWARE IMPENDING FAILURE TOO MANY BLOCK REASSIGNS [asc=5d, ascq=64]

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>

root@<HOST>:~# smartctl -H /dev/sdah
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.11.0-38-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

On Wed, Oct 27, 2021 at 1:26 PM 胡 玮文 <huww98@xxxxxxxxxxx> wrote:

> Hi Marco, the log lines are truncated. I recommend sending the logs to
> a file rather than copying them from the terminal:
>
> cephadm logs --name osd.13 > osd.13.log
>
> I see “read stalled” in the log. Just a guess: can you check the kernel
> logs and the SMART info to see if there is something wrong with this disk?
> Maybe also run a self-test.
>
> Sent from Mail for Windows <https://go.microsoft.com/fwlink/?LinkId=550986>
>
> *From: *Marco Pizzolo <marcopizzolo@xxxxxxxxx>
> *Sent: *October 28, 2021 1:17
> *To: *胡 玮文 <huww98@xxxxxxxxxxx>
> *Cc: *ceph-users <ceph-users@xxxxxxx>
> *Subject: *Re: 16.2.6 OSD down, out but container running....
>
> Is there any command or log I can provide a sample from that would help to
> pinpoint the issue? 119 of the 120 OSDs are working correctly by all
> accounts, but I am just unable to bring the last one fully online.
>
> Thank you,
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx