Re: cephadm upgrade 16.2.10 to 16.2.11: osds crash and get stuck restarting

Hi Konstantin,

Many thanks for your response! That is the odd part: the logs on both
hosts do not indicate that anything happened to any devices at all, whether
or not they back the OSDs which failed to start. The only useful message
was from the OSD debug logs:

"debug     -3> 2023-01-25T23:07:52.990+0000 7f7c8b66e700 -1
bdev(0x5619bf0ce400 /var/lib/ceph/osd/ceph-3/block) _aio_thread got r=-1
((1) Operation not permitted)".

It was the same for all 3 affected OSDs on 2 different hosts, and it went
away after I rebooted the hosts. After the reboot I restarted these OSDs
multiple times without any issues.
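
For anyone who finds this thread with the same symptom, here is a minimal
sketch for pulling the device path and error code out of such a line. It
assumes the log format matches the single sample quoted above; the function
name and the regular expression are illustrative, not part of Ceph.

```python
import errno
import os
import re

# Pattern for the bdev aio failure line quoted above.
# Assumption: other occurrences follow the same layout as this one sample.
AIO_ERROR = re.compile(
    r"bdev\((?P<handle>0x[0-9a-f]+) (?P<path>\S+)\)\s+"
    r"_aio_thread got r=(?P<r>-?\d+)"
)


def decode_aio_error(line):
    """Return (device path, errno name, errno message), or None if no match."""
    match = AIO_ERROR.search(line)
    if match is None:
        return None
    # The logged value is a negative errno; take the absolute value to look it up.
    code = abs(int(match.group("r")))
    name = errno.errorcode.get(code, "UNKNOWN")
    return match.group("path"), name, os.strerror(code)


# The sample line from this thread, with the mail line wrapping removed.
sample = (
    "debug     -3> 2023-01-25T23:07:52.990+0000 7f7c8b66e700 -1 "
    "bdev(0x5619bf0ce400 /var/lib/ceph/osd/ceph-3/block) "
    "_aio_thread got r=-1 ((1) Operation not permitted)"
)

print(decode_aio_error(sample))
```

Running this over the logs of each affected OSD should make it easy to check
whether every failure reports the same code (EPERM here) rather than an I/O
error such as EIO.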

/Z

On Thu, 26 Jan 2023 at 05:41, Konstantin Shalygin <k0ste@xxxxxxxx> wrote:

> Hi Zakhar,
>
> On 26 Jan 2023, at 08:33, Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
>
> Jan 25 23:07:53 ceph01 bash[2553123]:
>
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.11/rpm/el8/BUILD/ceph-16.2.11/src/blk/kernel/KernelDevice.cc:
> 604: ceph_abort_msg("Unexpected IO error. This may suggest HW issue. Please
> check your dmesg!")
>
>
> You can check your kernel log for messages via `journalctl -kl
> --since=yesterday` to see what actually happened with this device.
>
>
> k
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


