Hi Fulvio,

The symptom of several OSDs all asserting at the same time in
OSDMap::get_map really sounds like this bug:
https://tracker.ceph.com/issues/39525

lz4 compression is buggy on CentOS 7 and Ubuntu 18.04 -- you need to
disable compression or use a different algorithm. Mimic and Nautilus
will get a workaround, but it is not planned to be backported to
Luminous.

-- Dan

On Thu, May 21, 2020 at 11:18 PM Fulvio Galeazzi <fulvio.galeazzi@xxxxxxx> wrote:
>
> Hello all,
>    hope you can help me with some very strange problems that arose
> suddenly today. I tried searching, including in this mailing list, but
> could not find anything relevant.
>
> At some point today, without any action on my side, I noticed that some
> OSDs in my production cluster would go down and never come back up.
> I am on Luminous 12.2.13, CentOS 7, kernel 3.10. My setup is
> non-standard, as the OSD disks are served off a SAN (which is definitely
> OK now, although I cannot exclude an earlier glitch).
> I tried rebooting the OSD servers a few times, ran "activate --all", and
> added bluestore_ignore_data_csum=true to the [osd] section of
> ceph.conf... The number of "down" OSDs changed for a while but now
> seems rather stable.
>
> There are actually two classes of problems (more details below):
>   - ERROR: osd init failed: (5) Input/output error
>   - failed to load OSD map for epoch 141282, got 0 bytes
>
> *First problem*
> This affects 50 OSDs (all disks of this kind, on all but one server).
> These OSDs are reserved for object storage, but since I am not yet
> using them I could in principle recreate them. Still, I would be
> interested in understanding what the problem is and how to solve it,
> for future reference.
> Here is what I see in the logs:
> .....
> 2020-05-21 21:17:48.661348 7fa2e9a95ec0  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/cephpa1-72/block size 14.5TiB
> 2020-05-21 21:17:48.661428 7fa2e9a95ec0  1 bluefs mount
> 2020-05-21 21:17:48.662040 7fa2e9a95ec0  1 bluefs _init_alloc id 1 alloc_size 0x10000 size 0xe83a3400000
> 2020-05-21 21:52:43.858464 7fa2e9a95ec0 -1 bluefs mount failed to replay log: (5) Input/output error
> 2020-05-21 21:52:43.858589 7fa2e9a95ec0  1 fbmap_alloc 0x55c6bba92e00 shutdown
> 2020-05-21 21:52:43.858728 7fa2e9a95ec0 -1 bluestore(/var/lib/ceph/osd/cephpa1-72) _open_db failed bluefs mount: (5) Input/output error
> 2020-05-21 21:52:43.858790 7fa2e9a95ec0  1 bdev(0x55c6bbdb6600 /var/lib/ceph/osd/cephpa1-72/block) close
> 2020-05-21 21:52:44.103536 7fa2e9a95ec0  1 bdev(0x55c6bbdb8600 /var/lib/ceph/osd/cephpa1-72/block) close
> 2020-05-21 21:52:44.352899 7fa2e9a95ec0 -1 osd.72 0 OSD:init: unable to mount object store
> 2020-05-21 21:52:44.352956 7fa2e9a95ec0 -1 ** ERROR: osd init failed: (5) Input/output error
>
> *Second problem*
> This affects 11 OSDs, which I use *in production* for Cinder block
> storage; it looks like all PGs for this pool are currently OK.
> Here is an excerpt from the logs:
> .....
>     -5> 2020-05-21 20:52:06.756469 7fd2ccc19ec0  0 _get_class not permitted to load kvs
>     -4> 2020-05-21 20:52:06.759686 7fd2ccc19ec0  1 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/rgw/cls_rgw.cc:3869: Loaded rgw class!
>     -3> 2020-05-21 20:52:06.760021 7fd2ccc19ec0  1 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/log/cls_log.cc:299: Loaded log class!
>     -2> 2020-05-21 20:52:06.760730 7fd2ccc19ec0  1 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/replica_log/cls_replica_log.cc:135: Loaded replica log class!
>     -1> 2020-05-21 20:52:06.760873 7fd2ccc19ec0 -1 osd.63 0 failed to load OSD map for epoch 141282, got 0 bytes
>      0> 2020-05-21 20:52:06.763277 7fd2ccc19ec0 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7fd2ccc19ec0 time 2020-05-21 20:52:06.760916
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/osd/OSD.h: 994: FAILED assert(ret)
>
> Has anyone any idea how I could fix these problems, or what I could do
> to try to shed some light on them? Also, what caused them, and is there
> some magic configuration flag I could use to protect my cluster?
>
> Thanks a lot for your help!
>
>            Fulvio
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
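For readers who hit the same assert and want to follow Dan's advice to move away
from lz4: the sketch below shows one way to check which compression algorithm the
OSDs and pools are using on a Luminous cluster and to switch it off or to another
algorithm. These are the generic BlueStore/pool compression knobs
(bluestore_compression_algorithm, bluestore_compression_mode, and the per-pool
compression_algorithm / compression_mode options); whether they are the exact
setting implicated in https://tracker.ceph.com/issues/39525 should be confirmed
against the tracker, so treat this as a starting point rather than the definitive
workaround. The osd.63 id is taken from the log above and <pool-name> is a
placeholder.

  # Inspect the compression settings an OSD is actually running with
  # (run on the host where that OSD lives; uses the admin socket).
  ceph daemon osd.63 config show | grep compression

  # Inspect per-pool compression overrides (these error out if never set).
  ceph osd pool get <pool-name> compression_algorithm
  ceph osd pool get <pool-name> compression_mode

  # Either switch the pool to a different algorithm, e.g. snappy ...
  ceph osd pool set <pool-name> compression_algorithm snappy
  # ... or disable compression for new writes on that pool entirely.
  ceph osd pool set <pool-name> compression_mode none

  # Cluster-wide defaults go in ceph.conf under [osd] (OSD restart required):
  #   bluestore_compression_algorithm = snappy
  #   bluestore_compression_mode = none

Note that changing these options only affects newly written data; anything already
compressed on disk stays as it is until rewritten.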