Re: Luminous, OSDs down: "osd init failed" and "failed to load OSD map for epoch ... got 0 bytes"


 



The procedure to overwrite a corrupted osdmap on a given osd is
described at http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-August/036592.html
I wouldn't do that type of low-level manipulation just yet -- better
to understand the root cause of the corruption first before
potentially making things worse.
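For reference, the linked procedure essentially amounts to fetching a
good copy of the affected osdmap epoch and injecting it into the broken
OSD with ceph-objectstore-tool. A rough sketch follows -- the osd ids,
data paths and epoch below are placeholders taken from this thread, not
tested commands, so check them against the linked post before doing
anything:

```shell
# Sketch only -- do NOT run before the root cause is understood.
# osd id (63), data path and epoch (141282) are placeholders.

# Stop the broken OSD first:
systemctl stop ceph-osd@63

# Fetch a known-good copy of the osdmap epoch from the monitors:
ceph --cluster cephpa1 osd getmap 141282 -o /tmp/osdmap.141282

# Inject it into the OSD whose on-disk copy is corrupted:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-63 \
    --op set-osdmap --file /tmp/osdmap.141282

systemctl start ceph-osd@63
```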

There is also this issue which seems related:
https://tracker.ceph.com/issues/24423
(It has a fix in mimic and nautilus).

Could you share some more logs, e.g. the full backtrace from when the
OSDs first crashed, and from the current failed starts? And maybe
/var/log/messages shows crc mismatches?
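Something along these lines would help, assuming systemd units named
ceph-osd@<id> (the osd id and date below are just placeholders):

```shell
# Look for checksum-related errors around the time of the first crash:
grep -i 'crc' /var/log/messages

# And pull the OSD's own log for the same window (osd 63 as an example):
journalctl -u ceph-osd@63 --since '2020-05-21' | grep -iE 'crc|csum|checksum|assert'
```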

-- dan


On Fri, May 22, 2020 at 1:02 PM Fulvio Galeazzi <fulvio.galeazzi@xxxxxxx> wrote:
>
> Hallo Dan, thanks for your reply! Very good to know about compression...
> will not try to use it before upgrading to Nautilus.
>
> Problem is, I did not activate it on this cluster (see below).
> Moreover, that would only account for the issue on disks dedicated to
> object storage, if I understand it correctly.

The symptoms of your "*Second problem*" look similar: failing to load
an osdmap. That said, I believe that lz4 bug might only corrupt an
osdmap when compression is enabled for the whole OSD, i.e. with
bluestore_compression_mode=aggressive, not when compression is enabled
just for specific pools.
So if you didn't have that enabled, then your bug must be something else.

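To rule OSD-wide compression out, you could check what the OSDs
themselves see for that option (a sketch; osd.63 is a placeholder, and
the second command only works while the daemon is down if you read the
config file instead of the admin socket):

```shell
# Via the admin socket, for a running OSD:
ceph --cluster cephpa1 daemon osd.63 config get bluestore_compression_mode

# Or straight from the config file, for a down OSD:
ceph-conf --cluster cephpa1 --name osd.63 --lookup bluestore_compression_mode
```

If both report "none" (or nothing at all), the lz4 osdmap-corruption
path should not apply.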

> Another important info which I forgot in the previous message: my SAN is
> actually composed of 6 independent chains, such that a glitch at the SAN
> level is hardly the culprit, while something along the lines of the bug
> you pointed me to sounds more reasonable.
>
> Also, how should I deal with the "failed to load OSD map" errors?
>
>    Thanks!
>
>                         Fulvio
>
>
> [root@r1srv05.pa1 ~]# grep compress /etc/ceph/cephpa1.conf
> [root@r1srv05.pa1 ~]#
> [root@r1srv05.pa1 ~]# ceph --cluster cephpa1 osd pool ls | sort | xargs
> -i ceph --cluster cephpa1 osd pool get {} compression_mode
> Error ENOENT: option 'compression_mode' is not set on pool 'cephfs_data'
> Error ENOENT: option 'compression_mode' is not set on pool 'cephfs_metadata'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'cinder-ceph-ec-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'cinder-ceph-ec-pa1-cl1-cache'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'cinder-ceph-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'cinder-ceph-pa1-devel'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'cinder-ceph-rr-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'default.rgw.buckets.data'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'default.rgw.buckets.index'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'default.rgw.control'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'default.rgw.data.root'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.gc'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.log'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'default.rgw.users.keys'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'default.rgw.users.uid'
> Error ENOENT: option 'compression_mode' is not set on pool 'ec-pool-fg'
> Error ENOENT: option 'compression_mode' is not set on pool 'glance-ct1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool 'glance-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'glance-pa1-devel'
> Error ENOENT: option 'compression_mode' is not set on pool 'gnocchi-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool 'iscsivolumes'
> Error ENOENT: option 'compression_mode' is not set on pool 'k8s'
> Error ENOENT: option 'compression_mode' is not set on pool 'rbd'
> Error ENOENT: option 'compression_mode' is not set on pool '.rgw.root'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.intent-log'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.log'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.rgw.buckets'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.rgw.buckets.extra'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.rgw.buckets.index'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.rgw.control'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw.gc'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.rgw.root'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.usage'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.users'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.users.email'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.users.swift'
> Error ENOENT: option 'compression_mode' is not set on pool
> 'testrgw.users.uid'
>
>
> On 5/22/2020 8:50 AM, Dan van der Ster wrote:
> > Hi Fulvio,
> >
> > The symptom of several osd's all asserting at the same time in
> > OSDMap::get_map really sounds like this bug:
> > https://tracker.ceph.com/issues/39525
> >
> > lz4 compression is buggy on CentOS 7 and Ubuntu 18.04 -- you need to
> > disable compression or use a different algorithm. Mimic and nautilus
> > will get a workaround, but it's not planned to be backported to
> > luminous.
> >
> > -- Dan
> >
> > On Thu, May 21, 2020 at 11:18 PM Fulvio Galeazzi
> > <fulvio.galeazzi@xxxxxxx> wrote:
> >>
> >> Hallo all,
> >>        hope you can help me with very strange problems which arose
> >> suddenly today. Tried to search, also in this mailing list, but could
> >> not find anything relevant.
> >>
> >> At some point today, without any action from my side, I noticed some
> >> OSDs in my production cluster would go down and never come up.
> >> I am on Luminous 12.2.13, CentOS7, kernel 3.10: my setup is non-standard
> >> as OSD disks are served off a SAN (which is for sure OK now, although I
> >> cannot exclude some glitch).
> >> Tried to reboot OSD servers a few times, ran "activate --all", added
> >> bluestore_ignore_data_csum=true in the [osd] section in ceph.conf...
> >> the number of "down" OSDs changed for a while but now seems rather stable.
> >>
> >>
> >> There are actually two classes of problems (a bit more detail right below):
> >> - ERROR: osd init failed: (5) Input/output error
> >> - failed to load OSD map for epoch 141282, got 0 bytes
> >>
> >>
> >> *First problem*
> >> This affects 50 OSDs (all disks of this kind, on all but one server):
> >> these OSDs are reserved for object storage but I am not yet using them
> >> so I may in principle recreate them. But would be interested in
> >> understanding what the problem is, and learn how to solve it for future
> >> reference.
> >> Here is what I see in logs:
> >> .....
> >> 2020-05-21 21:17:48.661348 7fa2e9a95ec0  1 bluefs add_block_device bdev
> >> 1 path /var/lib/ceph/osd/cephpa1-72/block size 14.5TiB
> >> 2020-05-21 21:17:48.661428 7fa2e9a95ec0  1 bluefs mount
> >> 2020-05-21 21:17:48.662040 7fa2e9a95ec0  1 bluefs _init_alloc id 1
> >> alloc_size 0x10000 size 0xe83a3400000
> >> 2020-05-21 21:52:43.858464 7fa2e9a95ec0 -1 bluefs mount failed to replay
> >> log: (5) Input/output error
> >> 2020-05-21 21:52:43.858589 7fa2e9a95ec0  1 fbmap_alloc 0x55c6bba92e00
> >> shutdown
> >> 2020-05-21 21:52:43.858728 7fa2e9a95ec0 -1
> >> bluestore(/var/lib/ceph/osd/cephpa1-72) _open_db failed bluefs mount:
> >> (5) Input/output error
> >> 2020-05-21 21:52:43.858790 7fa2e9a95ec0  1 bdev(0x55c6bbdb6600
> >> /var/lib/ceph/osd/cephpa1-72/block) close
> >> 2020-05-21 21:52:44.103536 7fa2e9a95ec0  1 bdev(0x55c6bbdb8600
> >> /var/lib/ceph/osd/cephpa1-72/block) close
> >> 2020-05-21 21:52:44.352899 7fa2e9a95ec0 -1 osd.72 0 OSD:init: unable to
> >> mount object store
> >> 2020-05-21 21:52:44.352956 7fa2e9a95ec0 -1 ESC[0;31m ** ERROR: osd init
> >> failed: (5) Input/output errorESC[0m
> >>
> >> *Second problem*
> >> This affects 11 OSDs, which I use *in production* for Cinder block
> >> storage: looks like all PGs for this pool are currently OK.
> >> Here is the excerpt from the logs.
> >> .....
> >>        -5> 2020-05-21 20:52:06.756469 7fd2ccc19ec0  0 _get_class not
> >> permitted to load kvs
> >>        -4> 2020-05-21 20:52:06.759686 7fd2ccc19ec0  1 <cls>
> >> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/rgw/cls_rgw.cc:3869:
> >>
> >> Loaded rgw class!
> >>        -3> 2020-05-21 20:52:06.760021 7fd2ccc19ec0  1 <cls>
> >> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/log/cls_log.cc:299:
> >>
> >> Loaded log class!
> >>        -2> 2020-05-21 20:52:06.760730 7fd2ccc19ec0  1 <cls>
> >> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/replica_log/cls_replica_log.cc:135:
> >>
> >> Loaded replica log class!
> >>        -1> 2020-05-21 20:52:06.760873 7fd2ccc19ec0 -1 osd.63 0 failed to
> >> load OSD map for epoch 141282, got 0 bytes
> >>         0> 2020-05-21 20:52:06.763277 7fd2ccc19ec0 -1
> >> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/osd/OSD.h:
> >>
> >> In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7fd2ccc19ec0
> >> time 2020-05-21 20:52:06.760916
> >> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/osd/OSD.h:
> >>
> >> 994: FAILED assert(ret)
> >>
> >>      Has anyone any idea how I could fix these problems, or what I could
> >> do to try and shed some light? And also, what caused them, and whether
> >> there is some magic configuration flag I could use to protect my cluster?
> >>
> >>      Thanks a lot for your help!
> >>
> >>                Fulvio
> >>
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
> --
> Fulvio Galeazzi
> GARR-CSD Department
> skype: fgaleazzi70
> tel.: +39-334-6533-250
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


