The procedure to overwrite a corrupted osdmap on a given osd is described at
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-August/036592.html
I wouldn't do that type of low-level manipulation just yet -- better to
understand the root cause of the corruptions first, before potentially
making things worse.

There is also this issue which seems related:
https://tracker.ceph.com/issues/24423 (it has a fix in mimic and nautilus).

Could you share some more logs, e.g. the full backtrace from when the OSDs
first crashed, and from when they now fail to start? And maybe
/var/log/messages shows crc mismatches?
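Something along these lines would pull out what I'm after -- untested, and
the osd id and log file name are only examples (assuming the usual
$cluster-osd.$id.log naming):

  # full backtrace from one of the crashed OSDs (osd.63 as an example)
  grep -B5 -A40 'FAILED assert' /var/log/ceph/cephpa1-osd.63.log

  # checksum / crc complaints from the kernel or bluestore
  grep -iE 'crc|csum|checksum' /var/log/messages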
-- dan

On Fri, May 22, 2020 at 1:02 PM Fulvio Galeazzi <fulvio.galeazzi@xxxxxxx> wrote:
>
> Hallo Dan, thanks for your reply! Very good to know about compression...
> will not try to use it before upgrading to Nautilus.
>
> Problem is, I did not activate it on this cluster (see below).
> Moreover, that would only account for the issue on disks dedicated to
> object storage, if I understand it correctly.

The symptoms of your "*Second problem*" look similar: failing to load an
osdmap. TBH, though, I believe that lz4 bug might only corrupt an osdmap
when compression is enabled for the whole OSD, i.e. with
bluestore_compression_mode=aggressive, not when compression is enabled just
for specific pools. Alas, if you didn't have that enabled, then your bug
must be something else.
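To double-check the OSD-wide setting (the per-pool output further down only
shows pool-level overrides), something like this should work -- untested,
and the osd ids are just examples:

  # effective value on a still-running OSD, via its admin socket
  ceph --cluster cephpa1 daemon osd.10 config get bluestore_compression_mode
  ceph --cluster cephpa1 daemon osd.10 config get bluestore_compression_algorithm

  # and what the config file would hand to one of the crashed OSDs
  ceph-conf --cluster cephpa1 --name osd.63 --lookup bluestore_compression_mode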
> Another important info which I forgot in the previous message: my SAN is
> actually composed of 6 independent chains, such that a glitch at the SAN
> level is hardly the culprit, while something along the lines of the bug
> you pointed me to sounds more reasonable.
>
> Also, how to deal with the "failed to load OSD map"?
>
> Thanks!
>
>     Fulvio
>
> [root@r1srv05.pa1 ~]# grep compress /etc/ceph/cephpa1.conf
> [root@r1srv05.pa1 ~]#
> [root@r1srv05.pa1 ~]# ceph --cluster cephpa1 osd pool ls | sort | xargs -i ceph --cluster ceph osd pool get {} compression_mode
> Error ENOENT: option 'compression_mode' is not set on pool 'cephfs_data'
> Error ENOENT: option 'compression_mode' is not set on pool 'cephfs_metadata'
> Error ENOENT: option 'compression_mode' is not set on pool 'cinder-ceph-ec-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool 'cinder-ceph-ec-pa1-cl1-cache'
> Error ENOENT: option 'compression_mode' is not set on pool 'cinder-ceph-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool 'cinder-ceph-pa1-devel'
> Error ENOENT: option 'compression_mode' is not set on pool 'cinder-ceph-rr-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.buckets.data'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.buckets.index'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.control'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.data.root'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.gc'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.log'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.users.keys'
> Error ENOENT: option 'compression_mode' is not set on pool 'default.rgw.users.uid'
> Error ENOENT: option 'compression_mode' is not set on pool 'ec-pool-fg'
> Error ENOENT: option 'compression_mode' is not set on pool 'glance-ct1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool 'glance-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool 'glance-pa1-devel'
> Error ENOENT: option 'compression_mode' is not set on pool 'gnocchi-pa1-cl1'
> Error ENOENT: option 'compression_mode' is not set on pool 'iscsivolumes'
> Error ENOENT: option 'compression_mode' is not set on pool 'k8s'
> Error ENOENT: option 'compression_mode' is not set on pool 'rbd'
> Error ENOENT: option 'compression_mode' is not set on pool '.rgw.root'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.intent-log'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.log'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw.buckets'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw.buckets.extra'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw.buckets.index'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw.control'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw.gc'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.rgw.root'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.usage'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.users'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.users.email'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.users.swift'
> Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.users.uid'
>
> On 5/22/2020 8:50 AM, Dan van der Ster wrote:
> > Hi Fulvio,
> >
> > The symptom of several osd's all asserting at the same time in
> > OSDMap::get_map really sounds like this bug:
> > https://tracker.ceph.com/issues/39525
> >
> > lz4 compression is buggy on CentOS 7 and Ubuntu 18.04 -- you need to
> > disable compression or use a different algorithm. Mimic and nautilus
> > will get a workaround, but it's not planned to be backported to
> > luminous.
> >
> > -- Dan
> >
> > On Thu, May 21, 2020 at 11:18 PM Fulvio Galeazzi
> > <fulvio.galeazzi@xxxxxxx> wrote:
> >>
> >> Hallo all,
> >>      hope you can help me with very strange problems which arose
> >> suddenly today. Tried to search, also in this mailing list, but could
> >> not find anything relevant.
> >>
> >> At some point today, without any action from my side, I noticed some
> >> OSDs in my production cluster would go down and never come up.
> >> I am on Luminous 12.2.13, CentOS7, kernel 3.10: my setup is non-standard
> >> as OSD disks are served off a SAN (which is for sure OK now, although I
> >> cannot exclude some glitch).
> >> Tried to reboot OSD servers a few times, ran "activate --all", added
> >> bluestore_ignore_data_csum=true in the [osd] section in ceph.conf...
> >> the number of "down" OSDs changed for a while but now seems rather stable.
> >>
> >> There are actually two classes of problems (bit more details right below):
> >>   - ERROR: osd init failed: (5) Input/output error
> >>   - failed to load OSD map for epoch 141282, got 0 bytes
> >>
> >> *First problem*
> >> This affects 50 OSDs (all disks of this kind, on all but one server):
> >> these OSDs are reserved for object storage but I am not yet using them
> >> so I may in principle recreate them. But would be interested in
> >> understanding what the problem is, and learn how to solve it for future
> >> reference.
> >> Here is what I see in logs:
> >> .....
> >> 2020-05-21 21:17:48.661348 7fa2e9a95ec0  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/cephpa1-72/block size 14.5TiB
> >> 2020-05-21 21:17:48.661428 7fa2e9a95ec0  1 bluefs mount
> >> 2020-05-21 21:17:48.662040 7fa2e9a95ec0  1 bluefs _init_alloc id 1 alloc_size 0x10000 size 0xe83a3400000
> >> 2020-05-21 21:52:43.858464 7fa2e9a95ec0 -1 bluefs mount failed to replay log: (5) Input/output error
> >> 2020-05-21 21:52:43.858589 7fa2e9a95ec0  1 fbmap_alloc 0x55c6bba92e00 shutdown
> >> 2020-05-21 21:52:43.858728 7fa2e9a95ec0 -1 bluestore(/var/lib/ceph/osd/cephpa1-72) _open_db failed bluefs mount: (5) Input/output error
> >> 2020-05-21 21:52:43.858790 7fa2e9a95ec0  1 bdev(0x55c6bbdb6600 /var/lib/ceph/osd/cephpa1-72/block) close
> >> 2020-05-21 21:52:44.103536 7fa2e9a95ec0  1 bdev(0x55c6bbdb8600 /var/lib/ceph/osd/cephpa1-72/block) close
> >> 2020-05-21 21:52:44.352899 7fa2e9a95ec0 -1 osd.72 0 OSD:init: unable to mount object store
> >> 2020-05-21 21:52:44.352956 7fa2e9a95ec0 -1  ** ERROR: osd init failed: (5) Input/output error
> >>
> >> *Second problem*
> >> This affects 11 OSDs, which I use *in production* for Cinder block
> >> storage: looks like all PGs for this pool are currently OK.
> >> Here is the excerpt from the logs.
> >> .....
> >>     -5> 2020-05-21 20:52:06.756469 7fd2ccc19ec0  0 _get_class not permitted to load kvs
> >>     -4> 2020-05-21 20:52:06.759686 7fd2ccc19ec0  1 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/rgw/cls_rgw.cc:3869: Loaded rgw class!
> >>     -3> 2020-05-21 20:52:06.760021 7fd2ccc19ec0  1 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/log/cls_log.cc:299: Loaded log class!
> >>     -2> 2020-05-21 20:52:06.760730 7fd2ccc19ec0  1 <cls> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/cls/replica_log/cls_replica_log.cc:135: Loaded replica log class!
> >>     -1> 2020-05-21 20:52:06.760873 7fd2ccc19ec0 -1 osd.63 0 failed to load OSD map for epoch 141282, got 0 bytes
> >>      0> 2020-05-21 20:52:06.763277 7fd2ccc19ec0 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/osd/OSD.h: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7fd2ccc19ec0 time 2020-05-21 20:52:06.760916
> >> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.13/rpm/el7/BUILD/ceph-12.2.13/src/osd/OSD.h: 994: FAILED assert(ret)
> >>
> >> Has anyone any idea how I could fix these problems, or what I could
> >> do to try and shed some light? And also, what caused them, and whether
> >> there is some magic configuration flag I could use to protect my cluster?
> >>
> >> Thanks a lot for your help!
> >>
> >> Fulvio
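For reference, the low-level repair described in the link at the top of this
mail amounts to roughly the following. This is only a sketch: the donor
osd.NN is a placeholder, and the cephpa1-* data paths and epoch 141282 are
taken from your logs. As I said above, I wouldn't run it before the root
cause is understood, and both OSDs need to be stopped first.

  # export a good copy of osdmap epoch 141282 from a healthy OSD
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-NN \
      --op get-osdmap --epoch 141282 --file /tmp/osdmap.141282

  # (or fetch the same epoch from the monitors)
  ceph --cluster cephpa1 osd getmap 141282 -o /tmp/osdmap.141282

  # inject it into the OSD that fails to load that epoch, e.g. osd.63
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-63 \
      --op set-osdmap --epoch 141282 --file /tmp/osdmap.141282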
>
> --
> Fulvio Galeazzi
> GARR-CSD Department
> skype: fgaleazzi70
> tel.: +39-334-6533-250
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx