Re: [Ceph-community] After Mimic upgrade OSD's stuck at booting.

I already shared the debug osd=20 logs.
OSD8: https://www.dropbox.com/s/5e01f5odtsq3iqi/ceph-osd.8.log?dl=0
OSD156: https://www.dropbox.com/s/ox7or2uizyiwdo7/ceph-osd.156.log?dl=0
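
For reference, a rough sketch of how such a log can be captured, following
Sage's suggestion below (this assumes systemd-managed OSDs and the default
log path; osd.8 is just the example ID):

# in /etc/ceph/ceph.conf on the OSD host, under [osd]:
#     debug osd = 20
# since the OSDs never finish booting, the config-file route is the
# reliable one; then restart one OSD and upload its log:
systemctl restart ceph-osd@8
ceph-post-file /var/log/ceph/ceph-osd.8.log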

After downgrading back to 12.2.4, I can't start the OSDs at all now.
I only downgraded 2 mons, 2 mgrs, and 2 OSDs, and I tried to start just
those 2 OSDs.
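
To double-check what each daemon is actually running after this partial
downgrade, the version summary commands are useful (as far as I know these
exist in Luminous and later; "ceph osd versions" output is quoted further
below):

ceph versions        # all daemons in the cluster, grouped by running version
ceph mon versions    # monitors only
ceph osd versions    # OSDs only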

When I start an OSD, all I see is the log below. I had ceph/osd/* backups
taken before the Mimic upgrade, so I tried to restore them, but no luck.
(I have a latest Mimic backup too.)

2018-09-23 19:58:28.524056 7f06a6f9de80 -1 rocksdb: Corruption: Can't
access /000000.sst: NotFound:
2018-09-23 19:58:28.524059 7f06a6f9de80 -1
bluestore(/var/lib/ceph/osd/ceph-0) _open_db erroring opening db:
2018-09-23 19:58:28.524061 7f06a6f9de80  1 bluefs umount
2018-09-23 19:58:28.524065 7f06a6f9de80  1 stupidalloc 0x0x563f551a7ce0 shutdown
2018-09-23 19:58:28.524067 7f06a6f9de80  1 stupidalloc 0x0x563f551a7d50 shutdown
2018-09-23 19:58:28.524068 7f06a6f9de80  1 stupidalloc 0x0x563f551a7dc0 shutdown
2018-09-23 19:58:28.524088 7f06a6f9de80  1 bdev(0x563f54f27680
/var/lib/ceph/osd/ceph-0/block.wal) close
2018-09-23 19:58:28.884125 7f06a6f9de80  1 bdev(0x563f54f27200
/var/lib/ceph/osd/ceph-0/block.db) close
2018-09-23 19:58:29.134113 7f06a6f9de80  1 bdev(0x563f54f27440
/var/lib/ceph/osd/ceph-0/block) close
2018-09-23 19:58:29.404129 7f06a6f9de80  1 bdev(0x563f54f26fc0
/var/lib/ceph/osd/ceph-0/block) close
2018-09-23 19:58:29.644226 7f06a6f9de80 -1 osd.0 0 OSD:init: unable to
mount object store
2018-09-23 19:58:29.644241 7f06a6f9de80 -1  ** ERROR: osd init failed:
(5) Input/output error
2018-09-23 19:58:50.012987 7f530ba2be80  0 set uid:gid to 992:6 (ceph:disk)
2018-09-23 19:58:50.012997 7f530ba2be80  0 ceph version 12.2.4
(52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable), process
(unknown), pid 41841
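
Given the rocksdb "Corruption: Can't access /000000.sst" error above, a
next step worth sketching (I have not verified this on the broken OSDs; it
assumes the default data path and that the OSD is stopped) is a BlueStore
consistency check with ceph-bluestore-tool:

systemctl stop ceph-osd@0                                  # OSD must not be running
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0   # check BlueStore/BlueFS metadata
ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-0 \
    --out-dir /tmp/osd0-bluefs                             # dump RocksDB files for inspection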

I have read a lot of conflicting things: some people say there is no
downgrade option, and some people say downgrade works.
As you can see, my results are not good at all.
I think there is no downgrade option.
Sage Weil <sage@xxxxxxxxxxxx> wrote on Sun, 23 Sep 2018 at 20:33:
>
> On Sun, 23 Sep 2018, morph in wrote:
> > Hello. I upgraded my cluster from Luminous to Mimic.
> > I have 168 OSDs in the cluster. I'm using RAID 1 NVMe for journals, and
> > my pools were healthy before the upgrade.
> > I don't upgrade my systems with package tools like apt or pacman; I use
> > images, so all my OS installs are identical, and the upgrade was done in
> > maintenance mode with the cluster shut down. I tested this upgrade 3
> > times on a test cluster of 2 servers with 12 OSDs.
> > After the upgrade on my prod cluster, the OSDs are still stuck at the
> > booting stage. Rebooting the cluster used to be much faster before Mimic.
> > I followed the Mimic upgrade wiki step by step.
> > ceph -s : https://paste.ubuntu.com/p/p2spVmqvJZ/
> > an osd log: https://paste.ubuntu.com/p/PBG66qdHXc/
>
> If they are still stuck booting, can you turn up the osd debug level
> (debug osd = 20) and restart an osd and capture that log?
> (ceph-post-file /var/log/ceph/ceph-osd.NNN.log).
>
> Thanks!
> sage
>
> > ceph daemon status https://paste.ubuntu.com/p/y7cVspr9cN/
> > 1- Why the hell does "ceph -s" show that when the OSDs are just booting?
> > It's so stupid and scary. And I didn't even start any MDS.
> > 2- Why does booting take so long? Is it because of the Mimic upgrade or
> > something else?
> > 3- Will waiting for the OSDs to finish booting solve my problem, or
> > should I do something else?
> >
> > -----------------------------
> > ceph mon feature ls
> > all features
> > supported: [kraken,luminous,mimic,osdmap-prune]
> > persistent: [kraken,luminous,mimic,osdmap-prune]
> > on current monmap (epoch 10)
> > persistent: [kraken,luminous,mimic,osdmap-prune]
> > required: [kraken,luminous,mimic,osdmap-prune]
> >
> > ------------------------
> > ceph osd versions
> > {
> >     "ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b)
> > luminous (stable)": 50
> > }
> >
> > After all that, I'm leaving my cluster in this state. I will be back 8
> > hours later. I need a running system on Monday morning.
> > Help me, please.
> >



