Re: Need some help/advice upgrading Hammer to Jewel - HEALTH_ERR shutting down OSD

David,

Thank you so much for your reply. I'm not entirely reassured, though. I do expect the PG states "degraded" and "undersized"; those should result in a HEALTH_WARN. What particularly worries me is the "stuck inactive" part. Please correct me if I'm wrong, but I was under the impression that a PG would only get into that state if all OSDs that have that PG mapped are down.
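
Before I touch anything on production I'll at least gather more detail; if I'm not mistaken, these commands should show exactly which PGs are inactive and what they are waiting on (<pgid> is just a placeholder for one of the reported PG ids):

    # per-PG detail for everything that is not HEALTH_OK
    ceph health detail

    # list only the PGs reported as stuck inactive
    ceph pg dump_stuck inactive

    # query one of the listed PGs to see what it is blocked on
    ceph pg <pgid> query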

Even if the cluster recovers immediately after upgrading and bringing the OSDs back up, I really wouldn't feel comfortable doing this while the cluster is online and in use.
I think I'll schedule downtime and do an offline upgrade instead, just to be safe. Nonetheless, I would really like to know what is wrong, either with this cluster or with my understanding of Ceph.
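
If I do go the offline route, my rough plan would be the usual noout approach (please correct me if this is not how you would do it):

    # keep the cluster from marking the stopped OSDs out and starting recovery
    ceph osd set noout

    # upgrade the packages and restart the OSDs one node at a time,
    # waiting for all PGs to be active+clean before moving on, then:
    ceph osd unset noout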

Below is the ceph -s output for my test environment. I would expect the production cluster to behave the same way. I also find it odd that the test setup didn't get the tunables warning; both clusters were initialized on a Hammer release, though probably not the exact same version.

     health HEALTH_WARN
            53 pgs degraded
            53 pgs stuck degraded
            67 pgs stuck unclean
            53 pgs stuck undersized
            53 pgs undersized
            recovery 28/423 objects degraded (6.619%)
     monmap e3: 3 mons at {mgm1=10.10.100.21:6789/0,mgm2=10.10.100.22:6789/0,mgm3=10.10.100.23:6789/0}
            election epoch 40, quorum 0,1,2 mgm1,mgm2,mgm3
     osdmap e163: 6 osds: 4 up, 4 in; 14 remapped pgs
      pgmap v2320: 96 pgs, 2 pools, 514 MB data, 141 objects
            1638 MB used, 100707 MB / 102345 MB avail
            28/423 objects degraded (6.619%)
                  53 active+undersized+degraded
                  29 active+clean
                  14 active+remapped

Kind regards,

Eric van Blokland

On Thu, Sep 28, 2017 at 3:02 AM, David Turner <drakonstein@xxxxxxxxx> wrote:

There are new PG states that cause HEALTH_ERR. In this case it is "undersized" that is causing this state.

While I decided to upgrade my tunables before upgrading the rest of my cluster, it does not seem to be a requirement. However, I would recommend upgrading them sooner rather than later. It will cause a fair amount of backfilling when you do it. If you are using krbd, don't upgrade your tunables past Hammer.
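
When you are ready to take the backfill, switching profiles should be a one-liner (hammer chosen here because of the krbd caveat above):

    # move the crush map to the hammer tunables profile; expect backfilling
    ceph osd crush tunables hammer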

In any case, you should feel safe continuing with your upgrade. You will definitely be safe finishing this first node, as you still have 2 copies of your data if anything goes awry. I would expect this first node to finish and the cluster to get back to a state where all backfilling is done, after which you can continue with the other nodes.
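
Between nodes I just watch the standard status output until everything is active+clean again, something like:

    # follow cluster events live while the node backfills
    ceph -w

    # or poll the summary until all PGs report active+clean
    ceph -s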


On Wed, Sep 27, 2017, 6:32 PM Eric van Blokland <ericvanblokland@xxxxxxxxx> wrote:
Hello,

I have run into an issue while upgrading a Ceph cluster from Hammer to Jewel on CentOS. It's a small cluster with 3 monitor servers and a modest 6 OSDs distributed over 3 servers.

I've upgraded the 3 monitors successfully to 10.2.7. They appear to be running fine except for this health warning: "crush map has legacy tunables (require bobtail, min is firefly)". While I might be completely underestimating the significance of this warning, it seemed pretty harmless to me and I decided to upgrade my OSDs (running 0.94.10) before touching the tunables.
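
For reference, I believe the monitor version and the current tunables profile can be checked with something like this (the ceph daemon command has to be run on the mon host itself):

    # confirm the monitor is really running the new version
    ceph daemon mon.mgm1 version

    # show which tunables profile the crush map currently uses
    ceph osd crush show-tunables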

However, as soon as I brought down the OSDs on the first storage server to start upgrading them, the cluster immediately went to a HEALTH_ERR status (see the ceph -s output below), which made me abort the upgrade process and just start the OSDs again.

Now, considering that my crushmap forces distribution of 3 copies over 3 servers, the cluster can't heal itself when I take those OSDs down, which would justify an error status. I'm worried, however, because my memory and my lab environment tell me that this situation should only give a health warning and only degraded PGs, not stuck/inactive ones (or did my lab environment not get the stuck PGs because they were not being addressed?).
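
If it helps, this is what I intend to double-check next, since as far as I know a PG should stay active as long as at least min_size copies are available (the pool name is just a placeholder):

    # replication settings per pool; a PG goes inactive (peered) once fewer
    # than min_size replicas remain up
    ceph osd pool get <poolname> size
    ceph osd pool get <poolname> min_size

    # the crush rule that spreads the 3 copies over the 3 servers
    ceph osd crush rule dump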

     health HEALTH_ERR
            199 pgs are stuck inactive for more than 300 seconds
            576 pgs degraded
            199 pgs stuck inactive
            238 pgs stuck unclean
            576 pgs undersized
            recovery 1415496/4246488 objects degraded (33.333%)
            2/6 in osds are down
            crush map has legacy tunables (require bobtail, min is firefly)
            election epoch 1650, quorum 0,1,2 mgm1,mgm2,mgm3
     osdmap e808: 6 osds: 4 up, 6 in; 576 remapped pgs
      pgmap v4309615: 576 pgs, 5 pools, 1483 GB data, 1382 kobjects
            4445 GB used, 7836 GB / 12281 GB avail
            1415496/4246488 objects degraded (33.333%)
                 512 undersized+degraded+peered
                  64 active+undersized+degraded

How should I proceed from here? Am I seeing ghosts, is the HEALTH_ERR status to be expected and should I just continue, or is something actually wrong here?

On a side note: the timer for the stuck inactive PGs reads 300 seconds instantly, right after shutting down the OSDs.
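
I assume that number comes from mon_pg_stuck_threshold, which as far as I know defaults to 300 seconds; something like this should confirm it (run on the mon host itself):

    # show the threshold the monitor uses for reporting "stuck" PGs
    ceph daemon mon.mgm1 config get mon_pg_stuck_threshold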

Any help would be greatly appreciated.

Kind regards,

Eric van Blokland
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
