Hello Team,
I have a 5-node cluster running Kraken 11.2.0 with an EC 4+1 pool.
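(For context, the EC pool was created with a standard 4+1 profile, roughly like the sketch below; the profile and pool names and the failure domain are only my illustration, not copied from the cluster:)
# ceph osd erasure-code-profile set ec41 k=4 m=1 ruleset-failure-domain=host
# ceph osd pool create ecpool 2048 2048 erasure ec41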
My plan is to upgrade all 5 nodes to 12.2.2 Luminous without any downtime. I tried the procedure below on the first node.
I commented out the following directive in ceph.conf:
enable experimental unrecoverable data corrupting features = bluestore rocksdb
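For reference, the change in ceph.conf was just commenting that line out, roughly as below (I am assuming it lives under [global]):

[global]
# bluestore is stable in Luminous, so this experimental flag should no longer be needed
#enable experimental unrecoverable data corrupting features = bluestore rocksdb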
Then I started and enabled ceph-mgr, and rebooted the node.
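The ceph-mgr part was the usual manual setup, roughly the commands below (the daemon name matches the hostname PL8-CN1, and the keyring path assumes a cluster named "ceph", so treat this as a sketch):

# mkdir -p /var/lib/ceph/mgr/ceph-PL8-CN1
# ceph auth get-or-create mgr.PL8-CN1 mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o /var/lib/ceph/mgr/ceph-PL8-CN1/keyring
# chown -R ceph:ceph /var/lib/ceph/mgr/ceph-PL8-CN1
# systemctl enable ceph-mgr@PL8-CN1
# systemctl start ceph-mgr@PL8-CN1

After the reboot, this is what ceph -s shows: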
## ceph -s
    cluster b2f1b9b9-eecc-4c17-8b92-cfa60b31c121
     health HEALTH_WARN
            2048 pgs degraded
            2048 pgs stuck degraded
            2048 pgs stuck unclean
            2048 pgs stuck undersized
            2048 pgs undersized
            recovery 1091151/1592070 objects degraded (68.537%)
            24/120 in osds are down
     monmap e2: 5 mons at {PL8-CN1=10.50.11.41:6789/0,PL8-CN2=10.50.11.42:6789/0,PL8-CN3=10.50.11.43:6789/0,PL8-CN4=10.50.11.44:6789/0,PL8-CN5=10.50.11.45:6789/0}
            election epoch 18, quorum 0,1,2,3,4 PL8-CN1,PL8-CN2,PL8-CN3,PL8-CN4,PL8-CN5
        mgr active: PL8-CN1
     osdmap e243: 120 osds: 96 up, 120 in; 2048 remapped pgs
            flags sortbitwise,require_jewel_osds,require_kraken_osds
      pgmap v1099: 2048 pgs, 1 pools, 84304 MB data, 310 kobjects
            105 GB used, 436 TB / 436 TB avail
            1091151/1592070 objects degraded (68.537%)
                2048 active+undersized+degraded
  client io 107 MB/s wr, 0 op/s rd, 860 op/s wr
After the reboot, all 24 OSDs on the first node are marked down, even though all 24 ceph-osd processes are still running:
#ps -ef | grep -c ceph-osd
24
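If more diagnostics would help, I can collect something like the following from one of the down OSDs (osd.0 below is only a placeholder id, not necessarily one of the affected OSDs):

# systemctl status ceph-osd@0
# journalctl -u ceph-osd@0 --since "10 minutes ago"
# ceph daemon osd.0 status        (admin socket query, run locally on the node)
# ceph osd tree | grep down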
If I run the same procedure on all 5 nodes in parallel and reboot them together, the cluster comes back up without any issues. But doing it in parallel means taking downtime, which management will not accept at the moment. Please help and share your views.
I have read the upgrade section of https://ceph.com/releases/v12-2-0-luminous-released/, but it did not help in this case.
So my question is: what is the best way to upgrade the cluster node by node, without any downtime?
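For context, the node-at-a-time sequence I have in mind is roughly the one below; this is just my reading of the usual rolling-upgrade practice, so please correct me if something extra is needed between Kraken and Luminous:

# ceph osd set noout
  (upgrade packages on one node, adjust ceph.conf, reboot, wait for its OSDs to rejoin)
# ceph osd unset noout
  (confirm the cluster is healthy again, then move on to the next node)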
Thanks