Hello,

Could someone please help me complete my botched upgrade from Jewel 10.2.3-r1 to Luminous 12.2.1? I have 9 Gentoo servers, 4 of which have 2 OSDs each. My OSD servers were accidentally rebooted before the monitor servers, so the OSDs came up running Luminous before the monitors did. All services have been restarted, and "ceph versions" gives the following:

# ceph versions
2017-11-27 21:27:24.356940 7fed67efe700 -1 WARNING: the following dangerous and experimental features are enabled: btrfs
2017-11-27 21:27:24.368469 7fed67efe700 -1 WARNING: the following dangerous and experimental features are enabled: btrfs
{
    "mon": {
        "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 4
    },
    "mgr": {
        "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 3
    },
    "osd": {},
    "mds": {
        "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 1
    },
    "overall": {
        "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 8
    }
}

For some reason the OSDs do not report a version at all, and "ceph osd tree" shows all of the OSDs as down:

# ceph osd tree
2017-11-27 21:32:51.969335 7f483d9c2700 -1 WARNING: the following dangerous and experimental features are enabled: btrfs
2017-11-27 21:32:51.980976 7f483d9c2700 -1 WARNING: the following dangerous and experimental features are enabled: btrfs
ID CLASS WEIGHT   TYPE NAME              STATUS REWEIGHT PRI-AFF
-1       27.77998 root default
-3       27.77998     datacenter DC1
-6       27.77998         rack 1B06
-5        6.48000             host ceph3
 1        1.84000                 osd.1    down        0 1.00000
 3        4.64000                 osd.3    down        0 1.00000
-2        5.53999             host ceph4
 5        4.64000                 osd.5    down        0 1.00000
 8        0.89999                 osd.8    down        0 1.00000
-4        9.28000             host ceph6
 0        4.64000                 osd.0    down        0 1.00000
 2        4.64000                 osd.2    down        0 1.00000
-7        6.48000             host ceph7
 6        4.64000                 osd.6    down        0 1.00000
 7        1.84000                 osd.7    down        0 1.00000

The OSD logs all have this message:

20235 osdmap REQUIRE_JEWEL OSDMap flag is NOT set; please set it.
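For reference, the flags currently set on the osdmap can be checked directly with the following (a read-only check; the grep just trims the output down to the flags line):

# ceph osd dump | grep flags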
When I try to set it with "ceph osd set require_jewel_osds" I get this error:

Error EPERM: not all up OSDs have CEPH_FEATURE_SERVER_JEWEL feature

This seems odd, since "ceph features" shows all daemons, including the 8 OSDs, advertising luminous feature bits:

{
    "mon": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 4
        }
    },
    "mds": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 1
        }
    },
    "osd": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 8
        }
    },
    "client": {
        "group": {
            "features": "0x1ffddff8eea4fffb",
            "release": "luminous",
            "num": 3
        }
    }
}

# ceph tell osd.* versions
2017-11-28 02:29:28.565943 7f99c6aee700 -1 WARNING: the following dangerous and experimental features are enabled: btrfs
2017-11-28 02:29:28.578956 7f99c6aee700 -1 WARNING: the following dangerous and experimental features are enabled: btrfs
Error ENXIO: problem getting command descriptions from osd.0
osd.0: problem getting command descriptions from osd.0
Error ENXIO: problem getting command descriptions from osd.1
osd.1: problem getting command descriptions from osd.1
Error ENXIO: problem getting command descriptions from osd.2
osd.2: problem getting command descriptions from osd.2
Error ENXIO: problem getting command descriptions from osd.3
osd.3: problem getting command descriptions from osd.3
Error ENXIO: problem getting command descriptions from osd.5
osd.5: problem getting command descriptions from osd.5
Error ENXIO: problem getting command descriptions from osd.6
osd.6: problem getting command descriptions from osd.6
Error ENXIO: problem getting command descriptions from osd.7
osd.7: problem getting command descriptions from osd.7
Error ENXIO: problem getting command descriptions from osd.8
osd.8: problem getting command descriptions from osd.8

The daemons do respond locally over the admin socket, though; osd.1 reports that it is sitting in "preboot":

# ceph daemon osd.1 status
{
    "cluster_fsid": "CENSORED",
    "osd_fsid": "CENSORED",
    "whoami": 1,
    "state": "preboot",
    "oldest_map": 19482,
    "newest_map": 20235,
    "num_pgs": 141
}

# ceph -s
2017-11-27 22:04:10.372471 7f89a3935700 -1 WARNING: the following dangerous and experimental features are enabled: btrfs
2017-11-27 22:04:10.375709 7f89a3935700 -1 WARNING: the following dangerous and experimental features are enabled: btrfs
  cluster:
    id:     CENSORED
    health: HEALTH_ERR
            513 pgs are stuck inactive for more than 60 seconds
            126 pgs backfill_wait
            52 pgs backfilling
            435 pgs degraded
            513 pgs stale
            435 pgs stuck degraded
            513 pgs stuck stale
            435 pgs stuck unclean
            435 pgs stuck undersized
            435 pgs undersized
            recovery 854719/3688140 objects degraded (23.175%)
            recovery 838607/3688140 objects misplaced (22.738%)
            mds cluster is degraded
            crush map has straw_calc_version=0

  services:
    mon: 4 daemons, quorum 0,1,3,2
    mgr: 0(active), standbys: 1, 5
    mds: cephfs-1/1/1 up {0=a=up:replay}, 1 up:standby
    osd: 8 osds: 0 up, 0 in

  data:
    pools:   7 pools, 513 pgs
    objects: 1199k objects, 4510 GB
    usage:   13669 GB used, 15150 GB / 28876 GB avail
    pgs:     854719/3688140 objects degraded (23.175%)
             838607/3688140 objects misplaced (22.738%)
             257 stale+active+undersized+degraded
             126 stale+active+undersized+degraded+remapped+backfill_wait
             78  stale+active+clean
             52  stale+active+undersized+degraded+remapped+backfilling

I ran "ceph auth list", and client.admin has the following permissions:

    auid: 0
    caps: [mds] allow
    caps: [mgr] allow *
    caps: [mon] allow *
    caps: [osd] allow *

Is there any way I can get these OSDs to join the cluster now, or recover my data? Thank you for your time.
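One more data point, in case it helps: even though "ceph tell osd.* versions" fails while the OSDs are down, the daemons still answer on their local admin sockets (that is how I got the "preboot" status above). So I believe the installed version of each OSD can at least be confirmed with something like the following, run on the host that carries the OSD in question (osd.1 here is just an example):

# ceph daemon osd.1 version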
Cary
-Dynamic