Hi,

On Thu, Dec 9, 2021 at 6:44 PM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
>
> Hi Dan & Patrick
>
> Setting that to true using "ceph config" didn't seem to work. I then
> deleted it from there and set it in ceph.conf on node1 and eventually
> after a reboot it started ok. I don't know for sure whether it failing
> using ceph config was real or just a symptom of something else.
>
> I'll do the same (using ceph.conf) on the other nodes now.

Indeed, for a mon that is already asserting, you have confirmed that
it needs to be set in ceph.conf (otherwise it asserts before reading
the config map).
The other approach -- ceph config set mon ... --- should still work in
general, provided it is done before the upgrade begins. You can see
how cephadm does this here:
https://github.com/ceph/ceph/commit/753fd2fb32196d17e186152e7deaef1e0558b781

> Btw, I can't actually see any release notes other than the highlights in
> the earlier posting (and 16.2.7 doesn't show up on the web site list of
> releases yet). Is there anything else that I would need to know?

The Release Notes PR is here: https://github.com/ceph/ceph/pull/44131
See my comment at the bottom.

Thanks for catching this!

Cheers, Dan

>
> Thanks for your very fast responses!
> Chris
>
> On 09/12/2021 17:10, Dan van der Ster wrote:
> > On Thu, Dec 9, 2021 at 5:40 PM Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
> >> Hi Chris,
> >>
> >> On Thu, Dec 9, 2021 at 10:40 AM Chris Palmer <chris.palmer@xxxxxxxxx> wrote:
> >>> Hi
> >>>
> >>> I've just started an upgrade of a test cluster from 16.2.6 -> 16.2.7 and
> >>> immediately hit a problem.
> >>>
> >>> The cluster started as octopus, and has upgraded through to 16.2.6
> >>> without any trouble. It is a conventional deployment on Debian 10, NOT
> >>> using cephadm. All was clean before the upgrade.
> >>> It contains nodes as follows:
> >>> - Node 1: MON, MGR, MDS, RGW
> >>> - Node 2: MON, MGR, MDS, RGW
> >>> - Node 3: MON
> >>> - Node 4-6: OSDs
> >>>
> >>> In the absence of any specific upgrade instructions for 16.2.7, I
> >>> upgraded Node 1 and rebooted. The MON on that host will now not start,
> >>> throwing the following assertion:
> >>>
> >>> 2021-12-09T14:56:40.098+00:00 xxxxtstmon01 ceph-mon[960]: /build/ceph-16.2.7/src/mds/FSMap.cc: In function 'void FSMap::sanity(bool) const' thread 7f2d309085c0 time 2021-12-09T14:56:40.098395+0000
> >>> 2021-12-09T14:56:40.098+00:00 xxxxtstmon01 ceph-mon[960]: /build/ceph-16.2.7/src/mds/FSMap.cc: 868: FAILED ceph_assert(info.compat.writeable(fs->mds_map.compat))
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x7f2d3222423c]
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 2: /usr/lib/ceph/libceph-common.so.2(+0x277414) [0x7f2d32224414]
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 3: (FSMap::sanity(bool) const+0x2a8) [0x7f2d327331c8]
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 4: (MDSMonitor::update_from_paxos(bool*)+0x396) [0x55a32fe6b546]
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 5: (PaxosService::refresh(bool*)+0x10a) [0x55a32fd960ca]
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 6: (Monitor::refresh_from_paxos(bool*)+0x17c) [0x55a32fc54bec]
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 7: (Monitor::init_paxos()+0xfc) [0x55a32fc54e9c]
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 8: (Monitor::preinit()+0xbb9) [0x55a32fc7eb09]
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 9: main()
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 10: __libc_start_main()
> >>> 2021-12-09T14:56:40.103+00:00 xxxxtstmon01 ceph-mon[960]: 11: _start()
> >>>
> >>> ceph health detail merely shows mon01 down, and the 5 crashes before the service stopped auto-restarting.
> >> Please disable mon_mds_skip_sanity in the mons ceph.conf:
> >>
> >> [mon]
> >> mon_mds_skip_sanity = false
> > Oops, I think you meant mon_mds_skip_sanity = true
> >
> > Chris does that allow that mon to startup?
> >
> > -- dan
> >
> >
> >
> >> The cephadm upgrade sequence is already doing this but I forgot
> >> (sorry!) to mention this is required for manual upgrades in the
> >> release notes.
> >>
> >> Please re-enable after the upgrade completes and the cluster is stable.
> >>
> >> --
> >> Patrick Donnelly, Ph.D.
> >> He / Him / His
> >> Principal Software Engineer
> >> Red Hat, Inc.
> >> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
> >>
> >> _______________________________________________
> >> ceph-users mailing list -- ceph-users@xxxxxxx
> >> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
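
Summary sketch of the workaround discussed in this thread, for a manual (non-cephadm) 16.2.6 -> 16.2.7 upgrade. It is a reading of the messages above, not an official procedure. The ceph.conf value is the corrected one (true, not false). The two commands in the comments are assumptions: the thread only writes "ceph config set mon ..." and says to "re-enable" the check afterwards, so the full option name and value on the first command and the "ceph config rm" cleanup are filled in here for illustration and should be checked against the release notes PR linked above.

```ini
# ceph.conf on each mon host. Needed for a mon that is already asserting,
# because it asserts before it can read the config map.
[mon]
mon_mds_skip_sanity = true

# Assumed alternative when no mon is asserting yet; must be run BEFORE the
# upgrade begins (the full command is not spelled out in the thread):
#   ceph config set mon mon_mds_skip_sanity true
#
# After the upgrade completes and the cluster is stable, re-enable the sanity
# check by removing the line above from ceph.conf, or (assumed cleanup for
# the "ceph config" route):
#   ceph config rm mon mon_mds_skip_sanity
```

As reported in the thread, a mon only picked up the ceph.conf setting after it was restarted, so plan a mon restart on each host after editing the file.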