On Friday 16 February 2018 00:21:34 CET Gregory Farnum wrote:
> If mon.0 is not connected to the cluster, the monitor version report won’t
> update — how could it?
>
> So you need to figure out why that’s not working. A monitor that’s running
> but isn’t part of the active set is not good.
Obviously, I checked the versions after the monitor was restarted. The steps I followed were:
- `ceph tell mon.* version`
- Stop mon.0
- Verify mon.0 is actually stopped using `ps uaxwww` and `ceph -s` on other monitors
- Start mon.0
- `ceph tell mon.* version`
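The verification step above could also check quorum membership directly. What follows is a minimal sketch, not something I ran on this cluster: it assumes the JSON printed by `ceph quorum_status --format json` contains a `quorum_names` list, and the sample document and monitor names in it are purely illustrative.

```python
import json


def mon_in_quorum(quorum_status_json: str, mon_name: str) -> bool:
    """Return True if mon_name is listed in the quorum_names field.

    Assumes the input is the JSON text from `ceph quorum_status --format json`.
    A missing quorum_names field is treated as "not in quorum".
    """
    status = json.loads(quorum_status_json)
    return mon_name in status.get("quorum_names", [])


# Illustrative sample only (hypothetical monitor names, not taken from the cluster).
sample = '{"quorum": [0, 2], "quorum_names": ["1", "2"]}'

in_quorum_1 = mon_in_quorum(sample, "1")
in_quorum_0 = mon_in_quorum(sample, "0")
```

Running the same check before stopping mon.0 and again after starting it would show whether the monitor ever rejoins the active set, independent of the version report.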
It's not too surprising that various OSD versions are running: the box has a high uptime and was upgraded in between adding new OSDs. The md5sum of /usr/bin/ceph-mon, which is the daemon that is running, is identical on all machines.
See the attachment for the log from the moment of the restart.
--
Kerio Operator in de Cloud? https://www.kerioindecloud.nl/
Mark Schouten | Tuxis Internet Engineering
KvK: 61527076 | http://www.tuxis.nl/
T: 0318 200208 | info@xxxxxxxx |
2018-02-14 03:45:02.142186 7ff8d3348700 0 quorum service shutdown
2018-02-14 03:45:38.928910 7fada75a8880 0 ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af), process ceph-mon, pid 55243
2018-02-14 03:45:38.979708 7fada75a8880 0 starting mon.0 rank 1 at 192.168.100.2:6789/0 mon_data /var/lib/ceph/mon/ceph-0 fsid 36dab522-b24d-44db-8df8-e7aa6b901f66
2018-02-14 03:45:38.979958 7fada75a8880 1 mon.0@-1(probing) e3 preinit fsid 36dab522-b24d-44db-8df8-e7aa6b901f66
2018-02-14 03:45:38.980313 7fada75a8880 1 mon.0@-1(probing).paxosservice(pgmap 59142586..59143174) refresh upgraded, format 0 -> 1
2018-02-14 03:45:38.980321 7fada75a8880 1 mon.0@-1(probing).pg v0 on_upgrade discarding in-core PGMap
2018-02-14 03:45:38.985070 7fada75a8880 0 mon.0@-1(probing).mds e1 print_map
epoch 1
flags 0
created 0.000000
modified 2016-01-20 11:09:14.572361
tableserver 0
root 0
session_timeout 0
session_autoclose 0
max_file_size 0
last_failure 0
last_failure_osd_epoch 0
compat compat={},rocompat={},incompat={}
max_mds 0
in
up {}
failed
stopped
data_pools
metadata_pool 0
inline_data disabled
2018-02-14 03:45:38.985331 7fada75a8880 0 mon.0@-1(probing).osd e4020 crush map has features 1107558400, adjusting msgr requires
2018-02-14 03:45:38.985339 7fada75a8880 0 mon.0@-1(probing).osd e4020 crush map has features 1107558400, adjusting msgr requires
2018-02-14 03:45:38.985342 7fada75a8880 0 mon.0@-1(probing).osd e4020 crush map has features 1107558400, adjusting msgr requires
2018-02-14 03:45:38.985344 7fada75a8880 0 mon.0@-1(probing).osd e4020 crush map has features 1107558400, adjusting msgr requires
2018-02-14 03:45:38.985731 7fada75a8880 1 mon.0@-1(probing).paxosservice(auth 19251..19482) refresh upgraded, format 0 -> 1
2018-02-14 03:45:38.986735 7fada75a8880 0 mon.0@-1(probing) e3 my rank is now 1 (was -1)
2018-02-14 03:45:38.988799 7fad9ce13700 0 -- 192.168.100.2:6789/0 >> 192.168.100.3:6789/0 pipe(0x3d26000 sd=22 :44448 s=2 pgs=776395 cs=1 l=0 c=0x3957a20).reader missed message? skipped from seq 0 to 1886081844
2018-02-14 03:45:38.989156 7fad9cd12700 0 -- 192.168.100.2:6789/0 >> 192.168.100.1:6789/0 pipe(0x4132000 sd=17 :56104 s=2 pgs=18652390 cs=1 l=0 c=0x3957760).reader missed message? skipped from seq 0 to 1847147828
2018-02-14 03:45:38.992068 7fada28a1700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2018-02-14 03:45:38.992152 7fada28a1700 0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
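In the excerpt above, mon.0 only ever appears as `mon.0@-1(probing)`. To follow the monitor's state over a longer log, something like the sketch below could pull out each `(rank, state)` transition. The regular expression is my own guess at the `mon.<name>@<rank>(<state>)` prefix based on these lines, not an official log format, and the function name is hypothetical.

```python
import re

# Matches prefixes such as "mon.0@-1(probing)"; pattern inferred from the log above.
MON_STATE = re.compile(r"mon\.(?P<name>[^@\s]+)@(?P<rank>-?\d+)\((?P<state>[a-z]+)\)")


def mon_states(lines):
    """Return the ordered (rank, state) pairs seen, collapsing consecutive repeats."""
    seen = []
    for line in lines:
        match = MON_STATE.search(line)
        if match is None:
            continue  # e.g. messenger or audit lines without a mon prefix
        pair = (int(match.group("rank")), match.group("state"))
        if not seen or seen[-1] != pair:
            seen.append(pair)
    return seen


# Three lines copied from the log above.
log = [
    "2018-02-14 03:45:38.979708 7fada75a8880 0 starting mon.0 rank 1 at 192.168.100.2:6789/0 mon_data /var/lib/ceph/mon/ceph-0 fsid 36dab522-b24d-44db-8df8-e7aa6b901f66",
    "2018-02-14 03:45:38.979958 7fada75a8880 1 mon.0@-1(probing) e3 preinit fsid 36dab522-b24d-44db-8df8-e7aa6b901f66",
    "2018-02-14 03:45:38.986735 7fada75a8880 0 mon.0@-1(probing) e3 my rank is now 1 (was -1)",
]

states = mon_states(log)
```

For these lines the only state found is probing at rank -1; a monitor that had rejoined the quorum would be expected to show a later entry such as peon or leader.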
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com