Issue adding mon after upgrade to 15.2.2

In our lab setup, I'm simulating the future migration from CentOS 7 + Ceph
14.2.x to CentOS 8 + Ceph 15.2.x.
So far I have upgraded one of the nodes, a combined mon+mgr+mds+osd, to EL8 +
15.2.2. The other node (also a combined one) is still on EL7 + 14.2.9.
The OSD was detected and re-added easily, as were the mgr and mds. The mon,
however, won't re-add.
I use the following (manual) method (a consolidated script is sketched after
the list):

   1. ceph auth get mon. -o /tmp/kr  (exports correctly)
   2. ceph mon getmap -o /tmp/mm  (exports correctly)
   3. sudo ceph-mon -i ontw-ceph01 --mkfs --monmap /tmp/mm --keyring /tmp/kr  (creates the file system + RocksDB correctly in "/var/lib/ceph/mon/ceph-ontw-ceph01")
   4. systemctl start ceph-mon@ontw-ceph01.service
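
For completeness, the same sequence written out as one script on the new node
looks roughly like this. The monmaptool check and the chown are extra sanity
steps on my side (assuming the systemd unit runs as the ceph user), not part
of the four steps above:
--------------
# export the mon keyring and the current monmap from the running cluster
ceph auth get mon. -o /tmp/kr
ceph mon getmap -o /tmp/mm

# sanity check: which mons does the exported map contain?
monmaptool --print /tmp/mm

# build a fresh mon store for this node from the exported map + keyring
sudo ceph-mon -i ontw-ceph01 --mkfs --monmap /tmp/mm --keyring /tmp/kr

# extra step (my assumption): make sure the mon data dir is owned by the
# ceph user the unit runs as
sudo chown -R ceph:ceph /var/lib/ceph/mon/ceph-ontw-ceph01

# start the monitor
sudo systemctl start ceph-mon@ontw-ceph01.service
--------------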

ceph-mon starts up successfully, but won't join the cluster.
In the other (14.2.9) ceph-mon's log, I see loads of this message (multiple
per second):
--------------
2020-05-25 13:07:38.075 7fb38366d700  1 mon.ontw-ceph02@0(leader) e3 adding peer [v2:192.168.100.60:3300/0,v1:192.168.100.60:6789/0] to list of hints
2020-05-25 13:07:38.075 7fb38366d700  1 mon.ontw-ceph02@0(leader) e3 adding peer [v2:192.168.100.60:3300/0,v1:192.168.100.60:6789/0] to list of hints
2020-05-25 13:07:38.076 7fb38366d700  1 mon.ontw-ceph02@0(leader) e3 adding peer [v2:192.168.100.60:3300/0,v1:192.168.100.60:6789/0] to list of hints
2020-05-25 13:07:38.076 7fb38366d700  1 mon.ontw-ceph02@0(leader) e3 adding peer [v2:192.168.100.60:3300/0,v1:192.168.100.60:6789/0] to list of hints
2020-05-25 13:07:38.077 7fb38366d700  1 mon.ontw-ceph02@0(leader) e3 adding peer [v2:192.168.100.60:3300/0,v1:192.168.100.60:6789/0] to list of hints
etc...
--------------
Ports 3300 and 6789 are open and reachable on both nodes, and the two nodes
can connect to each other (see the sketch below for the kind of checks I mean).
192.168.100.60 = ontw-ceph01, the upgraded 15.2.2 node
192.168.100.61 = ontw-ceph02, the active 14.2.9 node
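
To be concrete, the kind of checks I mean are just nc for the ports plus the
standard mon commands on the active node to see what it has in its map:
--------------
# run on ontw-ceph02 (the active 14.2.9 mon)

# which mons are in the cluster's current monmap, and who is in quorum?
ceph mon dump
ceph mon stat
ceph quorum_status

# are both messenger ports on the upgraded node reachable from here?
nc -zv 192.168.100.60 3300
nc -zv 192.168.100.60 6789
--------------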

When I start ceph-mon in debug mode on ontw-ceph01, I see the following:
--------------
2020-05-25T13:07:35.031+0200 7f712e1d46c0  0 starting mon.ontw-ceph01 rank
-1 at public addrs [v2:192.168.100.60:3300/0,v1:192.168.100.60:6789/0] at
bind addrs [v2:192.168.100.60:3300/0,v1:192.168.100.60:6789/0] mon_data
/var/lib/ceph/mon/ceph-ontw-ceph01 fsid f3c3f099-2940-4074-a7fe-1aea6259f67b
2020-05-25T13:07:35.033+0200 7f712e1d46c0  1 mon.ontw-ceph01@-1(???) e1
preinit fsid f3c3f099-2940-4074-a7fe-1aea6259f67b
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
check_fsid cluster_uuid contains 'f3c3f099-2940-4074-a7fe-1aea6259f67b'
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
features compat={},rocompat={},incompat={1=initial feature set
(~v.18),3=single paxos with k/v store (v0.?)}
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
calc_quorum_requirements required_features 2449958197560098820
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
required_features 2449958197560098820
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
has_ever_joined = 0
2020-05-25T13:07:35.033+0200 7f712e1d46c0  1 mon.ontw-ceph01@-1(???) e1
 initial_members ontw-ceph02,ontw-ceph01, filtering seed monmap
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
 monmap is e1: 2 mons at {ontw-ceph01=,ontw-ceph02=[v2:
192.168.100.61:3300/0,v1:192.168.100.61:6789/0]}
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
 extra probe peers
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
sync_last_committed_floor 0
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
init_paxos
2020-05-25T13:07:35.033+0200 7f712e1d46c0  5 mon.ontw-ceph01@-1(???).mds e0
Unable to load 'last_metadata'
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).health
init
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).config
init
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
refresh_from_paxos
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
refresh_from_paxos no cluster_fingerprint
2020-05-25T13:07:35.033+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).log v0
update_from_paxos
2020-05-25T13:07:35.034+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).log v0
update_from_paxos version 0 summary v 0
2020-05-25T13:07:35.034+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).auth
v0 update_from_paxos
2020-05-25T13:07:35.034+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).config
load_config got 0 keys
2020-05-25T13:07:35.034+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).mgrstat
 0
2020-05-25T13:07:35.034+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).mgrstat
check_subs
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).health
update_from_paxos
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???) e1
loading initial keyring to bootstrap authentication for mkfs
2020-05-25T13:07:35.035+0200 7f712e1d46c0  2 mon.ontw-ceph01@-1(???) e1 init
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(???).mgr e0
prime_mgr_client
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
bootstrap
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
sync_reset_requester
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
unregister_cluster_logger - not registered
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
cancel_probe_timeout (none scheduled)
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
monmap e1: 2 mons at {ontw-ceph01=,ontw-ceph02=[v2:
192.168.100.61:3300/0,v1:192.168.100.61:6789/0]}
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
_reset
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing).auth
v0 _set_mon_num_rank num 0 rank 0
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
cancel_probe_timeout (none scheduled)
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
timecheck_finish
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
scrub_event_cancel
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
scrub_reset
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
cancel_probe_timeout (none scheduled)
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
reset_probe_timeout 0x563148b1eb60 after 2 seconds
2020-05-25T13:07:35.035+0200 7f712e1d46c0 10 mon.ontw-ceph01@-1(probing) e1
probing other monitors
2020-05-25T13:07:35.035+0200 7f712e1d46c0  0 -- [v2:
192.168.100.60:3300/0,v1:192.168.100.60:6789/0] send_to message
mon_probe(probe f3c3f099-2940-4074-a7fe-1aea6259f67b name ontw-ceph01 new
mon_release octopus) v7 with empty dest
2020-05-25T13:07:35.036+0200 7f711bb0b700 10 mon.ontw-ceph01@-1(probing) e1
get_authorizer for mon
2020-05-25T13:07:35.238+0200 7f711bb0b700 10 mon.ontw-ceph01@-1(probing) e1
get_authorizer for mon
2020-05-25T13:07:35.640+0200 7f711bb0b700 10 mon.ontw-ceph01@-1(probing) e1
get_authorizer for mon
2020-05-25T13:07:36.442+0200 7f711bb0b700 10 mon.ontw-ceph01@-1(probing) e1
get_authorizer for mon
2020-05-25T13:07:37.035+0200 7f711a308700  4 mon.ontw-ceph01@-1(probing) e1
probe_timeout 0x563148b1eb60
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
bootstrap
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
sync_reset_requester
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
unregister_cluster_logger - not registered
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
cancel_probe_timeout (none scheduled)
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
monmap e1: 2 mons at {ontw-ceph01=,ontw-ceph02=[v2:
192.168.100.61:3300/0,v1:192.168.100.61:6789/0]}
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
_reset
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing).auth
v0 _set_mon_num_rank num 0 rank 0
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
cancel_probe_timeout (none scheduled)
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
timecheck_finish
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
scrub_event_cancel
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
scrub_reset
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
cancel_probe_timeout (none scheduled)
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
reset_probe_timeout 0x563148b1f7c0 after 2 seconds
2020-05-25T13:07:37.035+0200 7f711a308700 10 mon.ontw-ceph01@-1(probing) e1
probing other monitors
2020-05-25T13:07:37.035+0200 7f711a308700  0 -- [v2:
192.168.100.60:3300/0,v1:192.168.100.60:6789/0] send_to message
mon_probe(probe f3c3f099-2940-4074-a7fe-1aea6259f67b name ontw-ceph01 new
mon_release octopus) v7 with empty dest
2020-05-25T13:07:38.045+0200 7f711bb0b700 10 mon.ontw-ceph01@-1(probing) e1
get_authorizer for mon
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e1
_ms_dispatch new session 0x563149893680 MonSession(mon.0 [v2:
192.168.100.61:3300/0,v1:192.168.100.61:6789/0] is open , features
0x3f01cfb8ffacffff (luminous)) features 0x3f01cfb8ffacffff
2020-05-25T13:07:38.047+0200 7f7117b03700  5 mon.ontw-ceph01@-1(probing) e1
_ms_dispatch setting monitor caps on this connection
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e1
handle_probe mon_probe(reply f3c3f099-2940-4074-a7fe-1aea6259f67b name
ontw-ceph02 quorum 0 paxos( fc 304966 lc 305658 ) mon_release nautilus) v7
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e1
handle_probe_reply mon.0 v2:192.168.100.61:3300/0 mon_probe(reply
f3c3f099-2940-4074-a7fe-1aea6259f67b name ontw-ceph02 quorum 0 paxos( fc
304966 lc 305658 ) mon_release nautilus) v7
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e1
 monmap is e1: 2 mons at {ontw-ceph01=,ontw-ceph02=[v2:
192.168.100.61:3300/0,v1:192.168.100.61:6789/0]}
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e1
 got newer/committed monmap epoch 3, mine was 1
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
bootstrap
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
sync_reset_requester
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
unregister_cluster_logger - not registered
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
cancel_probe_timeout 0x563148b1f7c0
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
monmap e3: 1 mons at {ontw-ceph02=[v2:
192.168.100.61:3300/0,v1:192.168.100.61:6789/0]}
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
_reset
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing).auth
v0 _set_mon_num_rank num 0 rank 0
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
cancel_probe_timeout (none scheduled)
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
timecheck_finish
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
scrub_event_cancel
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
scrub_reset
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
cancel_probe_timeout (none scheduled)
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
reset_probe_timeout 0x563148b1f7c0 after 2 seconds
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
probing other monitors
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
handle_probe mon_probe(reply f3c3f099-2940-4074-a7fe-1aea6259f67b name
ontw-ceph02 quorum 0 paxos( fc 304966 lc 305658 ) mon_release nautilus) v7
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
handle_probe_reply mon.0 v2:192.168.100.61:3300/0 mon_probe(reply
f3c3f099-2940-4074-a7fe-1aea6259f67b name ontw-ceph02 quorum 0 paxos( fc
304966 lc 305658 ) mon_release nautilus) v7
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
 monmap is e3: 1 mons at {ontw-ceph02=[v2:
192.168.100.61:3300/0,v1:192.168.100.61:6789/0]}
2020-05-25T13:07:38.047+0200 7f7117b03700 10 mon.ontw-ceph01@-1(probing) e3
 got newer/committed monmap epoch 3, mine was 3
--------------
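
To cross-check what each side has, my plan for further debugging is simply to
dump the monmap from the new mon's local store and compare it with the mon's
own runtime view, using the standard tools:
--------------
# on ontw-ceph01, with the daemon stopped: pull the monmap out of the local store
sudo systemctl stop ceph-mon@ontw-ceph01.service
sudo ceph-mon -i ontw-ceph01 --extract-monmap /tmp/local-mm
monmaptool --print /tmp/local-mm

# alternatively, while the daemon is running: its own view via the admin socket
sudo ceph daemon mon.ontw-ceph01 mon_status
--------------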

I'm a bit lost as to what is (not) happening. I hope you can help!
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


