Mismatching nonce for 'ceph osd.0 tell'

Hi

When running cephtool-test-mon.sh, part of it executes:
  ceph tell osd.0 version
I then see messages like the following on the command line. I guess this is
the OSD complaining that something is wrong:

2016-09-12 23:50:39.239037 814e50e00  0 -- 127.0.0.1:0/1925715881 >>
127.0.0.1:6800/26384 conn(0x814fde800 sd=18 :-1
s=STATE_CONNECTING_WAIT_BANNER_AND_IDENTIFY pgs=0 cs=0
l=1)._process_connection connect claims to be 127.0.0.1:6800/1026384 not
127.0.0.1:6800/26384 - wrong node!

The command keeps retrying like this until it is killed, after 3600 seconds.

The nonce is incremented by 1000000 on every rebind.
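
To make the arithmetic concrete, here is a small sketch that compares the two addresses from the log line. It is not Ceph code: the helper names are hypothetical, and the step of 1000000 is taken from the statement above rather than looked up in the source.

```python
# Hypothetical helpers, not part of Ceph. The step size is the value
# stated in this message and is assumed to be constant.
REBIND_STEP = 1000000

def parse_addr(addr):
    """Split 'ip:port/nonce' into (ip, port, nonce)."""
    hostport, nonce = addr.rsplit("/", 1)
    ip, port = hostport.rsplit(":", 1)
    return ip, int(port), int(nonce)

def rebinds_between(expected, claimed):
    """Return how many rebinds separate the two addresses,
    or None if they differ in anything other than a whole
    number of nonce steps."""
    e_ip, e_port, e_nonce = parse_addr(expected)
    c_ip, c_port, c_nonce = parse_addr(claimed)
    if (e_ip, e_port) != (c_ip, c_port):
        return None
    diff = c_nonce - e_nonce
    if diff < 0 or diff % REBIND_STEP != 0:
        return None
    return diff // REBIND_STEP

# Addresses copied from the log line: what the client expected,
# and what the peer claimed to be.
print(rebinds_between("127.0.0.1:6800/26384", "127.0.0.1:6800/1026384"))
```

For the two addresses in the log this prints 1, which is consistent with the OSD having rebound exactly once while the client still holds the address from before the rebind.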

But what I do not understand is how this mismatch occurred.
I would expect port 6800 to be the port the OSD is listening on. Judging by
the log, the connecting party (ceph in this case) still expects nonce 26384,
while the OSD identifies itself with 1026384. Did the MON supply this
information? And if so, where did the MON get it from?

Did one of the parties somewhere not receive the new nonce, or fail to
increment it as well?

Any suggestions on where to look are welcome.

Thanks,
--WjW