Thanks Greg, we were testing on the old version and already plan to update... About the assert: I'm puzzled as to how the session_key could become empty. The connection had just finished `verify_authorizer`, in which we (the accepter) got the session_key from the peer's authorizer_bl and encrypted the auth_reply with that session_key... so how could the session_key be missing when we `check_message_signature` on read?

2017-09-06 1:55 GMT+08:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
> On Tue, Sep 5, 2017 at 7:19 AM, zengran zhang <z13121369189@xxxxxxxxx> wrote:
>> Hi,
>>
>> In our cluster we have found a rare and odd assert in the log, which
>> says that the session key was empty. It happens only once every few
>> days, and only when an OSD has just booted up...
>> Is it possible that `CephxAuthorizeHandler::verify_authorizer`
>> returns valid but the `session_key` is null? Log as follows:
>>
>> 2017-09-05 10:11:00.954356 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG pgs=0 cs=0 l=0).process prev state >> is STATE_ACCEPTING_WAIT_BANNER_ADDR >> 2017-09-05 10:11:00.954361 7f898bc3c700 10 Event(0x55c4fff222c0 >> nevent=5000 time_id=1).process_events wait second 30 usec 0 >> 2017-09-05 10:11:00.954363 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG pgs=0 cs=0 l=0).read_until len is >> 33 state_offset is 0 >> 2017-09-05 10:11:00.954372 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG pgs=0 cs=0 l=0).read_until >> read_bulk recv_end is 0 left is 33 got 207 >> 2017-09-05 10:11:00.954466 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).process prev >> state is STATE_ACCEPTING_WAIT_CONNECT_MSG >> 2017-09-05 10:11:00.954488 7f898c43d700
25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).read_until len >> is 174 state_offset is 0 >> 2017-09-05 10:11:00.954495 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).read_until got >> 174 in buffer left is 0 buffer still has 0 >> 2017-09-05 10:11:00.954502 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 >> l=0)._process_connection accept got peer connect_seq 0 global_seq 7 >> 2017-09-05 10:11:00.954509 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 >> l=1)._process_connection accept of host_type 4, policy.lossy=1 >> policy.server=1 policy.standby=0 policy.resetcheck=0 >> 2017-09-05 10:11:00.954525 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 >> l=1).handle_connect_msg accept my proto 10, their proto 10 >> 2017-09-05 10:11:00.954535 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 >> l=1).handle_connect_msg accept setting up session_security. 
>> 2017-09-05 10:11:00.954542 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 >> l=1).handle_connect_msg accept new session >> 2017-09-05 10:11:00.954547 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=7 cs=1 >> l=1).handle_connect_msg accept success, connect_seq = 1 in_seq=0, >> sending READY >> 2017-09-05 10:11:00.954572 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=7 cs=1 >> l=1).handle_connect_msg accept features 1152323339925389307 >> 2017-09-05 10:11:00.954632 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=7 cs=1 l=1)._try_send sent >> bytes 34 remaining bytes 0 >> 2017-09-05 10:11:00.954667 7f898c43d700 2 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_SEQ pgs=7 cs=1 l=1).handle_connect_msg accept >> write reply msg done >> 2017-09-05 10:11:00.954682 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_SEQ pgs=7 cs=1 l=1).process prev state is >> STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH >> 2017-09-05 10:11:00.954687 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_SEQ pgs=7 cs=1 l=1).read_until len is 8 >> state_offset is 0 >> 2017-09-05 10:11:00.954708 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_SEQ pgs=7 cs=1 l=1).read_until read_bulk >> recv_end is 0 left is 8 got 8 >> 2017-09-05 10:11:00.954724 7f898c43d700 2 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> 
s=STATE_ACCEPTING_WAIT_SEQ pgs=7 cs=1 l=1)._process_connection accept >> get newly_acked_seq 0 >> 2017-09-05 10:11:00.954730 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_WAIT_SEQ pgs=7 cs=1 l=1).discard_requeued_up_to 0 >> 2017-09-05 10:11:00.954736 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_READY pgs=7 cs=1 l=1).process prev state is >> STATE_ACCEPTING_WAIT_SEQ >> 2017-09-05 10:11:00.954741 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_ACCEPTING_READY pgs=7 cs=1 l=1)._process_connection accept >> done >> 2017-09-05 10:11:00.954746 7f898c43d700 10 Event(0x55c4fff22080 >> nevent=5000 time_id=2).create_time_event id=1 trigger after >> 900000000us >> 2017-09-05 10:11:00.954759 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 s=STATE_OPEN pgs=7 >> cs=1 l=1).process prev state is STATE_ACCEPTING_READY >> 2017-09-05 10:11:00.954765 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 s=STATE_OPEN pgs=7 >> cs=1 l=1).read_until len is 1 state_offset is 0 >> 2017-09-05 10:11:00.954772 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 s=STATE_OPEN pgs=7 >> cs=1 l=1).read_until read_bulk recv_end is 0 left is 1 got 0 >> 2017-09-05 10:11:00.954777 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 s=STATE_OPEN pgs=7 >> cs=1 l=1).read_until need len 1 remaining 1 bytes >> 2017-09-05 10:11:00.954784 7f898c43d700 30 stack operator() calling >> event process >> 2017-09-05 10:11:00.954786 7f898c43d700 10 Event(0x55c4fff22080 >> nevent=5000 time_id=2).process_events wait second 30 usec 0 >> 2017-09-05 10:11:00.955445 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 
:6805 s=STATE_OPEN pgs=7 >> cs=1 l=1).process prev state is STATE_OPEN >> 2017-09-05 10:11:00.955462 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 s=STATE_OPEN pgs=7 >> cs=1 l=1).read_until len is 1 state_offset is 0 >> 2017-09-05 10:11:00.955472 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 s=STATE_OPEN pgs=7 >> cs=1 l=1).read_until read_bulk recv_end is 0 left is 1 got 4096 >> 2017-09-05 10:11:00.955478 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_HEADER pgs=7 cs=1 l=1).process prev state is >> STATE_OPEN >> 2017-09-05 10:11:00.955483 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_HEADER pgs=7 cs=1 l=1).process begin MSG >> 2017-09-05 10:11:00.955488 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_HEADER pgs=7 cs=1 l=1).read_until len is 53 >> state_offset is 0 >> 2017-09-05 10:11:00.955494 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_HEADER pgs=7 cs=1 l=1).read_until got 53 in >> buffer left is 0 buffer still has 4042 >> 2017-09-05 10:11:00.955499 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_HEADER pgs=7 cs=1 l=1).process got MSG header >> 2017-09-05 10:11:00.955505 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_HEADER pgs=7 cs=1 l=1).process got envelope >> type=80 src osd.3 front=111623 data=0 off 0 >> 2017-09-05 10:11:00.955522 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_THROTTLE_MESSAGE pgs=7 cs=1 l=1).process prev >> state is STATE_OPEN_MESSAGE_HEADER >> 
2017-09-05 10:11:00.955528 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_THROTTLE_BYTES pgs=7 cs=1 l=1).process prev state >> is STATE_OPEN_MESSAGE_THROTTLE_MESSAGE >> 2017-09-05 10:11:00.955533 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_THROTTLE_DISPATCH_QUEUE pgs=7 cs=1 l=1).process >> prev state is STATE_OPEN_MESSAGE_THROTTLE_BYTES >> 2017-09-05 10:11:00.955540 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_READ_FRONT pgs=7 cs=1 l=1).process prev state is >> STATE_OPEN_MESSAGE_THROTTLE_DISPATCH_QUEUE >> 2017-09-05 10:11:00.955552 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_READ_FRONT pgs=7 cs=1 l=1).read_until len is >> 111623 state_offset is 0 >> 2017-09-05 10:11:00.955558 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_READ_FRONT pgs=7 cs=1 l=1).read_until got 4042 in >> buffer left is 107581 buffer still has 0 >> 2017-09-05 10:11:00.955679 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_READ_FRONT pgs=7 cs=1 l=1).read_until read_bulk >> left is 107581 got 107581 >> 2017-09-05 10:11:00.955696 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_READ_FRONT pgs=7 cs=1 l=1).process got front >> 111623 >> 2017-09-05 10:11:00.955702 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 >> l=1).read_until len is 21 state_offset is 0 >> 2017-09-05 10:11:00.955711 7f898c43d700 25 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 >> l=1).read_until read_bulk recv_end is 0 left is 21 got 21 >> 2017-09-05 10:11:00.955717 7f898c43d700 10 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 l=1).process >> aborted = 0 >> 2017-09-05 10:11:00.955722 7f898c43d700 20 -- 192.168.2.1:6805/7391 >> >> 192.168.2.1:6811/7386 conn(0x55c5053b1800 :6805 >> s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=7 cs=1 l=1).process >> got 111623 + 0 + 0 byte message >> 2017-09-05 10:11:00.965585 7f898c43d700 -1 >> /tmp/release/Ubuntu/WORKDIR/ceph-12.0.2-32-ge885403/src/auth/Crypto.h: >> In function 'int CryptoKey::encrypt(CephContext*, const bufferlist&, >> ceph::bufferlist&, std::__cxx11::string*) const' thread 7f898c43d700 >> time 2017-09-05 10:11:00.957267 >> /tmp/release/Ubuntu/WORKDIR/ceph-12.0.2-32-ge885403/src/auth/Crypto.h: >> 109: FAILED assert(ckh) > > There were several bugs in this area during the Luminous development > cycle that have been fixed for final release. Since it looks like > you're using a *very* old development release, you should update! :) > > (Why were you running that on a cluster anyway? I hope it's just for > development purposes!) 
> -Greg
>
>>
>> ceph version 12.0.2 (5a1b6b3269da99a18984c138c23935e5eb96f73e)
>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55c4f5cd84d2]
>> 2: (CephxSessionHandler::_calc_signature(Message*, unsigned long*)+0x5d2) [0x55c4f5ef7932]
>> 3: (CephxSessionHandler::check_message_signature(Message*)+0x7d) [0x55c4f5ef7cfd]
>> 4: (AsyncConnection::process()+0x1dbc) [0x55c4f5edf82c]
>> 5: (EventCenter::process_events(int)+0x9b1) [0x55c4f5d70f71]
>> 6: (()+0xdb5e41) [0x55c4f5d74e41]
>> 7: (()+0xb8c80) [0x7f898f0c2c80]
>> 8: (()+0x76ba) [0x7f898f7b16ba]
>> 9: (clone()+0x6d) [0x7f898e82882d]
>>
>> Thanks & Regards
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at http://vger.kernel.org/majordomo-info.html