Hmm, that doesn't help. Can you set

 ceph config set osd debug_ms 20
 ceph config set osd debug_auth 20
 ceph config set osd debug_monc 20

for a few minutes and ceph-post-file the osd logs? (Or send a private
email with a link or something.)

Thanks!
sage


On Wed, 3 Apr 2019, Shawn Edwards wrote:

> No strange auth config:
>
> root@tyr-ceph-mon0:~# ceph config dump | grep -E '(auth|cephx)'
> global advanced auth_client_required cephx *
> global advanced auth_cluster_required cephx *
> global advanced auth_service_required cephx *
>
> All boxes are using 'minimal' ceph.conf files and centralized config.
>
> If you need the full config, it's here:
> https://gist.github.com/lesserevil/3b82d37e517f4561ce53c81629717aae
>
> On Wed, Apr 3, 2019 at 4:07 PM Sage Weil <sage@xxxxxxxxxxxx> wrote:
>
> > On Wed, 3 Apr 2019, Shawn Edwards wrote:
> > > Recent nautilus upgrade from mimic. No issues on mimic.
> > >
> > > Now getting this or similar in all osd logs, there is very little osd
> > > communication, and most of the PG are either 'down' or 'unknown', even
> > > though I can see the data on the filestores.
> > >
> > > 2019-04-03 13:47:55.280 7f13346e3700 0 --1- [v2:10.36.9.26:6802/3107,v1:10.36.9.26:6803/3107] >> v1:10.36.9.37:6821/8825 conn(0xa7132000 0xa6b28000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=0).handle_connect_reply_2 connect got BADAUTHORIZER
> > > 2019-04-03 13:47:55.296 7f1333ee2700 0 --1- [v2:10.36.9.26:6802/3107,v1:10.36.9.26:6803/3107] >> v1:10.36.9.37:6841/11204 conn(0xa9826d00 0xa9b78000 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=0).handle_connect_reply_2 connect got BADAUTHORIZER
> > > 2019-04-03 13:47:55.340 7f13346e3700 0 --1- [v2:10.36.9.26:6802/3107,v1:10.36.9.26:6803/3107] >> v1:10.36.9.37:6829/8425 conn(0xa7997180 0xaeb22800 :-1 s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=0).handle_connect_reply_2 connect got BADAUTHORIZER
> > > 2019-04-03 13:47:55.428 7f1334ee4700 0 auth: could not find secret_id=41687
> > > 2019-04-03 13:47:55.428 7f1334ee4700 0 cephx: verify_authorizer could not get service secret for service osd secret_id=41687
> > > 2019-04-03 13:47:55.428 7f1334ee4700 0 --1- [v2:10.36.9.26:6802/3107,v1:10.36.9.26:6803/3107] >> v1:10.36.9.48:6805/49547 conn(0xe02f24480 0xe088cb800 :6803 s=ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_message_2: got bad authorizer, auth_reply_len=0
> > >
> > > Thoughts? I have confirmed that all ceph boxes have good time sync.
> >
> > Do you have any non-default auth-related settings in ceph.conf?
> >
> > sage
> >
>
> --
> Shawn Edwards
> Beware programmers with screwdrivers. They tend to spill them on their
> keyboards.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com