Hi, my Ceph cluster is in an unhealthy state and busy with recovery. I'm watching the MGR log, and it is showing this error message regularly:

2019-11-20 09:51:45.211 7f7205581700 0 auth: could not find secret_id=4193
2019-11-20 09:51:45.211 7f7205581700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=4193
2019-11-20 09:51:46.403 7f7205581700 0 auth: could not find secret_id=4193
2019-11-20 09:51:46.403 7f7205581700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=4193
2019-11-20 09:51:46.543 7f71f3826700 0 log_channel(cluster) log [DBG] : pgmap v2508: 8432 pgs: 1 active+recovering+remapped, 1 active+remapped+backfilling, 4 active+recovering, 2 undersized+degraded+peered, 3 remapped+peering, 104 peering, 24 activating, 3 creating+peering, 8290 active+clean; 245 TiB data, 732 TiB used, 791 TiB / 1.5 PiB avail; 67 KiB/s wr, 1 op/s; 8272/191737068 objects degraded (0.004%); 4392/191737068 objects misplaced (0.002%)
2019-11-20 09:51:46.603 7f7205d82700 0 auth: could not find secret_id=4193
2019-11-20 09:51:46.603 7f7205d82700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=4193
2019-11-20 09:51:46.947 7f7205d82700 0 auth: could not find secret_id=4193
2019-11-20 09:51:46.947 7f7205d82700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=4193
2019-11-20 09:51:47.015 7f7205d82700 0 auth: could not find secret_id=4193
2019-11-20 09:51:47.015 7f7205d82700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=4193
2019-11-20 09:51:47.815 7f7205d82700 0 auth: could not find secret_id=4193
2019-11-20 09:51:47.815 7f7205d82700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=4193
2019-11-20 09:51:48.567 7f71f3826700 0 log_channel(cluster) log [DBG] : pgmap v2509: 8432 pgs: 1 active+recovering+remapped, 1 active+remapped+backfilling, 4 active+recovering, 2 undersized+degraded+peered, 3 remapped+peering, 104 peering, 24 activating, 3 creating+peering, 8290 active+clean; 245 TiB data, 732 TiB used, 791 TiB / 1.5 PiB avail; 65 KiB/s wr, 0 op/s; 8272/191737068 objects degraded (0.004%); 4392/191737068 objects misplaced (0.002%)
2019-11-20 09:51:49.447 7f7204d80700 0 auth: could not find secret_id=4193
2019-11-20 09:51:49.447 7f7204d80700 0 cephx: verify_authorizer could not get service secret for service mgr secret_id=4193

The relevant MON log entries for this timestamp are:

2019-11-20 09:51:41.559 7f4f28311700 0 mon.ld5505@0(leader) e9 handle_command mon_command({"prefix":"df","format":"json"} v 0) v1
2019-11-20 09:51:41.559 7f4f28311700 0 log_channel(audit) log [DBG] : from='client.? 10.97.206.97:0/1141066028' entity='client.admin' cmd=[{"prefix":"df","format":"json"}]: dispatch
2019-11-20 09:51:45.847 7f4f28311700 0 mon.ld5505@0(leader) e9 handle_command mon_command({"format":"json","prefix":"df"} v 0) v1
2019-11-20 09:51:45.847 7f4f28311700 0 log_channel(audit) log [DBG] : from='client.? 10.97.206.91:0/1573121305' entity='client.admin' cmd=[{"format":"json","prefix":"df"}]: dispatch
2019-11-20 09:51:46.307 7f4f2730f700 0 --1- [v2:10.97.206.93:3300/0,v1:10.97.206.93:6789/0] >> conn(0x56253e8f5180 0x56253ebc1800 :6789 s=ACCEPTING pgs=0 cs=0 l=0).handle_client_banner accept peer addr is really - (socket is v1:10.97.206.95:51494/0)
2019-11-20 09:51:46.839 7f4f28311700 0 mon.ld5505@0(leader) e9 handle_command mon_command({"format":"json","prefix":"df"} v 0) v1
2019-11-20 09:51:46.839 7f4f28311700 0 log_channel(audit) log [DBG] : from='client.? 10.97.206.99:0/413315398' entity='client.admin' cmd=[{"format":"json","prefix":"df"}]: dispatch
2019-11-20 09:51:49.579 7f4f28311700 0 mon.ld5505@0(leader) e9 handle_command mon_command({"prefix":"df","format":"json"} v 0) v1
2019-11-20 09:51:49.579 7f4f28311700 0 log_channel(audit) log [DBG] : from='client.? 10.97.206.96:0/2753573650' entity='client.admin' cmd=[{"prefix":"df","format":"json"}]: dispatch
2019-11-20 09:51:49.607 7f4f28311700 0 mon.ld5505@0(leader) e9 handle_command mon_command({"format":"json","prefix":"df"} v 0) v1
2019-11-20 09:51:49.607 7f4f28311700 0 log_channel(audit) log [DBG] : from='client.? 10.97.206.98:0/2643276575' entity='client.admin' cmd=[{"format":"json","prefix":"df"}]: dispatch
2019-11-20 09:51:50.703 7f4f2730f700 0 --1- [v2:10.97.206.93:3300/0,v1:10.97.206.93:6789/0] >> conn(0x562542ed2400 0x562541a8d000 :6789 s=ACCEPTING pgs=0 cs=0 l=0).handle_client_banner accept peer addr is really - (socket is v1:10.97.206.98:52420/0)
2019-11-20 09:51:50.951 7f4f28311700 0 mon.ld5505@0(leader) e9 handle_command mon_command({"format":"json","prefix":"df"} v 0) v1
2019-11-20 09:51:50.951 7f4f28311700 0 log_channel(audit) log [DBG] : from='client.127514502 10.97.206.92:0/3526816880' entity='client.admin' cmd=[{"format":"json","prefix":"df"}]: dispatch

This auth issue needs to be fixed soon: the error occurs every second, and if it keeps up it will interrupt Ceph's recovery and the cluster will remain unhealthy!

THX
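
P.S. To show how frequent this is, a quick way to count the "auth: could not find secret_id" lines per second would be something like the sketch below; the log path and the <id> placeholder are assumptions based on the default layout, so adjust them for your deployment:

    # count mgr auth errors per second (path and <id> are placeholders)
    grep 'auth: could not find secret_id' /var/log/ceph/ceph-mgr.<id>.log \
      | awk '{ print $1, substr($2, 1, 8) }' \
      | sort | uniq -c | tail

The awk step truncates the timestamp to whole seconds, so uniq -c prints how many errors were logged in each second of the excerpt.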