Hi,

Have you checked all of the network namespaces?

lsns -t net
nsenter -t <pid> -n ss -nlp
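For example, a rough loop along these lines (an untested sketch; it assumes root, lsns from util-linux and ss from iproute2, and uses 8443 only because that is the port cephadm complains about) would check every network namespace for a listener on that port:

# For each network namespace, take the PID lsns reports for it, enter
# that namespace and list anything bound to TCP port 8443.
for pid in $(lsns -t net -n -o PID); do
    echo "=== net namespace of PID $pid ==="
    nsenter -t "$pid" -n ss -nlpt 'sport = :8443'
done

That should show whether some container's namespace is still holding the port even though nothing shows up in a host-level "ss -lntu".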
Cheers,
Josef

> On 2022-06-02 17:00, Patrick Vranckx <patrick.vranckx@xxxxxxxxxxxx> wrote:
>
> Hi,
>
> On my test cluster, I migrated from Nautilus to Octopus and then
> converted most of the daemons to cephadm. I had a lot of problems with
> podman 1.6.4 on CentOS 7 through an https proxy, because my servers are
> on a private network.
>
> Now I'm unable to deploy new managers and the cluster is in a bizarre
> situation:
>
> [root@cepht003 f5a025f9-fbe8-4506-8769-453902eb28d6]# ceph -s
>   cluster:
>     id:     f5a025f9-fbe8-4506-8769-453902eb28d6
>     health: HEALTH_WARN
>             client is using insecure global_id reclaim
>             mons are allowing insecure global_id reclaim
>             failed to probe daemons or devices
>             42 stray daemon(s) not managed by cephadm
>             2 stray host(s) with 39 daemon(s) not managed by cephadm
>             1 daemons have recently crashed
>
>   services:
>     mon: 5 daemons, quorum cepht003,cepht002,cepht001,cepht004,cephtstor01 (age 19m)
>     mgr: cepht004.wyibzh(active, since 29m), standbys: cepht003.aaaaaa
>     mds: fsdup:1 fsec:1 {fsdup:0=fsdup.cepht001.opiyzk=up:active,fsec:0=fsec.cepht003.giatub=up:active} 7 up:standby
>     osd: 40 osds: 40 up (since 92m), 40 in (since 3d)
>     rgw: 2 daemons active (cepht001, cepht004)
>
>   task status:
>
>   data:
>     pools:   18 pools, 577 pgs
>     objects: 6.32k objects, 24 GiB
>     usage:   80 GiB used, 102 TiB / 102 TiB avail
>     pgs:     577 active+clean
>
> [root@cepht003 f5a025f9-fbe8-4506-8769-453902eb28d6]# ceph orch ps
> NAME                         HOST         STATUS         REFRESHED  AGE  VERSION  IMAGE NAME                 IMAGE ID      CONTAINER ID
> mds.fdec.cepht004.vbuphb     cepht004     running (62m)  47s ago    4h   15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  5fad10ffc981
> mds.fdec.cephtstor01.gtxsnr  cephtstor01  running (24m)  46s ago    24m  15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  24e837f6ac8a
> mds.fdup.cepht001.nydfzs     cepht001     running (2h)   47s ago    2h   15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  b1880e343ece
> mds.fdup.cepht003.thsnbk     cepht003     running (34m)  45s ago    34m  15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  ddd4e395e7b3
> mds.fsdup.cepht001.opiyzk    cepht001     running (4h)   47s ago    4h   15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  ad081f718863
> mds.fsdup.cepht004.cfnxxw    cepht004     running (62m)  47s ago    20h  15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  c6feed82af8f
> mds.fsec.cepht002.uebrlc     cepht002     running (20m)  47s ago    20m  15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  836f448c5708
> mds.fsec.cepht003.giatub     cepht003     running (76m)  45s ago    5h   15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  f235957145cb
> mgr.cepht003.aaaaaa          cepht003     stopped        45s ago    20h  15.2.6   quay.io/ceph/ceph:v15.2.6  f16a759354cc  770d7cf078ad
> mgr.cepht004.wyibzh          cepht004     unknown        47s ago    20h  15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  6baa0f625271
> mon.cepht001                 cepht001     running (4h)   47s ago    4h   15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  e7f24769153c
> mon.cepht002                 cepht002     running (20m)  47s ago    20m  15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  dbb5be113201
> mon.cepht003                 cepht003     running (76m)  45s ago    5h   15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  6c2d6707b3fe
> mon.cepht004                 cepht004     running (62m)  47s ago    4h   15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  7986b598fd17
> mon.cephtstor01              cephtstor01  running (93m)  46s ago    2h   15.2.13  docker.io/ceph/ceph:v15    2cf504fded39  dbd9255aab10
> osd.10                       cephtstor01  running (93m)  46s ago    2h   15.2.16  quay.io/ceph/ceph:v15      8d5775c85c6a  01b07c4a75f7
>
> When I try to create a new mgr, I get:
>
> [ceph: root@cepht002 /]# ceph orch daemon add mgr cepht002
> Error EINVAL: cephadm exited with an error code: 1, stderr:Deploy daemon mgr.cepht002.kqhnbt ...
> Verifying port 8443 ...
> ERROR: TCP Port(s) '8443' required for mgr already in use
>
> But nothing runs on that port:
>
> [root@cepht002 f5a025f9-fbe8-4506-8769-453902eb28d6]# ss -lntu
> Netid  State   Recv-Q  Send-Q  Local Address:Port    Peer Address:Port
> udp    UNCONN  0       0       127.0.0.1:323         *:*
> tcp    LISTEN  0       128     192.168.64.152:6789   *:*
> tcp    LISTEN  0       128     192.168.64.152:6800   *:*
> tcp    LISTEN  0       128     192.168.64.152:6801   *:*
> tcp    LISTEN  0       128     *:22                  *:*
> tcp    LISTEN  0       100     127.0.0.1:25          *:*
> tcp    LISTEN  0       128     127.0.0.1:6010        *:*
> tcp    LISTEN  0       128     *:10050               *:*
> tcp    LISTEN  0       128     192.168.64.152:3300   *:*
>
> I get the same error with the command "ceph orch apply mgr ...", and the
> same on each node in the cluster.
>
> I can find no answer on Google...
>
> Any idea?
>
> Patrick
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx