On Tue, 31 Aug 2021 at 03:24, Arnaud MARTEL <arnaud.martel@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Matthew,
>
> I don't know if it will be helpful, but I had the same problem using Debian 10, and the solution was to install Docker from docker.io and not from the Debian package (too old).
>

Ah, that makes sense. Thanks!

> Arnaud
>
> ----- Original Message -----
> From: "Matthew Pounsett" <matt@xxxxxxxxxxxxx>
> To: "ceph-users" <ceph-users@xxxxxxx>
> Sent: Monday, 30 August 2021 17:34:32
> Subject: cephadm Pacific bootstrap hangs waiting for mon
>
> I'm just getting started with Pacific, and I've run into this problem
> trying to get bootstrapped. cephadm is waiting for the mon to start,
> and waiting, and waiting ... checking docker ps it looks like it's
> running, but I guess it's never finishing its startup tasks? I
> waited about 30 minutes the first time. Killed cephadm and restarted,
> and I seem to have the same problem; I let it run overnight and got
> some additional output that doesn't actually help me much. Details
> pasted below.
>
> What additional things should I be doing to try to troubleshoot this?
>
> In case it's useful reference info, the mon IP I've given is on our
> "admin" VLAN which is reachable from all hosts on our network. The
> cluster network subnet I supplied is the 10G VLAN reachable only by
> the servers in the ceph cluster I'm building. The IP supplied is
> reachable on the local host.
>
> % sudo cephadm bootstrap --allow-fqdn-hostname --mon-ip 192.168.1.192
> --cluster-network 192.168.0.0/24
> Verifying podman|docker is present...
> Verifying lvm2 is present...
> Verifying time synchronization is in place...
> Unit systemd-timesyncd.service is enabled and running
> Repeating the final host check...
> podman|docker (/usr/bin/docker) is present
> systemctl is present
> lvcreate is present
> Unit systemd-timesyncd.service is enabled and running
> Host looks OK
> Cluster fsid: fb45c7b2-0911-11ec-9731-bc97e15d6534
> Verifying IP 192.168.1.192 port 3300 ...
> Verifying IP 192.168.1.192 port 6789 ...
> Mon IP `192.168.1.192` is in CIDR network `192.168.1.0/24`
> Pulling container image docker.io/ceph/ceph:v16...
> Ceph version: ceph version 16.2.5
> (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
> Extracting ceph user uid/gid from container image...
> Creating initial keys...
> Creating initial monmap...
> Creating mon...
> Waiting for mon to start...
> Waiting for mon...
> Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host
> --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
> CONTAINER_IMAGE=docker.io/ceph/ceph:v16 -e
> NODE_NAME=cmgmt01.example.net -e CEPH_USE_RANDOM_NONCE=1 -v
> /var/lib/ceph/fb45c7b2-0911-11ec-9731-bc97e15d6534/mon.cmgmt01.example.net:/var/lib/ceph/mon/ceph-cmgmt01.example.net:z
> -v /tmp/ceph-tmp8q3oxeg3:/etc/ceph/ceph.client.admin.keyring:z -v
> /tmp/ceph-tmp4_69yc31:/etc/ceph/ceph.conf:z docker.io/ceph/ceph:v16
> status
> /usr/bin/ceph: stderr 2021-08-29T21:47:23.263+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T21:52:23.262+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T21:57:23.266+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T22:02:23.265+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T22:07:23.268+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T22:12:23.268+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T22:17:23.271+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T22:22:23.266+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T22:27:23.270+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr 2021-08-29T22:32:23.273+0000 7f2aeaa37700 0
> monclient(hunting): authenticate timed out after 300
> /usr/bin/ceph: stderr [errno 110] RADOS timed out (error connecting to
> the cluster)
> mon not available, waiting (1/15)...
> [ repeats ... ]
>
> The log contains identical info. The only extra I see is a note at
> the end about releasing locks, which I'm sure is expected and of no
> additional help.
>
> 2021-08-30 11:03:02,801 DEBUG Releasing lock 140656683483824 on
> /run/cephadm/fb45c7b2-0911-11ec-9731-bc97e15d6534.lock
> 2021-08-30 11:03:02,801 DEBUG Lock 140656683483824 released on
> /run/cephadm/fb45c7b2-0911-11ec-9731-bc97e15d6534.lock
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
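
For anyone hitting the same hang, here is a rough shell sketch of how the two observations in this thread could be checked: whether the installed Docker is older than some minimum, and how many mon authentication timeouts a saved bootstrap run produced. The function names, the minimum version you pass in, and the log file name are illustrative placeholders, not anything cephadm documents, and `sort -V` assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helpers; names and any threshold you pass are assumptions,
# not documented cephadm requirements.

# version_ge A B: succeed when dotted version A is >= B.
# Relies on the version sort of GNU sort (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# count_auth_timeouts FILE: print how many "authenticate timed out" lines
# appear in saved cephadm bootstrap output.
count_auth_timeouts() {
    grep -c 'authenticate timed out' "$1"
}

# docker_is_recent MIN: compare the Docker server version against MIN.
# Returns 0 if it is at least MIN, 1 if it is older, and 2 when docker is
# missing or the daemon does not answer.
docker_is_recent() {
    command -v docker >/dev/null 2>&1 || return 2
    current=$(docker version --format '{{.Server.Version}}' 2>/dev/null) || return 2
    [ -n "$current" ] || return 2
    version_ge "$current" "$1"
}
```

One possible use: save the bootstrap output with `tee bootstrap.log` (a file name chosen here for illustration), run `count_auth_timeouts bootstrap.log`, then call `docker_is_recent` with a minimum version of your choosing and inspect its exit status. On Debian, `dpkg -S "$(command -v docker)"` should also show which package owns the binary.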