According to [1], the OP has already found out that the ports were blocked, so
this issue seems to be resolved.
[1]
https://stackoverflow.com/questions/75445733/ceph-cepadm-quincy-cant-add-osd-from-remote-nodes-command-hanging
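For anyone landing here with the same symptom: since the cause was blocked
ports, a quick way to verify connectivity is to probe the standard Ceph ports
from one of the remote hosts. This is only a sketch (not output from the
original thread); 172.16.24.67 is fa11's address from the host list below, and
nc can be replaced with any equivalent tool:

fa12 ~ # nc -zv -w 5 172.16.24.67 3300   # mon, msgr2
fa12 ~ # nc -zv -w 5 172.16.24.67 6789   # mon, msgr1
# OSD/MGR daemons additionally use the 6800-7300 range between all hosts

If those time out, the firewall / AWS security group rules need fixing first.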
Quoting Adam King <adking@xxxxxxxxxx>:
If it got as far as running that ceph-volume command on the remote host, I
wouldn't think it was anything with the ssh connection. Do ceph commands
generally hang on that host when you run them manually there as well?
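(For instance, something like the following on fa12 would show quickly whether
the client can reach the monitors at all; --connect-timeout makes it fail fast
instead of hanging indefinitely. Just a sketch, assuming the admin keyring is
present on fa12, which the _admin label should take care of:

fa12 ~ # cephadm shell -- ceph --connect-timeout 10 -s
)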
On Wed, Feb 15, 2023 at 11:19 AM Anton Chivkunov <anton@xxxxxxxxxxxxxxxxx>
wrote:
Hello!
I am stuck with a problem while trying to create a cluster of 3 nodes (AWS EC2
instances):
fa11 ~ # ceph orch host ls
HOST  ADDR           LABELS  STATUS
fa11  172.16.24.67   _admin
fa12  172.16.23.159  _admin
fa13  172.16.25.119  _admin
3 hosts in cluster
Each of them has 2 disks (all accepted by Ceph):
fa11 ~ # ceph orch device ls
HOST  PATH          TYPE  DEVICE ID                                        SIZE   AVAILABLE  REFRESHED  REJECT REASONS
fa11  /dev/nvme1n1  ssd   Amazon_Elastic_Block_Store_vol016651cf7f3b9c9dd  8589M  Yes        7m ago
fa11  /dev/nvme2n1  ssd   Amazon_Elastic_Block_Store_vol034082d7d364dfbdb  5368M  Yes        7m ago
fa12  /dev/nvme1n1  ssd   Amazon_Elastic_Block_Store_vol0ec193fa3f77fee66  8589M  Yes        3m ago
fa12  /dev/nvme2n1  ssd   Amazon_Elastic_Block_Store_vol018736f7eeab725f5  5368M  Yes        3m ago
fa13  /dev/nvme1n1  ssd   Amazon_Elastic_Block_Store_vol0443a031550be1024  8589M  Yes        84s ago
fa13  /dev/nvme2n1  ssd   Amazon_Elastic_Block_Store_vol0870412d37717dc2c  5368M  Yes        84s ago
fa11 is the first host, from which I manage the cluster.
Adding an OSD on fa11 itself works fine:
fa11 ~ # ceph orch daemon add osd fa11:/dev/nvme1n1
Created osd(s) 0 on host 'fa11'
But it doesn't work for the other 2 hosts (it hangs forever):
fa11 ~ # ceph orch daemon add osd fa12:/dev/nvme1n1
^CInterrupted
Logs on fa12 show that it hangs at the following step:
fa12 ~ # tail /var/log/ceph/a9ef6c26-ac38-11ed-9429-06e6bc29c1db/ceph-volume.log
...
[2023-02-14 07:38:20,942][ceph_volume.process][INFO ] Running command: /usr/bin/ceph-authtool --gen-print-key
[2023-02-14 07:38:20,964][ceph_volume.process][INFO ] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new a51506c2-e910-4763-9a0c-f6c2194944e2
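(For context: that second command, the "osd new" call, is as far as I can tell
the first step that has to reach the monitors over the network, so a hang here
rather than an error message would point at connectivity. A check of whether
the mon on fa11 is at least listening on the standard ports would look
something like this; just a sketch:

fa11 ~ # ss -tlnp | grep -E ':3300|:6789'
)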
I'm not sure what the reason for this hang might be.
Additional details:
1) cephadm was installed using curl
(https://docs.ceph.com/en/quincy/cephadm/install/#curl-based-installation)
2) I use the user "ceph" instead of "root", and port 2222 instead of 22. The
first node was bootstrapped using the command below:
cephadm bootstrap --mon-ip 172.16.24.67 --allow-fqdn-hostname --ssh-user ceph \
  --ssh-config /home/anton/ceph/ssh_config --cluster-network 172.16.16.0/20 \
  --skip-monitoring-stack
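(In case it matters, the networks that cephadm actually recorded can be
double-checked like this; just a sketch:

fa11 ~ # ceph config get mon public_network
fa11 ~ # ceph config get mon cluster_network
)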
Content of /home/anton/ceph/ssh_config:
fa11 ~ # cat /home/anton/ceph/ssh_config
Host *
  User ceph
  Port 2222
  IdentityFile /home/ceph/.ssh/id_rsa
  StrictHostKeyChecking no
  UserKnownHostsFile=/dev/null
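In case it helps, the SSH connection can be tested manually the same way the
orchestrator uses it, e.g. (just a sketch):

fa11 ~ # ssh -F /home/anton/ceph/ssh_config fa12 true
fa11 ~ # ceph cephadm check-host fa12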
3) Hosts fa12 and fa13 were added using the commands:
ceph orch host add fa12.testing.swiftserve.com 172.16.23.159 --labels _admin
ceph orch host add fa13.testing.swiftserve.com 172.16.25.119 --labels _admin
Thanks in advance!
BR/Anton
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx