Re: ceph rdma network connect refused


 



What does the show_gids command show?
And have you tried a newer Ceph version?
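For reference, show_gids is one of the Mellanox OFED scripts; on an affected node the check could look roughly like this (the exact column layout below is from memory, so treat it as an assumption rather than captured output):

show_gids mlx5_bond_0
# DEV          PORT  INDEX  GID                                      IPv4         VER  NETDEV
# mlx5_bond_0  1     3      0000:0000:0000:0000:0000:ffff:0a5e:3046  10.94.48.70  v2   bond0

ms_async_rdma_local_gid must be a GID that actually exists on that host, and since an IPv4-mapped RoCE GID like the one in your conf encodes the node's own IP, the value cannot be identical in ceph.conf on every node.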




------------------ Original ------------------
From: "xl_3992@xxxxxx" <xl_3992@xxxxxx>
Date: Mon, Nov 29, 2021 06:32 PM
To: "GHui" <ugiwgh@xxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxx>
Subject: Re: Re: ceph rdma network connect refused



Ceph version:


[store@xxxxxxxxxxxxxxxxxxxx ~]$ sudo ceph versions
{
&nbsp; &nbsp; "mon": {
&nbsp; &nbsp; &nbsp; &nbsp; "ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)": 3
&nbsp; &nbsp; },
&nbsp; &nbsp; "mgr": {
&nbsp; &nbsp; &nbsp; &nbsp; "ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)": 3
&nbsp; &nbsp; },
&nbsp; &nbsp; "osd": {
&nbsp; &nbsp; &nbsp; &nbsp; "ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)": 1890
&nbsp; &nbsp; },
&nbsp; &nbsp; "mds": {},
&nbsp; &nbsp; "overall": {
&nbsp; &nbsp; &nbsp; &nbsp; "ceph version 14.2.22 (ca74598065096e6fcbd8433c8779a2be0c889351) nautilus (stable)": 1896
&nbsp; &nbsp; }
}
[store@xxxxxxxxxxxxxxxxxxxx ~]$



Could it be that the reserved memory is not enough?
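If that is the suspicion, the limits the running OSD actually received can be checked with something like this (a sketch, assuming the OSDs run under the systemd unit shown further down):

cat /proc/$(pidof -s ceph-osd)/limits | grep -i 'locked memory'
systemctl show ceph-osd@0 -p LimitMEMLOCK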

 

Thanks for the reply ~



 xl_3992@xxxxxx


From: GHui
Date: 2021-11-29 18:19
To: xl_3992@xxxxxx
CC: ceph-users
Subject: Re: ceph rdma network connect refused


Which Ceph version are you using? And where did you download the container images from?
------------------ Original ------------------
From: "xl_3992@xxxxxx" <xl_3992@xxxxxx>
Date: Mon, Nov 29, 2021 11:27 AM
To: "ceph-users" <ceph-users@xxxxxxx>
Subject: ceph rdma network connect refused
I am testing an RDMA network with Ceph. When the cluster exceeds 16 nodes, most of the OSDs go down; with fewer than 16 nodes the cluster health is OK. Can anyone help me?

Error log output:
2021-11-29 10:53:06.884 7f0839fec700 -1 --2- 10.94.48.70:0/559149 >> [v2:10.94.48.66:7045/3543288,v1:10.94.48.66:7047/3543288] conn(0x5585a4b3ec00 0x5585bd816700 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=1 rev1=0 rx=0 tx=0)._handle_peer_banner peer [v2:10.94.48.66:7045/3543288,v1:10.94.48.66:7047/3543288] is using msgr V1 protocol
 2021-11-29 10:53:07.264 7f083a7ed700 -1 Infiniband send_msg send returned error 111: (111) Connection refused
 2021-11-29 10:53:07.264 7f083a7ed700 -1 Infiniband send_msg send returned error 111: (111) Connection refused
 2021-11-29 10:53:07.264 7f083a7ed700 -1 Infiniband send_msg send returned error 111: (111) Connection refused
 2021-11-29 10:53:07.264 7f083a7ed700 -1 Infiniband send_msg send returned error 111: (111) Connection refused
 2021-11-29 10:53:07.264 7f083a7ed700 -1 Infiniband send_msg send returned error 111: (111) Connection refused
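The first line above also shows the peer at 10.94.48.66 answering with the msgr V1 banner instead of v2, so one thing worth confirming is that every OSD really picked up the async+rdma settings, e.g. via the admin socket on each host (a suggestion, not from the original log; osd.0 is just a placeholder id):

sudo ceph daemon osd.0 config get ms_type
sudo ceph daemon osd.0 config get ms_cluster_type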
Following the "Bring Up Ceph RDMA - Developer's Guide", my cluster conf is:
 #----------------------- RDMA ---------------------
 ms_type = async+rdma
 ms_cluster_type = async+rdma
 ms_public_type = async+rdma
 ms_async_rdma_device_name = mlx5_bond_0
 ms_async_rdma_polling_us = 0
 ms_async_rdma_local_gid = 0000:0000:0000:0000:0000:ffff:0a5e:3046
 [osd]
 osd_memory_target = 4294967296
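For the scale-dependent part (OK below 16 nodes, failing above), it may also be relevant that the RDMA messenger pre-allocates fixed send/receive buffer pools per daemon, controlled by options such as ms_async_rdma_send_buffers and ms_async_rdma_receive_buffers; the values below are placeholders only showing where they would go, not tested recommendations:

ms_async_rdma_send_buffers = 2048
ms_async_rdma_receive_buffers = 65536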
Node environment:
 [store@xxxxxxxxxxxxxxxxxxxx ~]$ ulimit
 unlimited
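(Side note: a bare ulimit in bash only reports the file-size limit; the locked-memory limit, which is what RDMA memory registration depends on, is a separate value:)

ulimit -l    # max locked memory for this shell
ulimit -a    # all per-shell limits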
 [store@xxxxxxxxxxxxxxxxxxxx ~]$ ibdev2netdev
mlx5_0 port 1 ==> enp94s0f0 (Down)
mlx5_1 port 1 ==> enp94s0f1 (Down)
mlx5_bond_0 port 1 ==> bond0 (Up)

[store@xxxxxxxxxxxxxxxxxxxx ~]$ sudo cat /usr/lib/systemd/system/ceph-osd@.service
 [Unit]
 Description=Ceph object storage daemon osd.%i
 PartOf=ceph-osd.target
 After=network-online.target local-fs.target time-sync.target
 Before=remote-fs-pre.target ceph-osd.target
Wants=network-online.target local-fs.target time-sync.target remote-fs-pre.target ceph-osd.target

[Service]
 LimitNOFILE=1048576
 LimitNPROC=1048576
 LimitMEMLOCK=infinity
 EnvironmentFile=-/etc/sysconfig/ceph
 Environment=CLUSTER=ceph
 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph
 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
 ExecReload=/bin/kill -HUP $MAINPID
 LockPersonality=true
MemoryDenyWriteExecute=true

[Install]
WantedBy=ceph-osd.target

[store@xxxxxxxxxxxxxxxxxxxx ~]$ cat /etc/security/limits.conf
 root soft nofile 10000000
 root hard nofile 10000000
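These limits.conf entries only raise nofile for root; if locked memory turns out to be the bottleneck, memlock entries would be needed there as well, roughly like the lines below (an assumption, not quoted from the guide). For daemons started by systemd, though, it is the LimitMEMLOCK=infinity line in the unit file that actually applies, since pam_limits is not consulted for systemd services.

* soft memlock unlimited
* hard memlock unlimited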
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



