Haomai,

ibstat
CA 'mlx4_0'
    CA type: MT4103
    Number of ports: 2
    Firmware version: 2.40.7000
    Hardware version: 0
    Node GUID: 0x248a070300e26070
    System image GUID: 0x248a070300e26070
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 56
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x04010000
        Port GUID: 0x268a07fffee26070
        Link layer: Ethernet
    Port 2:
        State: Active
        Physical state: LinkUp
        Rate: 56
        Base lid: 0
        LMC: 0
        SM lid: 0
        Capability mask: 0x04010000
        Port GUID: 0x268a07fffee26071
        Link layer: Ethernet

Port 2 is Ceph, port 1 is the Proxmox cluster interface ...

Gerhard W. Recher

net4sec UG (haftungsbeschränkt)
Leitenweg 6
86929 Penzing

+49 171 4802507

On 27.09.2017 at 14:50, Haomai Wang wrote:
> On Wed, Sep 27, 2017 at 8:33 PM, Gerhard W. Recher
> <gerhard.recher@xxxxxxxxxxx> wrote:
>> Hi Folks!
>>
>> I'm totally stuck
>>
>> rdma is running on my nics, rping udaddy etc will give positive results.
>>
>> cluster consist of:
>> proxmox-ve: 5.0-23 (running kernel: 4.10.17-3-pve)
>> pve-manager: 5.0-32 (running version: 5.0-32/2560e073)
>>
>> system(4 nodes): Supermicro 2028U-TN24R4T+
>>
>> 2 port Mellanox connect x3pro 56Gbit
>> 4 port intel 10GigE
>> memory: 768 GBytes
>> CPU DUAL Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
>>
>> ceph: 28 osds
>> 24 Intel Nvme 2000GB Intel SSD DC P3520, 2,5", PCIe 3.0 x4,
>> 4 Intel Nvme 1,6TB Intel SSD DC P3700, 2,5", U.2 PCIe 3.0
>>
>>
>> ceph is running on bluestore, engaging rdma within ceph (version
>> 12.2.0-pve1) will lead into this crash
>>
>>
>> ceph.conf:
>> [global]
>> ms_type=async+rdma
>> ms_cluster_type = async+rdma
>> ms_async_rdma_port_num=2
> I guess it should be 0. what's your result of "ibstat"
>
>> ms_async_rdma_device_name=mlx4_0
>> ...
>>
>>
>>
>> -- Reboot --
>> Sep 26 18:56:10 pve02 systemd[1]: Started Ceph cluster manager daemon.
>> Sep 26 18:56:10 pve02 systemd[1]: Reached target ceph target allowing to start/stop all ceph-mgr@.service instances at once.
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 2017-09-26 18:56:10.427474 7f0e2137e700 -1 Infiniband binding_port port not found
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: /home/builder/source/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc: In function 'void Device::binding_port(CephContext*, int)' thread 7f0e2137e700 time 2017-09-26 18:56:10.427498
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: /home/builder/source/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc: 144: FAILED assert(active_port)
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: ceph version 12.2.0 (36f6c5ea099d43087ff0276121fd34e71668ae0e) luminous (rc)
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55e9dde4bd12]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 2: (Device::binding_port(CephContext*, int)+0x573) [0x55e9de1b2c33]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 3: (Infiniband::init()+0x15f) [0x55e9de1b8f1f]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 4: (RDMAWorker::connect(entity_addr_t const&, SocketOptions const&, ConnectedSocket*)+0x4c) [0x55e9ddf2329c]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 5: (AsyncConnection::_process_connection()+0x446) [0x55e9de1a6d86]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 6: (AsyncConnection::process()+0x7f8) [0x55e9de1ac328]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 7: (EventCenter::process_events(int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x1125) [0x55e9ddf198a5]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 8: (()+0x4c9288) [0x55e9ddf1d288]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 9: (()+0xb9e6f) [0x7f0e259d4e6f]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 10: (()+0x7494) [0x7f0e260d1494]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 11: (clone()+0x3f) [0x7f0e25149aff]
>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>>
>> any advice ?
>>
>>
>> --
>> Gerhard W. Recher
>>
>> net4sec UG (haftungsbeschränkt)
>> Leitenweg 6
>> 86929 Penzing
>>
>> +49 171 4802507
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
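
For anyone reading along who wants to check port numbering on their own nodes, here is a minimal Python sketch that reads ibstat-style text and lists the port numbers reported as Active for a given device. The names parse_ibstat, active_ports and IBSTAT_SAMPLE are made up for this illustration, and the sample is a trimmed copy of the ibstat output at the top of this message. It only shows what ibstat reports; it does not settle which value ms_async_rdma_port_num expects.

```python
import re

# Trimmed copy of the ibstat output from this message (sample data only).
IBSTAT_SAMPLE = """\
CA 'mlx4_0'
    CA type: MT4103
    Number of ports: 2
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 56
        Link layer: Ethernet
    Port 2:
        State: Active
        Physical state: LinkUp
        Rate: 56
        Link layer: Ethernet
"""


def parse_ibstat(text):
    """Return {device: {port_number: {field: value}}} from ibstat-style text.

    Hypothetical helper: device-level fields (CA type, firmware, ...) are
    skipped, only the per-port fields are kept.
    """
    devices = {}
    device = None
    port = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        ca = re.match(r"CA '([^']+)'$", line)
        if ca:
            device = devices.setdefault(ca.group(1), {})
            port = None
            continue
        # "Port 1:" opens a port section; "Port GUID: ..." does not match.
        pm = re.match(r"Port (\d+):$", line)
        if pm and device is not None:
            port = device.setdefault(int(pm.group(1)), {})
            continue
        if port is not None and ":" in line:
            key, _, value = line.partition(":")
            port[key.strip()] = value.strip()
    return devices


def active_ports(devices, name):
    """Sorted port numbers whose State field is 'Active' for device `name`."""
    ports = devices.get(name, {})
    return sorted(n for n, fields in ports.items() if fields.get("State") == "Active")


if __name__ == "__main__":
    parsed = parse_ibstat(IBSTAT_SAMPLE)
    for number in active_ports(parsed, "mlx4_0"):
        print(number, parsed["mlx4_0"][number]["Link layer"])
```

On a live node the same functions could be fed the text captured from running ibstat, for example via subprocess.run(["ibstat"], capture_output=True, text=True).stdout, assuming the tool is installed and prints in the layout shown above.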