Re: RDMA with Mellanox ConnectX-3 Pro on Debian Stretch and Proxmox v5.0, kernel 4.10.17-3

Previously we had an InfiniBand cluster; recently we deployed a RoCE
cluster. Both are for test purposes for our users.

On Wed, Sep 27, 2017 at 11:38 PM, Gerhard W. Recher
<gerhard.recher@xxxxxxxxxxx> wrote:
> Haomai,
>
> I looked at your presentation, so I guess you already have a running
> cluster with RDMA & Mellanox
> (https://www.youtube.com/watch?v=Qb2SUWLdDCw)
>
> Is nobody out there running a cluster with RDMA?
> Any help is appreciated!
>
> Gerhard W. Recher
>
> net4sec UG (haftungsbeschränkt)
> Leitenweg 6
> 86929 Penzing
>
> +49 171 4802507
> Am 27.09.2017 um 16:09 schrieb Haomai Wang:
>> https://community.mellanox.com/docs/DOC-2415
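
For what it's worth, my reading of that doc is that the GID is handed to the
messenger through ceph.conf; a minimal sketch, assuming the luminous option
names ms_async_rdma_local_gid and ms_async_rdma_roce_ver, with a GID value
taken from the showgids output further down purely as an illustration:

  # per-node value -- must match a GID that showgids reports on that host
  ms_async_rdma_local_gid = 0000:0000:0000:0000:0000:ffff:c0a8:648d
  # the RoCE v2 entries are the ones carrying the IPv4-mapped GIDs
  ms_async_rdma_roce_ver = 2

Because the value is per host, it cannot simply go into a shared [global]
section, which is exactly the problem discussed further down in the thread.
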
>>
>> On Wed, Sep 27, 2017 at 10:01 PM, Gerhard W. Recher
>> <gerhard.recher@xxxxxxxxxxx> wrote:
>>> How do I set the local gid option?
>>>
>>> I have no clue :)
>>>
>>> Gerhard W. Recher
>>>
>>> net4sec UG (haftungsbeschränkt)
>>> Leitenweg 6
>>> 86929 Penzing
>>>
>>> +49 171 4802507
>>> Am 27.09.2017 um 15:59 schrieb Haomai Wang:
>>>> Did you set the local gid option?
>>>>
>>>> On Wed, Sep 27, 2017 at 9:52 PM, Gerhard W. Recher
>>>> <gerhard.recher@xxxxxxxxxxx> wrote:
>>>>> Yep, RoCE ....
>>>>>
>>>>> I followed all the recommendations in the Mellanox papers ...
>>>>>
>>>>> */etc/security/limits.conf*
>>>>>
>>>>> * soft memlock unlimited
>>>>> * hard memlock unlimited
>>>>> root soft memlock unlimited
>>>>> root hard memlock unlimited
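
A quick sanity check that the limit actually reaches a running daemon, since
limits.conf only covers PAM sessions (a sketch; any ceph daemon PID will do):

  grep "Max locked memory" /proc/$(pidof -s ceph-osd)/limits

If this still shows a small default for a systemd-started daemon, the limit
also has to be raised in the unit file (see the drop-in sketch below).
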
>>>>>
>>>>>
>>>>> I also set the properties on the daemons (chapter 11) in
>>>>> https://community.mellanox.com/docs/DOC-2721
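
If chapter 11 there is the usual systemd adjustment, a sketch of what it
amounts to (the drop-in file name below is my own choice):

  # /etc/systemd/system/ceph-osd@.service.d/rdma.conf  (same idea for ceph-mon@ / ceph-mgr@)
  [Service]
  LimitMEMLOCK=infinity
  PrivateDevices=no

followed by "systemctl daemon-reload" and a restart of the daemons. The
memlock limit has to be set here because limits.conf is not applied to
systemd services, and PrivateDevices=no keeps /dev/infiniband visible to
the daemon.
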
>>>>>
>>>>>
>>>>> Setting the gid parameter in ceph.conf is not an option on Proxmox,
>>>>> because ceph.conf is the same file for all storage nodes:
>>>>> root@pve01:/etc/ceph# ls -latr
>>>>> total 16
>>>>> lrwxrwxrwx   1 root root   18 Jun 21 19:35 ceph.conf -> /etc/pve/ceph.conf
>>>>>
>>>>> and each node has unique GIDs ...
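
One possible workaround, assuming per-daemon sections are honoured for this
option (a sketch only; the section names below are hypothetical and would
have to match the real mon/osd IDs, with the GID values taken from the
showgids output below):

  [mon.pve01]
  ms_async_rdma_local_gid = 0000:0000:0000:0000:0000:ffff:c0a8:648d
  [mon.pve02]
  ms_async_rdma_local_gid = 0000:0000:0000:0000:0000:ffff:c0a8:648e

Per-daemon sections live in the same shared /etc/pve/ceph.conf but only apply
to the daemon they name, so each node could carry its own GID despite the
single file; the OSDs on each host would need the same treatment.
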
>>>>>
>>>>>
>>>>> ./showgids
>>>>> DEV     PORT  INDEX  GID                                      IPv4             VER  DEV
>>>>> ---     ----  -----  ---                                      ------------     ---  ---
>>>>> mlx4_0  1     0      fe80:0000:0000:0000:268a:07ff:fee2:6070                   v1   ens1
>>>>> mlx4_0  1     1      fe80:0000:0000:0000:268a:07ff:fee2:6070                   v2   ens1
>>>>> mlx4_0  1     2      0000:0000:0000:0000:0000:ffff:c0a8:dd8d  192.168.221.141  v1   vmbr0
>>>>> mlx4_0  1     3      0000:0000:0000:0000:0000:ffff:c0a8:dd8d  192.168.221.141  v2   vmbr0
>>>>> mlx4_0  2     0      fe80:0000:0000:0000:268a:07ff:fee2:6071                   v1   ens1d1
>>>>> mlx4_0  2     1      fe80:0000:0000:0000:268a:07ff:fee2:6071                   v2   ens1d1
>>>>> mlx4_0  2     2      0000:0000:0000:0000:0000:ffff:c0a8:648d  192.168.100.141  v1   ens1d1
>>>>> mlx4_0  2     3      0000:0000:0000:0000:0000:ffff:c0a8:648d  192.168.100.141  v2   ens1d1
>>>>> n_gids_found=8
>>>>>
>>>>> Next node:
>>>>> ./showgids
>>>>> DEV     PORT  INDEX  GID                                      IPv4             VER  DEV
>>>>> ---     ----  -----  ---                                      ------------     ---  ---
>>>>> mlx4_0  1     0      fe80:0000:0000:0000:268a:07ff:fef9:8730                   v1   ens1
>>>>> mlx4_0  1     1      fe80:0000:0000:0000:268a:07ff:fef9:8730                   v2   ens1
>>>>> mlx4_0  1     2      0000:0000:0000:0000:0000:ffff:c0a8:dd8e  192.168.221.142  v1   vmbr0
>>>>> mlx4_0  1     3      0000:0000:0000:0000:0000:ffff:c0a8:dd8e  192.168.221.142  v2   vmbr0
>>>>> mlx4_0  2     0      fe80:0000:0000:0000:268a:07ff:fef9:8731                   v1   ens1d1
>>>>> mlx4_0  2     1      fe80:0000:0000:0000:268a:07ff:fef9:8731                   v2   ens1d1
>>>>> mlx4_0  2     2      0000:0000:0000:0000:0000:ffff:c0a8:648e  192.168.100.142  v1   ens1d1
>>>>> mlx4_0  2     3      0000:0000:0000:0000:0000:ffff:c0a8:648e  192.168.100.142  v2   ens1d1
>>>>> n_gids_found=8
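
To double-check a single entry from those tables directly on a node, the
standard sysfs path should work (index 3 on port 2 here is just the RoCE v2
entry for the 192.168.100.x cluster network shown above):

  cat /sys/class/infiniband/mlx4_0/ports/2/gids/3
  cat /sys/class/infiniband/mlx4_0/ports/2/gid_attrs/types/3   # should report RoCE v2
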
>>>>>
>>>>>
>>>>>
>>>>> ifconfig ens1d1
>>>>> ens1d1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
>>>>>         inet 192.168.100.141  netmask 255.255.255.0  broadcast 192.168.100.255
>>>>>         inet6 fe80::268a:7ff:fee2:6071  prefixlen 64  scopeid 0x20<link>
>>>>>         ether 24:8a:07:e2:60:71  txqueuelen 1000  (Ethernet)
>>>>>         RX packets 25450717  bytes 39981352146 (37.2 GiB)
>>>>>         RX errors 0  dropped 77  overruns 77  frame 0
>>>>>         TX packets 26554236  bytes 53419159091 (49.7 GiB)
>>>>>         TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
>>>>>
>>>>>
>>>>>
>>>>> Gerhard W. Recher
>>>>>
>>>>> net4sec UG (haftungsbeschränkt)
>>>>> Leitenweg 6
>>>>> 86929 Penzing
>>>>>
>>>>> +49 171 4802507
>>>>> Am 27.09.2017 um 14:50 schrieb Haomai Wang:
>>>>>> On Wed, Sep 27, 2017 at 8:33 PM, Gerhard W. Recher
>>>>>> <gerhard.recher@xxxxxxxxxxx> wrote:
>>>>>>> Hi Folks!
>>>>>>>
>>>>>>> I'm totally stuck
>>>>>>>
>>>>>>> RDMA is running on my NICs; rping, udaddy, etc. give positive results.
>>>>>>>
>>>>>>> The cluster consists of:
>>>>>>> proxmox-ve: 5.0-23 (running kernel: 4.10.17-3-pve)
>>>>>>> pve-manager: 5.0-32 (running version: 5.0-32/2560e073)
>>>>>>>
>>>>>>> System (4 nodes): Supermicro 2028U-TN24R4T+
>>>>>>>
>>>>>>> 2-port Mellanox ConnectX-3 Pro 56 Gbit
>>>>>>> 4-port Intel 10 GigE
>>>>>>> Memory: 768 GBytes
>>>>>>> CPU: dual Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
>>>>>>>
>>>>>>> Ceph: 28 OSDs
>>>>>>> 24 x Intel NVMe 2000 GB Intel SSD DC P3520, 2.5", PCIe 3.0 x4
>>>>>>>  4 x Intel NVMe 1.6 TB Intel SSD DC P3700, 2.5", U.2 PCIe 3.0
>>>>>>>
>>>>>>>
>>>>>>> Ceph is running on BlueStore; enabling RDMA within Ceph (version
>>>>>>> 12.2.0-pve1) leads to this crash:
>>>>>>>
>>>>>>>
>>>>>>> ceph.conf:
>>>>>>> [global]
>>>>>>> ms_type=async+rdma
>>>>>>> ms_cluster_type = async+rdma
>>>>>>> ms_async_rdma_port_num=2
>>>>>> I guess it should be 0. What's your result of "ibstat"?
>>>>>>
>>>>>>> ms_async_rdma_device_name=mlx4_0
>>>>>>> ...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- Reboot --
>>>>>>> Sep 26 18:56:10 pve02 systemd[1]: Started Ceph cluster manager daemon.
>>>>>>> Sep 26 18:56:10 pve02 systemd[1]: Reached target ceph target allowing to start/stop all ceph-mgr@.service instances at once.
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: 2017-09-26 18:56:10.427474 7f0e2137e700 -1 Infiniband binding_port  port not found
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: /home/builder/source/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc: In function 'void Device::binding_port(CephContext*, int)' thread 7f0e2137e700 time 2017-09-26 18:56:10.427498
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]: /home/builder/source/ceph-12.2.0/src/msg/async/rdma/Infiniband.cc: 144: FAILED assert(active_port)
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  ceph version 12.2.0 (36f6c5ea099d43087ff0276121fd34e71668ae0e) luminous (rc)
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55e9dde4bd12]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  2: (Device::binding_port(CephContext*, int)+0x573) [0x55e9de1b2c33]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  3: (Infiniband::init()+0x15f) [0x55e9de1b8f1f]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  4: (RDMAWorker::connect(entity_addr_t const&, SocketOptions const&, ConnectedSocket*)+0x4c) [0x55e9ddf2329c]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  5: (AsyncConnection::_process_connection()+0x446) [0x55e9de1a6d86]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  6: (AsyncConnection::process()+0x7f8) [0x55e9de1ac328]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  7: (EventCenter::process_events(int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x1125) [0x55e9ddf198a5]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  8: (()+0x4c9288) [0x55e9ddf1d288]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  9: (()+0xb9e6f) [0x7f0e259d4e6f]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  10: (()+0x7494) [0x7f0e260d1494]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  11: (clone()+0x3f) [0x7f0e25149aff]
>>>>>>> Sep 26 18:56:10 pve02 ceph-mgr[2233]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
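
As far as I can tell from the source, that assert fires in
Device::binding_port() when the messenger walks the ports of mlx4_0 and does
not find an active port matching ms_async_rdma_port_num, so with
ms_async_rdma_port_num=2 it is worth confirming on every node that port 2 is
really up and running Ethernet, e.g. (a sketch; ibstat takes the CA name and
an optional port number):

  ibstat mlx4_0 2
  # a usable RoCE port should show "State: Active" and "Link layer: Ethernet"

If port 2 is down, or the daemons only see port 1, the same assert would be
expected.
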
>>>>>>>
>>>>>>>
>>>>>>> Any advice?
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Gerhard W. Recher
>>>>>>>
>>>>>>> net4sec UG (haftungsbeschränkt)
>>>>>>> Leitenweg 6
>>>>>>> 86929 Penzing
>>>>>>>
>>>>>>> +49 171 4802507
>>>>>>>
>>>>>>>
>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



