Re: RDMA/Infiniband status

Hello,

On Thu, 9 Jun 2016 20:28:41 +0200 Daniel Swarbrick wrote:

> On 09/06/16 17:01, Gandalf Corvotempesta wrote:
> > On 09 Jun 2016 15:41, "Adam Tygart" <mozes@xxxxxxx
> > <mailto:mozes@xxxxxxx>> wrote:
> >>
> >> If you're
> >> using pure DDR, you may need to tune the broadcast group in your
> >> subnet manager to set the speed to DDR.
> > 
> > Do you know how to set this with opensm?
> > I would like to bring my test cluster back up in the next few days.
> > 
> 
> IB partitions and their IPoIB flags are defined in
> /etc/opensm/partitions.conf.
> 
> See
> http://git.openfabrics.org/?p=~halr/opensm.git;a=blob_plain;f=doc/partition-config.txt;hb=HEAD
> 
> Note that the rate only applies to multicast or broadcast traffic. You
> should consider increasing the rate if, for example, you know you only
> have QDR or FDR hosts on the fabric.
> 
> Unicast traffic flows will run at the maximum speed supported by both
> peers.
>
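For reference, the multicast rate and MTU are set per partition in opensm's partition file. A sketch of what such an entry looks like; the specific values (QDR rate, 2K MTU) are illustrative, not a recommendation:

```
# /etc/opensm/partitions.conf -- illustrative example only
# rate=7 selects 40 Gbps (4x QDR); rate=6 would select 20 Gbps (4x DDR).
# mtu=4 selects a 2048-byte IB MTU (2044 usable for IPoIB after its
# 4-byte header, matching the default mentioned below).
Default=0x7fff, ipoib, rate=7, mtu=4 : ALL=full;
```

opensm has to re-read the file (restart or HUP) for this to take effect, and as this thread discusses, already-existing multicast groups keep their old rate.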

This thread brings back memories of this one:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-April/008792.html

According to Robert, IPoIB still uses IB multicast under the hood, even
when from an IP perspective the traffic would be unicast.

The biggest issue pointed out in that mail, and in the immensely long and
complex thread it references, is that you can't change the speed settings
on the fly. That means if you're already in production, it's unlikely
there will ever be a time when you can entirely tear down your IB
network...
 
> If using IPoIB CM (Connected Mode), you will most likely have an MTU of
> 65520 on your interfaces. Since multicast cannot use CM, if you try to
> send multicast packets larger than the partition's mcast mtu (2044 by
> default), they will be dropped.
> 
I'm using CM (the bandwidth is atrocious otherwise) and that's not an
issue with Ceph.
The only time I've actually noticed it is with Pacemaker and really large
heartbeat broadcasts.
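For anyone wanting to check their own setup, the IPoIB mode is exposed via sysfs. A sketch, assuming the interface is named ib0 (adjust to taste):

```shell
# Show whether ib0 runs in "datagram" or "connected" mode
cat /sys/class/net/ib0/mode

# Switch to connected mode and raise the MTU (root required; MTUs
# above 2044 only work in connected mode)
echo connected > /sys/class/net/ib0/mode
ip link set ib0 mtu 65520
```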

I'll be getting a decent test and staging cluster in a few weeks and will
run extensive tests covering all of this, namely:

- Untuned speeds (connected and datagram mode)
- Tuned (QDR) speeds
- Confirm that live changes are not possible

Christian

> For simplicity, I would recommend using ipv6 addressing on your IPoIB,
> as it maps much more sanely to IB GIDs / MGIDs.
> 
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/


