Re: Setting up a small experimental CEPH network

> we use heavily bonded interfaces (6x10G) and also needed to look at this balancing question. We use LACP bonding and, while the host OS probably tries to balance outgoing traffic over all NICs

> I tested something in the past [1] where I noticed that an osd
> saturated a bond link and did not use the available second one.

This is exactly what I wrote about, and it doesn’t have to be this way.

When using Linux bonding, be sure to set the xmit hash policy to layer3+4 and the mode on both sides to active/active.  

active/backup cuts your potential bandwidth, and a layer 1 or config problem on the backup link will stay latent until you need it most, e.g. when you do switch maintenance and assume that your bonds will all fail over for continuity.
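
For anyone setting this up from scratch, a minimal sketch with iproute2 looks something like the below (eno1/eno2 and bond0 are placeholder names, the switch ports need a matching 802.3ad/LACP LAG on their side, and you'd normally persist this through your distro's network config rather than ad-hoc commands):

  ip link add bond0 type bond mode 802.3ad miimon 100 xmit_hash_policy layer3+4
  ip link set eno1 down && ip link set eno1 master bond0
  ip link set eno2 down && ip link set eno2 master bond0
  ip link set bond0 up

  # sanity check: shows the mode, the hash policy, and per-slave link state
  cat /proc/net/bonding/bond0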


> However it has been written here multiple times that these osd processes
> are single threaded, so afaik they cannot use more than one link, and the
> moment your osd has a saturated link, your clients will notice this.

Threads have nothing to do with links: link selection happens per TCP connection, not per thread. Even if they did, real clusters have multiple OSDs per node, right?

> [1] https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg35474.html


Context, brother, context.

"1 osd per node cluster” 

“This is typical for a 'single line of communication' using lacp. Afaik 
the streams to the nodes are independent from each other anyway, so 
maybe it is possible to 'fork' the transmitting process, so linux can 
detect it as a separate stream and thus use the other link.”


This is NOT typical of production, WDL’s microserver experiment notwithstanding.


In real life, you’re going to have, what, at least 8 OSDs per node?  Each with streams to multiple clients (and other OSDs).  With dozens/hundreds/thousands of streams and a proper bonding (or equal-cost routing) setup, the *streams* are going to be hashed across available links by the bonding driver.
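
If it helps to see why, here is a rough Python sketch of layer3+4 placement using the simplified formula from the older kernel bonding documentation (the IPs, ports, and two-link bond below are made up, and current kernels use a different internal hash, but the behaviour is the same in spirit): each (src IP, src port, dst IP, dst port) tuple pins a flow to one slave, lots of flows spread roughly evenly across slaves, and any single flow never gets more than one link.

  # Toy model of xmit_hash_policy=layer3+4: one flow -> one slave,
  # many flows -> spread over all slaves.  Formula from the (older)
  # kernel bonding docs; modern kernels hash differently.
  import ipaddress
  import random

  def layer3_4_slave(src_ip, src_port, dst_ip, dst_port, n_slaves):
      s = int(ipaddress.ip_address(src_ip))
      d = int(ipaddress.ip_address(dst_ip))
      return ((src_port ^ dst_port) ^ ((s ^ d) & 0xffff)) % n_slaves

  random.seed(0)
  counts = [0, 0]                      # two slaves in the bond
  for _ in range(1000):                # one OSD talking to 1000 client/peer flows
      client_ip = "10.0.0.%d" % random.randint(1, 254)
      client_port = random.randint(32768, 60999)   # ephemeral source port
      counts[layer3_4_slave(client_ip, client_port, "10.0.0.10", 6800, 2)] += 1

  print(counts)   # roughly 500/500 -- flows land on both links
  # A single flow always hashes to the same slave, which is why one
  # stream can never exceed one NIC, but many streams fill the bond.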


> 
> 
> 
> -----Original Message-----
> From: Lindsay Mathieson [mailto:lindsay.mathieson@xxxxxxxxx]
> Sent: Monday, 21 September 2020 2:42
> To: ceph-users@xxxxxxx
> Subject:  Re: Setting up a small experimental CEPH network
> 
> On 21/09/2020 5:40 am, Stefan Kooman wrote:
>> My experience with bonding and Ceph is pretty good (OpenvSwitch). Ceph
>> uses lots of TCP connections, and those can get shifted (balanced)
>> between interfaces depending on load.
> 
> Same here - I'm running 4x1GbE (LACP, Balance-TCP) on a 5-node cluster
> with 19 OSDs. 20 active VMs and it idles at under 1 MiB/s, spikes up
> to 100 MiB/s no problem. When doing a heavy rebalance/repair, data rates
> on any one node can hit 400+ MiB/s.
> 
> 
> It scales out really well.
> 
> --
> Lindsay
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



