Re: Setting up a small experimental CEPH network

Hi all,

We use heavily bonded interfaces (6x10G) and also had to look at this balancing question. We use LACP bonding and, while the host OS probably tries to balance outgoing traffic over all NICs, the real decision is made by the switches (incoming traffic). Our switches hash packets to a port by (source?) MAC address, which means that it is not the number of TCP/IP connections that helps balancing, but only the number of MAC addresses. In an LACP bond all NICs share the same MAC address, so balancing happens per (physical) host. The more hosts, the better it works.
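
To make the difference concrete, here is a toy Python model of the two hashing styles (the MAC/IP addresses and the md5-based hash are made up for illustration; a real switch uses its own hash function). With layer-2 (MAC) hashing every flow from a given host lands on the same link, while layer-3+4 (IP:port) hashing can spread individual TCP connections:

# Toy model only -- addresses and hash function are invented for illustration.
import hashlib

N_LINKS = 6  # slaves in the bond (6x10G in our case)

def pick_link(key: str) -> int:
    """Map a flow key onto one physical link of the bond."""
    return hashlib.md5(key.encode()).digest()[0] % N_LINKS

# Layer-2 style: the key is the source MAC, so all traffic from one
# host always hashes to the same link.
print(pick_link("aa:bb:cc:dd:ee:01"))

# Layer-3+4 style: the key includes IP and port, so different TCP
# connections from the same host can land on different links.
for sport in (40001, 40002, 40003):
    print(pick_link("10.0.0.1:%d->10.0.0.2:6800" % sport))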

In a way, this both is and isn't a problem for us. We have about 550 physical clients (an HPC cluster) and 12 OSD hosts, which means that client traffic most likely puts a good load on every single NIC.

On the other hand, rebalancing between only 12 servers is unlikely to use all NICs effectively. So far we don't have enough disks per host to notice this, but it could become visible at some point. Basically, the host with the worst switch-side hashing of incoming traffic will become the bottleneck.
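
As a rough back-of-the-envelope sketch of why this happens (hypothetical MAC addresses, and again the hash is only a stand-in for the switch's algorithm): hashing just 12 peer hosts onto 6 links rarely gives an even 2-2-2-2-2-2 split, so some links end up carrying the rebalance traffic of several peers while others sit nearly idle:

# Hypothetical MACs; the hash is a stand-in for the switch's algorithm.
import hashlib
from collections import Counter

N_LINKS = 6
osd_macs = ["52:54:00:00:00:%02x" % i for i in range(12)]

def pick_link(mac: str) -> int:
    return hashlib.md5(mac.encode()).digest()[0] % N_LINKS

# Count how many OSD peers end up behind each of the 6 links.
print(Counter(pick_link(mac) for mac in osd_macs))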

On some switches the hashing method for LACP bonds can be configured, but not with much granularity. I have not seen an option to hash to a switch port by IP:port.

I have no experience with bonding mode 6 (balance-alb), which might provide per-connection balancing. I would be interested to hear how it performs.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx>
Sent: 21 September 2020 11:08:55
To: ceph-users; lindsay.mathieson
Subject:  Re: Setting up a small experimental CEPH network

I tested something in the past [1] where I noticed that an OSD
saturated one link of a bond and did not use the available second one. I
may have made a mistake in writing down that it was a 1x replicated pool.
However, it has been written here multiple times that these OSD processes
are single-threaded, so AFAIK they cannot use more than one link, and
the moment your OSD has a saturated link, your clients will notice it.


[1]
https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg35474.html



-----Original Message-----
From: Lindsay Mathieson [mailto:lindsay.mathieson@xxxxxxxxx]
Sent: Monday, 21 September 2020 2:42
To: ceph-users@xxxxxxx
Subject:  Re: Setting up a small experimental CEPH network

On 21/09/2020 5:40 am, Stefan Kooman wrote:
> My experience with bonding and Ceph is pretty good (OpenvSwitch). Ceph
> uses lots of TCP connections, and those can get shifted (balanced)
> between interfaces depending on load.

Same here - I'm running 4x1Gb (LACP, Balance-TCP) on a 5-node cluster
with 19 OSDs and 20 active VMs. It idles at under 1 MiB/s and spikes up
to 100 MiB/s with no problem. When doing a heavy rebalance/repair, data
rates on any one node can hit 400+ MiB/s.


It scales out really well.

--
Lindsay
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




