Sorry, I didn't see that you use Proxmox 5. As I'm a Proxmox contributor, I can tell you that I have errors with kernel 4.10 (which is the Ubuntu kernel).

If you don't use ZFS, try kernel 4.12 from stretch-backports, or kernel 4.4 from Proxmox 4 (with ZFS support). Tell me if it works better for you.

(I'm currently trying to backport the latest mlx5 patches from kernel 4.12 to kernel 4.10, to see if that helps.)

I have opened a thread on the pve-devel mailing list today.

----- Original Message -----
From: "Alexandre Derumier" <aderumier@xxxxxxxxx>
To: "Burkhard Linke" <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Friday, 8 September 2017 17:27:49
Subject: Re: output discards (queue drops) on switchport

Hi,

>> public network Mellanox ConnectX-4 Lx dual-port 25 GBit/s

Which kernel/distro do you use?

I have the same card, and I recently had packet drop problems with the CentOS 7 kernel 3.10.

I also have problems with the Ubuntu kernel 4.10 and LACP.

Kernels 4.4 and 4.12 are working fine for me.

----- Original Message -----
From: "Burkhard Linke" <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Friday, 8 September 2017 16:25:31
Subject: Re: output discards (queue drops) on switchport

Hi,

On 09/08/2017 04:13 PM, Andreas Herrmann wrote:
> Hi,
>
> On 08.09.2017 15:59, Burkhard Linke wrote:
>> On 09/08/2017 02:12 PM, Marc Roos wrote:
>>>
>>> Afaik ceph is not supporting/working with bonding.
>>>
>>> https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg35474.html
>>> (thread: Maybe some tuning for bonded network adapters)
>> Ceph works well with LACP bonds. The problem described in that thread is
>> that LACP does not use the links in a round-robin fashion, but distributes
>> network streams based on a hash of certain parameters such as source and
>> destination IP address. This is already set to the layer3+4 policy by the OP.
>>
>> Regarding the drops (and without any experience with either 25 GBit ethernet
>> or the Arista switches):
>> Do you have corresponding input drops on the server's network ports?
> No input drops, just output drops

Output drops on the switch are related to input drops on the server side. If the link uses flow control and the server signals the switch that its internal buffers are full, the switch has to drop further packets once its own port buffer fills up. If there's no flow control and the network card is not able to store the packet (full buffers...), it should show up as an overrun in the interface statistics (and if this is not correct, please correct me, I'm not a network guy...).

>
>> Did you tune the network settings on the server side for high throughput, e.g.
>> net.ipv4.tcp_rmem, wmem, ...?
> sysctl tuning is disabled at the moment. I tried sysctl examples from
> https://fatmin.com/2015/08/19/ceph-tcp-performance-tuning/. But there is still
> the same amount of output drops.
>
>> And are the CPUs fast enough to handle the network traffic?
> Xeon(R) CPU E5-1660 v4 @ 3.20GHz should be fast enough. But I'm unsure. It's
> my first Ceph cluster.

The CPU has 6 cores, and you are driving 2x 10 GBit, 2x 25 GBit, the RAID controller and 8 SSD-based OSDs with it. You can use tools like atop or ntop to watch certain aspects of the system during the tests (network, CPU, disk).
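If you don't have atop at hand, here is a rough, stdlib-only Python 3 sketch (just a quick hack of mine, assuming Linux and its /proc/stat layout) that samples per-core utilisation for a few seconds. Run it while a benchmark is going; a single core pegged near 100% (often in softirq for the NIC) can explain drops even when the average load looks fine.

# Rough per-core utilisation sampler based on /proc/stat deltas (Linux only).
import time

def snapshot():
    cores = {}
    with open("/proc/stat") as f:
        for line in f:
            # per-core lines look like "cpu0 ...", the aggregate line is "cpu ..."
            if line.startswith("cpu") and line[3].isdigit():
                parts = line.split()
                vals = [int(x) for x in parts[1:]]
                idle = vals[3] + vals[4]          # idle + iowait
                cores[parts[0]] = (sum(vals), idle)
    return cores

before = snapshot()
time.sleep(5)                                     # sample while the test runs
after = snapshot()
for cpu in sorted(before, key=lambda c: int(c[3:])):
    total = after[cpu][0] - before[cpu][0]
    idle = after[cpu][1] - before[cpu][1]
    busy = 100.0 * (total - idle) / total if total else 0.0
    print("%s: %5.1f%% busy" % (cpu, busy))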
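To see whether the server side records any drops or overruns at all, you can also read the raw counters from /proc/net/dev (again Linux-specific and only a sketch; ethtool -S <iface> gives you the NIC driver's own, more detailed counters):

# Print RX/TX error, drop and FIFO/overrun counters for every interface,
# taken from /proc/net/dev. Compare these with the switch's output discards.
def read_net_counters(path="/proc/net/dev"):
    stats = {}
    with open(path) as f:
        for line in f.readlines()[2:]:            # skip the two header lines
            name, data = line.split(":", 1)
            fields = [int(x) for x in data.split()]
            stats[name.strip()] = {
                "rx_errs": fields[2], "rx_drop": fields[3], "rx_fifo": fields[4],
                "tx_errs": fields[10], "tx_drop": fields[11], "tx_fifo": fields[12],
            }
    return stats

for iface, counters in read_net_counters().items():
    print(iface, counters)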
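And regarding the tcp_rmem/wmem tuning: before and after applying anything from that blog post, it is worth noting down the values that are actually in effect. They can be read straight from /proc/sys, for example:

# Show the TCP receive/send buffer autotuning limits currently in effect
# (min / default / max, in bytes). Equivalent to "sysctl net.ipv4.tcp_rmem".
for key in ("tcp_rmem", "tcp_wmem"):
    with open("/proc/sys/net/ipv4/%s" % key) as f:
        print("net.ipv4.%s = %s" % (key, f.read().strip()))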
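Coming back to the LACP point from my earlier mail quoted above: the reason a bond never gives a single stream more than one link is that the layer3+4 policy hashes source/destination IP and port and pins each flow to one slave. The little Python sketch below only illustrates the principle, it is not the exact hash the Linux bonding driver uses, and the interface names, addresses and ports are made up:

# Illustration only: a layer3+4 style policy maps each flow to one bond slave
# via hash(src ip, dst ip, src port, dst port) modulo the number of slaves.
import hashlib

SLAVES = ["enp5s0f0", "enp5s0f1"]   # hypothetical 2x 25 GBit bond members

def pick_slave(src_ip, dst_ip, src_port, dst_port):
    key = ("%s-%s-%d-%d" % (src_ip, dst_ip, src_port, dst_port)).encode()
    digest = int.from_bytes(hashlib.sha1(key).digest()[:4], "big")
    return SLAVES[digest % len(SLAVES)]

# The same OSD-to-OSD connection always lands on the same slave ...
print(pick_slave("10.0.0.1", "10.0.0.2", 45123, 6800))
# ... only a different connection (e.g. another source port) may use the other link.
print(pick_slave("10.0.0.1", "10.0.0.2", 45124, 6800))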
Regards,
Burkhard

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com