Re: output discards (queue drops) on switchport

Hi,

>> public network Mellanox ConnectX-4 Lx dual-port 25 GBit/s

Which kernel/distro do you use?

I have the same card, and I recently had problems with packet drops on the CentOS 7 3.10 kernel.

I also have problems with the Ubuntu 4.10 kernel and LACP.

Kernels 4.4 and 4.12 are working fine for me.
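
If it helps to narrow it down: a quick way to see whether the drops happen on the NIC itself is to look at the mlx5 driver/firmware and the per-queue counters, and to double-check the bond's hash policy. Just a sketch, with ens1f0 and bond0 as placeholder names for the 25 GBit interface and the bond:

  # driver and firmware of the ConnectX-4 Lx (mlx5_core)
  ethtool -i ens1f0

  # per-queue drop/discard counters (names vary between driver versions)
  ethtool -S ens1f0 | grep -Ei 'drop|discard|out_of_buffer'

  # kernel-level errors, drops and overruns for the interface
  ip -s -s link show ens1f0

  # with LACP, confirm the transmit hash policy actually applied to the bond
  grep 'Transmit Hash Policy' /proc/net/bonding/bond0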





----- Original Message -----
From: "Burkhard Linke" <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
To: "ceph-users" <ceph-users@xxxxxxxxxxxxxx>
Sent: Friday, September 8, 2017 16:25:31
Subject: Re: output discards (queue drops) on switchport

Hi, 


On 09/08/2017 04:13 PM, Andreas Herrmann wrote: 
> Hi, 
> 
> On 08.09.2017 15:59, Burkhard Linke wrote: 
>> On 09/08/2017 02:12 PM, Marc Roos wrote: 
>>> 
>>> Afaik ceph is not supporting/working with bonding. 
>>> 
>>> https://www.mail-archive.com/ceph-users@xxxxxxxxxxxxxx/msg35474.html 
>>> (thread: Maybe some tuning for bonded network adapters) 
>> Ceph works well with LACP bonds. The problem described in that thread is 
>> that LACP does not use the links in a round-robin fashion, but distributes 
>> each network stream to one link based on a hash of certain parameters, such 
>> as source and destination IP address. The OP has already set this to the 
>> layer3+4 policy. 
>> 
>> Regarding the drops (and without any experience with either 25 GBit Ethernet 
>> or the Arista switches): 
>> Do you have corresponding input drops on the server's network ports? 
> No input drops, just output drops 
Output drops on the switch are related to input drops on the server 
side. If the link uses flow control and the server signals the switch 
that its internal buffers are full, the switch has to drop further 
packets once its own port buffer fills up as well. If there is no flow 
control and the network card is not able to store a packet (full 
buffers...), it should show up as an overrun in the interface statistics 
(and if this is not correct, please correct me, I'm not a network guy...). 
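
For what it's worth, both points are easy to check from the server side; a 
small sketch, with ens1f0 again being a placeholder interface name: 

  # is flow control (pause frames) negotiated on the link?
  ethtool -a ens1f0

  # overruns and drops show up in the detailed interface statistics
  ip -s -s link show ens1f0

  # pause frame counters, if the driver exposes them
  ethtool -S ens1f0 | grep -i pause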

> 
>> Did you tune the network settings on the server side for high throughput, 
>> e.g. net.ipv4.tcp_rmem, wmem, ...? 
> sysctl tuning is disabled at the moment. I tried the sysctl examples from 
> https://fatmin.com/2015/08/19/ceph-tcp-performance-tuning/, but there is still 
> the same number of output drops. 
> 
>> And are the CPUs fast enough to handle the network traffic? 
> A Xeon(R) CPU E5-1660 v4 @ 3.20GHz should be fast enough, but I'm unsure. It's 
> my first Ceph cluster. 
The CPU has 6 cores, and you are driving 2x 10 GBit, 2x 25 GBit, the RAID 
controller and 8 SSD-based OSDs with it. You can use tools like atop or 
ntop to watch certain aspects of the system (network, CPU, disk) during 
the tests. 
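
For example, to see whether a single core is saturated by interrupt/softirq 
handling while traffic is flowing (mpstat is part of the sysstat package, and 
the grep pattern assumes the mlx5 driver): 

  # per-core utilisation; watch the %irq and %soft columns
  mpstat -P ALL 1

  # which cores the Mellanox queues raise their interrupts on
  grep mlx5 /proc/interrupts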

Regards, 
Burkhard 
_______________________________________________ 
ceph-users mailing list 
ceph-users@xxxxxxxxxxxxxx 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
