Re: [SOLVED] output discards (queue drops) on switchport

Hi,

flow control was active on the NIC but not on the switch.
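
On the Linux side, the NIC's pause settings can be read with `ethtool -a <iface>`. As a rough sketch only (the sample text and the interface name enp3s0f0 are made up for illustration, not taken from these servers), the output could be checked like this:

```python
import re

# Illustrative sample of `ethtool -a <iface>` output.
# The interface name is hypothetical, not from the original post.
SAMPLE = """\
Pause parameters for enp3s0f0:
Autonegotiate:\toff
RX:\t\ton
TX:\t\ton
"""

def parse_pause(text):
    """Map the Autonegotiate/RX/TX lines of `ethtool -a` text to booleans."""
    result = {}
    for line in text.splitlines():
        m = re.match(r"^(Autonegotiate|RX|TX):\s+(on|off)\s*$", line.strip())
        if m:
            result[m.group(1).lower()] = (m.group(2) == "on")
    return result

pause = parse_pause(SAMPLE)
print(pause)
```

If RX or TX is on at the host while the switch port has flow control off, the two ends disagree, which is the situation described here.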

Enabling flowcontrol in both directions on the switch ports solved the problem:
	flowcontrol receive on
	flowcontrol send on

Port        Send FlowControl  Receive FlowControl  RxPause       TxPause
            admin    oper     admin    oper
----------  -------- -------- -------- --------    ------------- -------------
Et17/1      on       on       on       on          0             64500
Et17/2      on       on       on       on          0             33746
Et17/3      on       on       on       on          0             17126
Et18/1      on       on       on       on          0             36948
Et18/2      on       on       on       on          0             39628
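
For anyone who wants to keep an eye on these counters, here is a minimal sketch that parses the table above and sums the pause frames. It assumes the seven-column layout shown (port, send admin/oper, receive admin/oper, RxPause, TxPause); the function name is made up for illustration, and how the text gets fetched from the switch is left out.

```python
# Rows copied from the flow control table above.
TABLE = """\
Et17/1      on       on       on       on          0             64500
Et17/2      on       on       on       on          0             33746
Et17/3      on       on       on       on          0             17126
Et18/1      on       on       on       on          0             36948
Et18/2      on       on       on       on          0             39628
"""

def parse_flowcontrol(text):
    """Parse data rows of the table; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # A data row has 7 fields and the last two are numeric counters.
        if len(parts) != 7 or not (parts[5].isdigit() and parts[6].isdigit()):
            continue
        port, send_admin, send_oper, recv_admin, recv_oper, rx, tx = parts
        rows.append({
            "port": port,
            "send_oper": send_oper == "on",
            "recv_oper": recv_oper == "on",
            "rx_pause": int(rx),
            "tx_pause": int(tx),
        })
    return rows

rows = parse_flowcontrol(TABLE)
total_tx = sum(r["tx_pause"] for r in rows)
all_on = all(r["send_oper"] and r["recv_oper"] for r in rows)
print(f"ports={len(rows)} flowcontrol_on_everywhere={all_on} total TxPause={total_tx}")
```

RxPause is 0 on every port while TxPause is non-zero, so on these ports the pause frames were sent by the switch and none were received from the hosts.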

Regards,
Andreas


On 08.09.2017 13:57, Andreas Herrmann wrote:
> Hello,
> 
> I have a fresh Proxmox installation on 5 servers (Supermicro X10SRW-F, Xeon
> E5-1660 v4, 128 GB RAM), each with 8 Samsung SM863 960 GB SSDs connected to an
> LSI-9300-8i (SAS3008) controller and used as OSDs for Ceph (12.1.2).
> 
> The servers are connected to two Arista DCS-7060CX-32S switches. I'm using
> an MLAG bond (bond mode LACP, xmit_hash_policy layer3+4, MTU 9000):
>  * backend network for Ceph: cluster network & public network
>    Mellanox ConnectX-4 Lx dual-port 25 GBit/s
>  * frontend network: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ dual-port
> 
> Ceph is quite a default installation with size=3.
> 
> My problem:
> I'm issuing a dd (dd if=/dev/urandom of=urandom.0 bs=10M count=1024) in a test
> virtual machine (the only one running in the cluster) at around 210 MB/s. I
> get output drops on all switchports. The drop rate is between 0.1 and 0.9 %.
> The 0.9 % drop rate is reached when writing at about 1300 MB/s into Ceph.
> 
> At first I suspected a problem with the Mellanox cards and used the Intel
> cards for Ceph traffic instead. The problem exists there as well.
> 
> I tried quite a lot and nothing helped:
>  * changed the MTU from 9000 to 1500
>  * changed bond_xmit_hash_policy from layer3+4 to layer2+3
>  * deactivated the bond and just used a single link
>  * disabled offloading
>  * disabled power management in BIOS
>  * perf-bias 0
> 
> I analyzed the traffic via tcpdump and got some of those "errors":
>  * TCP Previous segment not captured
>  * TCP Out-of-Order
>  * TCP Retransmission
>  * TCP Fast Retransmission
>  * TCP Dup ACK
>  * TCP ACKed unseen segment
>  * TCP Window Update
> 
> Is that behaviour normal for Ceph, or does anyone have an idea how to solve
> the problem with the output drops on the switch side?
> 
> With iperf I can reach full 50 GBit/s on the bond with zero output drops.
> 
> Andreas
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


