Re: Write throughput drops to zero

Hello!

On Fri, Oct 30, 2015 at 09:30:40PM +0000, moloney wrote:

> Hi,

> I recently got my first Ceph cluster up and running and have been doing some stress tests. I quickly found that during sequential write benchmarks the throughput would often drop to zero. Initially I saw this inside QEMU virtual machines, but I can also reproduce the issue with "rados bench" within 5-10 minutes of sustained writes.  If left alone the writes will eventually start going again, but it takes quite a while (at least a couple minutes). If I stop and restart the benchmark the write throughput will immediately be where it is supposed to be.
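For anyone wanting to reproduce this, a sustained sequential-write benchmark along these lines should trigger it (the pool name, duration, and parameters here are only examples, not the ones from the original report):

```shell
# Create a throwaway pool and run a long sequential-write benchmark:
# 600 seconds, 4 MB objects, 16 concurrent ops, keep objects for later read tests.
ceph osd pool create bench 128
rados bench -p bench 600 write -b 4194304 -t 16 --no-cleanup
```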

> I have convinced myself it is not a network hardware issue.  I can load up the network with a bunch of parallel iperf benchmarks and it keeps chugging along happily. When the issue occurs with Ceph I don't see any indications of network issues (e.g. dropped packets).  Adding additional network load during the rados bench (using iperf) doesn't seem to trigger the issue any faster or more often.
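Loading the links with parallel streams can be done roughly like this (the hostname is a placeholder):

```shell
# On one host, run an iperf server:
iperf -s
# On the clients, run several parallel streams for five minutes:
iperf -c osd-server-1 -P 8 -t 300
```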

> I have also convinced myself it isn't an issue with a journal getting full or an OSD being too busy.  The amount of data being written before the problem occurs is much larger than the total journal capacity. Watching the load on the OSD servers with top/iostat I don't see anything being overloaded; rather, I see the load everywhere drop to essentially zero when the writes stall. Before the writes stall the load is well distributed with no visible hot spots. The OSDs and hosts that report slow requests are random, so I don't think it is a failing disk or server.  I don't see anything interesting going on in the logs so far (I am just about to do some tests with Ceph's debug logging cranked up).
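While a stall is in progress, it is worth checking which OSDs are flagged and what their slowest recent operations were (osd.0 below is just an example ID; run the daemon command on the host where that OSD lives):

```shell
# Which OSDs are currently reporting slow requests:
ceph health detail | grep -i slow
# Dump the slowest recent ops recorded by a suspect OSD:
ceph daemon osd.0 dump_historic_ops
```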

> The cluster specs are:

> OS: Ubuntu 14.04 with 3.16 kernel
> Ceph: 9.1.0
> OSD Filesystem: XFS
> Replication: 3X
> Two racks with IPoIB network
> 10Gbps Ethernet between racks
> 8 OSD servers with:
>   * Dual Xeon E5-2630L (12 cores @ 2.4GHz)
>   * 128GB RAM
>   * 12 6TB Seagate drives (connected to LSI 2208 chip in JBOD mode)
>   * Two 400GB Intel P3600 NVMe drives (OS on a RAID1 partition, six OSD journal partitions on each)
>   * Mellanox ConnectX-3 NIC (for both Infiniband and 10Gbps Ethernet)
> 3 MONs co-located on OSD servers

> Any advice is greatly appreciated. I am planning to try this with Hammer too.

I had the same trouble with Hammer, Ubuntu 14.04 and the 3.19 kernel on Supermicro
X9DRL-3F/iF boards with Intel 82599ES NICs, bonded into one link going to two
different Cisco Nexus 5020 switches. It was finally fixed by dropping the MTU
from jumbo frames down to 1500. It worked for a while with an MTU of 9000 and
the following sysctls, but after several weeks the trouble repeated and I had
to drop the MTU down again:
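Checking and lowering the MTU is a one-liner per interface (eth0 is a placeholder; on IPoIB the interface is usually ib0):

```shell
ip link show dev eth0           # the current MTU is printed on the first line
ip link set dev eth0 mtu 1500   # lower it; make it persistent in the interface config
```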

net.ipv4.tcp_rmem = 1024000 8738000 1677721600
net.ipv4.tcp_wmem = 1024000 8738000 1677721600
net.ipv4.tcp_mem = 1024000 8738000 1677721600
net.core.netdev_max_backlog = 250000
net.ipv4.tcp_max_syn_backlog = 150000
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_max_tw_buckets = 2000000
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_low_latency = 1
vm.swappiness = 1
net.ipv4.tcp_moderate_rcvbuf = 0
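To apply these persistently, one way is to put the settings in a file under /etc/sysctl.d/ (the filename below is arbitrary) and reload:

```shell
# Settings saved in /etc/sysctl.d/99-ceph-net.conf, then:
sysctl --system                    # reload all sysctl configuration files
sysctl net.ipv4.tcp_mtu_probing    # spot-check a single value
```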


> Thanks,
> Brendan

> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


-- 
WBR, Max A. Krasilnikov
ColoCall Data Center
