Re: Deadly slow Ceph cluster revisited


 



Glad we were able to point you in the right direction! I would suspect a borderline cable at this point. Did you happen to notice whether the interface had negotiated down to some dumb speed? I've seen cases where a dodgy cable causes an intermittent problem that makes the interface negotiate its speed downward, and it then never tries to come back up until the interface is restarted.
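
For anyone wanting to check for a downshifted link without eyeballing ethtool output, here is a minimal sketch that reads the negotiated speed from sysfs on Linux. The interface name and the expected 10000 Mb/s are assumptions for illustration, not details from this thread, and the helper names are hypothetical.

```python
from pathlib import Path


def negotiated_speed_mbps(iface, sysfs_root="/sys/class/net"):
    """Return the negotiated link speed in Mb/s, or None if it is unavailable.

    Linux exposes the speed in <sysfs_root>/<iface>/speed. Reading it can
    fail (for example when the link is down), and some drivers report -1
    when the speed is unknown, so both cases map to None.
    """
    try:
        raw = (Path(sysfs_root) / iface / "speed").read_text().strip()
        speed = int(raw)
    except (OSError, ValueError):
        return None
    return speed if speed > 0 else None


def is_downshifted(iface, expected_mbps=10000, sysfs_root="/sys/class/net"):
    """True if the link negotiated a known speed below expected_mbps."""
    speed = negotiated_speed_mbps(iface, sysfs_root)
    return speed is not None and speed < expected_mbps
```

Calling is_downshifted("eth0") on each bond member would flag a port that has quietly dropped to 1000 or 100. An unknown speed is deliberately not reported as a downshift, so a down link needs a separate check.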

fwiw, I'm using the same chipset (not onboard) and driver, and they have been rock solid so far. I would be skeptical of a driver bug as well.

QH

On Fri, Jul 17, 2015 at 2:34 PM, J David <j.david.lists@xxxxxxxxx> wrote:
On Fri, Jul 17, 2015 at 12:19 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
> Maybe try some iperf tests between the different OSD nodes in your
> cluster and also the client to the OSDs.

This proved to be an excellent suggestion.  One of these is not like the others:

f16 inbound: 6Gbps
f16 outbound: 6Gbps
f17 inbound: 6Gbps
f17 outbound: 6Gbps
f18 inbound: 6Gbps
f18 outbound: 1.2Mbps
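
With more than a handful of nodes, spotting the odd one out by eye gets tedious. A rough sketch of automating it, assuming results are recorded one per line in the "host direction: rate" form used above; the function names and the 50% threshold are arbitrary choices for illustration.

```python
import re
from statistics import median

# Multipliers to convert a rate with a unit suffix to bits per second.
UNITS = {"Kbps": 1e3, "Mbps": 1e6, "Gbps": 1e9}

LINE = re.compile(r"^(\S+) (inbound|outbound): ([\d.]+)([KMG]bps)$")


def parse_results(text):
    """Map (host, direction) to throughput in bits per second."""
    results = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            host, direction, value, unit = match.groups()
            results[(host, direction)] = float(value) * UNITS[unit]
    return results


def find_outliers(results, fraction=0.5):
    """Return the keys whose rate is below `fraction` of the median rate."""
    typical = median(results.values())
    return sorted(key for key, rate in results.items() if rate < typical * fraction)
```

Fed the six lines above, find_outliers returns only ("f18", "outbound"). Comparing against the median rather than the mean keeps one very slow path from dragging the baseline down with it.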

There is flatly no explanation for the outbound performance on f18.
There are no errors in ifconfig/netstat, nothing logged on the switch,
etc.  Even with tcpdump running during iperf, there aren't retransmits
or anything.  It's just slow.

ifconfig'ing the primary bond interface down immediately resolved the
problem.  The iostat running in the virtual machine immediately surged
to 500+ IOPS and 40M-60M/sec.

Weirdly, ifconfig'ing the primary device back up did not bring the
problem back.  It switched back to that interface, but everything is
still fine (and iperf gives 6Gbps) at the moment.  There's no way of
telling if that will last, but it's a solid lead either way.
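
To confirm which member a bond is actually using after a down/up like this, the Linux bonding driver's status file can be parsed. This is only a sketch: it assumes an active-backup style bond, and the bond name "bond0" in the usage note is hypothetical, since the thread does not name the interfaces.

```python
def bond_summary(text):
    """Extract the active and primary members from bonding status text.

    `text` is the content of /proc/net/bonding/<bond>. Only the
    "Currently Active Slave" and "Primary Slave" lines are read; a bond
    mode that does not print them leaves the values as None.
    """
    summary = {"active": None, "primary": None}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            summary["active"] = value or None
        elif key == "Primary Slave":
            # The line may carry a trailing note such as
            # "(primary_reselect always)"; keep only the device name.
            summary["primary"] = value.split()[0] if value else None
    return summary
```

Running bond_summary(open("/proc/net/bonding/bond0").read()) before and after bouncing the primary shows whether traffic really moved back to it.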

It's an Intel onboard dual-port X540 using the ixgbe driver.  If it
were a driver problem, we've got tons of these so I'd expect to see
this problem elsewhere.  If it's a hardware problem, ifconfig down/up
doesn't seem like it would "fix" it.  Very mysterious!

Thanks!
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

