Re: managing slow drives in cluster

Vijay Bellur <vbellur@xxxxxxxxxx> · Tue, 2 Aug 2016 11:13:11 -0400

On 08/01/2016 01:29 AM, Mohammed Rafi K C wrote:

On 07/30/2016 10:53 PM, Jay Berkenbilt wrote:
We're using glusterfs in Amazon EC2 and observing certain behavior
involving EBS volumes. The basic situation is that, in some cases,
clients can write data to the file system at a rate such that the
gluster daemon on one or more of the nodes may block in disk wait for
longer than 42 seconds, causing gluster to decide that the brick is
down. In fact, it's not down, it's just slow. I believe it is possible,
by looking at certain system data on the machine that hosts the drive,
to tell the difference between a drive that is down and one that is
still working through its queue.

We are attempting a two-pronged approach to solving this problem:

1. We would like to figure out how to tune the system, by adjusting
kernel parameters, glusterd, or both, to try to avoid getting the system
into a state where it has so much data to flush out to disk that it
blocks in disk wait for such a long time.
2. We would like to see if we can make gluster more intelligent about
responding to the pings so that the client side is still getting a
response when the remote side is just behind and not down. Though I do
understand that, in some high performance environments, one may want to
consider a disk that's not keeping up to have failed, so this may have
to be a tunable parameter.
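
As a rough illustration of measuring the flush backlog described in
point 1, the sketch below sums the Dirty and Writeback counters that the
Linux kernel reports in /proc/meminfo. The function names and the idea
of using this sum as the backlog metric are assumptions made for the
example; the message does not say which counters or kernel parameters
the team is looking at.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; a few (such as HugePages_Total)
    are plain counts and carry no unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def flush_backlog_kb(meminfo):
    """Data waiting to reach disk: dirty pages plus pages under writeback (kB)."""
    return meminfo.get("Dirty", 0) + meminfo.get("Writeback", 0)


def read_backlog_kb(path="/proc/meminfo"):
    """Convenience wrapper that reads the live counters (Linux only)."""
    with open(path) as f:
        return flush_backlog_kb(parse_meminfo(f.read()))
```

Sampling this value over time while changing a kernel setting is one
possible way to see whether the setting keeps the backlog smaller.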

We have a small team that has been working on this problem for a couple
of weeks. I just joined the team on Friday. I am new to gluster, but I
am not at all new to low-level system programming, Linux administration,
etc. I'm very much open to the possibility of digging into the gluster
code and supplying patches

Welcome to Gluster. It is great to see a lot of ideas within days :).

 if we can find a way to adjust the behavior
of gluster to make it behave better under these conditions.

So, here are my questions:

* Does anyone have experience with this type of issue who can offer any
suggestions on kernel parameters or gluster configurations we could play
with? We have several kernel parameters in mind and are starting to
measure their effect.
* Does anyone have any background on how we might be able to tell that
the system is getting itself into this state? Again, we have some ideas
on this already, mostly by using sysstat to monitor stuff, though
ultimately if we find a reliable way to do it, we'd probably code it
directly by looking at the relevant stuff in /proc from our own code. I
don't have the details with me right now.
* Can someone provide any pointers to where in the gluster code the ping
logic is handled and/or how one might go about making it a little smarter?
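
One possible way to approach the second question is to compare two
samples of /proc/diskstats taken a few seconds apart. The sketch below
is only an illustration: the function names, the three labels, and the
rule itself (completions advancing means the drive is working through
its queue) are assumptions, not something taken from gluster or from
the team's own tooling.

```python
def parse_diskstats(text):
    """Return {device: (ios_completed, ios_in_flight)} from /proc/diskstats text.

    Columns used (0-based): 2 = device name, 3 = reads completed,
    7 = writes completed, 11 = I/Os currently in progress.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        name = fields[2]
        completed = int(fields[3]) + int(fields[7])
        in_flight = int(fields[11])
        stats[name] = (completed, in_flight)
    return stats


def classify(before, after):
    """Compare two samples for one device, taken some seconds apart.

    "progressing": I/Os completed between the samples (slow, but alive)
    "stalled":     requests are queued but none completed in the interval
    "idle":        nothing queued and nothing completed
    """
    completed_before, _ = before
    completed_after, in_flight = after
    if completed_after > completed_before:
        return "progressing"
    if in_flight > 0:
        return "stalled"
    return "idle"
```

A "stalled" result over a single short interval does not prove that the
drive is down; the sampling interval and how many consecutive stalled
samples to tolerate would have to be chosen by measurement.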

One of our users had a similar problem, where ping packets were queued
on a waiting list because of heavy traffic. I have a patch that tries to
solve the issue: http://review.gluster.org/#/c/11935/ . It is still under
review and might need some more work, but I guess it is worth trying.

Would it be possible to rebase this patch against the latest master? I 
am interested to see if we still see the pre-commit regression failures.

Thanks!
Vijay

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users