Re: Problems with ec/nfs.t in regression tests

On 02/12/2015 01:27 PM, Xavier Hernandez wrote:
On 12.02.2015 19:09, Pranith Kumar Karampuri wrote:

On 02/12/2015 11:34 PM, Pranith Kumar Karampuri wrote:
On 02/12/2015 08:15 PM, Xavier Hernandez wrote:
I've done some more investigation and the problem seems worse. NFS
sends a huge number of requests without waiting for answers (I've
seen more than 1400 requests in flight). Many factors probably
influence the load this causes, and one of them could be ec, but the
problem is not exclusive to ec: I've repeated the test on a replica 3
and a replica 2 volume and it still happens. The test basically
writes a 1GB file to an NFS mount using 'dd'. With a smaller file,
the test passes successfully.
Using an NFS client and the Gluster NFS server on the same machine
for big-file dd operations is known to cause hangs. anon-fd-quota.t
used to hit similar problems, so we changed that test to not involve
NFS mounts. I don't recollect the exact scenario. Avati found the
memory-allocation deadlock back in 2010, when I had just joined
Gluster, and Raghavendra Bhat raised the bug then. I've CCed him on
the thread in case he knows the exact scenario.

I've been doing some tests with Shyam and the root cause seems to be the edge-triggered epoll introduced by the multi-threaded epoll patch. It has a side effect that makes the outstanding-rpc-limit option nearly useless: gluster gets flooded with requests, causing timeouts and disconnections on slow/busy machines.

Elaborating on this: the MT epoll patch makes the epoll edge-triggered (ET), so on a poll-in event we attempt to read as much as we can. If the client is able to supply 'n' RPCs before our read returns EAGAIN or EWOULDBLOCK, we will read all of them and never honor the server-side throttle.
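
To make that concrete, here is a minimal sketch (not the actual glusterfs transport code; enqueue_rpc and RPC_BUF_SIZE are hypothetical placeholders) of the obligation an edge-triggered handler is under. ET readiness fires only once per edge, so the handler must drain the socket until EAGAIN, and every RPC the client managed to queue gets read in one pass:

/* Hypothetical ET read handler: drains the fd completely, because a
 * second poll-in event will not arrive for data that is already in
 * the kernel buffer. Note there is no outstanding-rpc-limit check
 * anywhere on this path. */
#include <errno.h>
#include <unistd.h>

#define RPC_BUF_SIZE 4096

extern void enqueue_rpc(const char *buf, ssize_t len); /* placeholder */

static void on_pollin_edge_triggered(int fd)
{
    char buf[RPC_BUF_SIZE];
    ssize_t n;

    for (;;) {
        n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            enqueue_rpc(buf, n); /* accept the RPC, however many are queued */
            continue;
        }
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break; /* socket drained; wait for the next edge */
        break;     /* EOF or a real error; handled elsewhere */
    }
}

(The real transport reads RPC-record-framed messages rather than fixed-size buffers, but the drain-until-EAGAIN obligation is the same.)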

In the previous case, we read RPC by RPC and the epoll was not ET, so when we reached the throttle limit we stopped reading from the socket. The network pipes would then fill up, the client would be unable to write more RPCs, and the number of outstanding (ongoing) RPCs stayed bounded.
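
Continuing the sketch with the same hypothetical names, the old level-triggered behaviour looks roughly like this; the limit value is illustrative only:

/* Hypothetical LT read handler: one RPC per wakeup, and reads simply
 * stop once the outstanding count reaches the limit. Unread bytes
 * pile up in the kernel socket buffer, TCP flow control fills the
 * pipe, and the client eventually blocks in write(). That is the
 * back-pressure outstanding-rpc-limit relied on. */
static int outstanding_rpcs;                 /* decremented when a reply is sent */
static const int outstanding_rpc_limit = 64; /* illustrative value */

static void on_pollin_level_triggered(int fd)
{
    char buf[RPC_BUF_SIZE];
    ssize_t n;

    if (outstanding_rpcs >= outstanding_rpc_limit) {
        /* A real implementation would also drop POLLIN interest here
         * (restoring it as replies drain) so level-triggered epoll
         * does not busy-loop on the still-readable fd. */
        return;
    }

    n = read(fd, buf, sizeof(buf)); /* roughly one RPC per event */
    if (n > 0) {
        outstanding_rpcs++;
        enqueue_rpc(buf, n);
    }
}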

With edge-triggered epoll we are breaking this back-pressure.

Shyam
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel



