EC volume: Bug caused by race condition during rmdir and inodelk

Ashish Pandey <aspandey@xxxxxxxxxx> · Fri, 25 Nov 2016 02:38:11 -0500 (EST)

Hi All,

On EC volumes, we have been seeing an interesting bug caused by a subtle race between rmdir and inodelk, which leads to an EIO error.
Pranith, Xavi and I discussed this and have some possible solutions. We would like your input on the bug and on the proposed solutions.

1 - Consider rmdir on /a/b and chown on /a/b from 2 different clients/processes. rmdir /a/b takes a lock on "a" and deletes "b".
However, chown /a/b takes a lock on "b" to do the setattr fop. Now, on a (4+2) EC volume, the inodelk might get ENOENT from 3 bricks (if rmdir /a/b has already succeeded on those 3 bricks) and
might get the lock on the remaining 3 bricks.

Since an operation must succeed on at least 4 bricks, EC returns EIO for the chown.

This can be solved on the EC side while processing callbacks: based on the errors returned by the bricks, we can decide which error should be passed on. In the above case, sending
ENOENT could be safer.
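
To make the idea concrete, here is a rough sketch in Python (not the actual EC xlator code) of how the per-brick inodelk answers could be combined. The function name, the min_bricks parameter and the exact rule for preferring ENOENT are assumptions made for illustration only; the real policy is still open for discussion.

```python
import errno


def combine_inodelk_results(results, min_bricks=4):
    """Pick the error to pass on for one inodelk (hypothetical helper).

    results    -- one entry per brick: 0 if the lock was granted,
                  otherwise the errno that brick returned.
    min_bricks -- bricks that must agree for the fop to succeed
                  (4 on a 4+2 volume).
    """
    granted = sum(1 for r in results if r == 0)
    if granted >= min_bricks:
        return 0

    enoent = sum(1 for r in results if r == errno.ENOENT)
    # Assumed policy: if so many bricks answered ENOENT that the
    # remaining bricks could never reach min_bricks, the entry is
    # effectively gone, so report ENOENT instead of EIO.
    if len(results) - enoent < min_bricks:
        return errno.ENOENT

    # Any other mix of answers without quorum is still an I/O error.
    return errno.EIO


# Case 1 from above: 3 bricks grant the lock, 3 bricks return ENOENT.
case1 = [0, 0, 0, errno.ENOENT, errno.ENOENT, errno.ENOENT]
print(errno.errorcode[combine_inodelk_results(case1)])
```

With this rule the chown in case 1 would see ENOENT, while a 3/3 split caused by some other error would still be reported as EIO.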

2 - rmdir /a/b and rmdir /a/b/c come from 2 different clients/processes.
Now, suppose "c" has already been deleted by some other process, so rmdir /a/b succeeds.
At this point, it is possible that /a/b has been deleted and the inode for "b" has been purged on 3 bricks. At that time, the inodelk on "b" arrives for rmdir /a/b/c.
It fails on 3 bricks and gets the lock on the remaining 3. In this case again, we get EIO.

To solve this, it was suggested to take a lock on the parent as well as on the entry that is to be deleted. So in the above case, when we do rmdir /a/b/c we will take locks
on both "b" and "c". For rmdir /a/b we will take locks on "a" and "b". This will certainly impact performance, but at the moment it looks like a feasible solution.
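
As a small illustration of why the extra lock helps, the sketch below (Python, purely a model and not EC code) lists the inodes each rmdir would lock under the proposal and shows where two rmdirs overlap. The helper names are hypothetical, and taking the parent lock before the entry lock is an assumption; the ordering has not been decided.

```python
import posixpath


def rmdir_lock_targets(path):
    """Inodes locked by rmdir under the proposal: parent, then entry."""
    path = posixpath.normpath(path)
    parent = posixpath.dirname(path)
    # Assumed order: parent first, then the entry being removed.
    return [parent, path]


def contended_inodes(path1, path2):
    """Inodes that both rmdir calls would try to lock."""
    common = set(rmdir_lock_targets(path1)) & set(rmdir_lock_targets(path2))
    return sorted(common)


print(rmdir_lock_targets("/a/b"))            # locks for rmdir /a/b
print(rmdir_lock_targets("/a/b/c"))          # locks for rmdir /a/b/c
print(contended_inodes("/a/b", "/a/b/c"))    # shared lock target
```

Both operations now need the lock on /a/b, so rmdir /a/b cannot purge "b" on some bricks while rmdir /a/b/c is still in the middle of acquiring its inodelk on "b".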

----
Ashish

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel