----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> To: "Vijay Bellur" <vbellur@xxxxxxxxxx>
> Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> Sent: Thursday, February 4, 2016 11:28:29 AM
> Subject: Re: Non-blocking lock for renames
>
> ----- Original Message -----
> > From: "Vijay Bellur" <vbellur@xxxxxxxxxx>
> > To: "Shyamsundar Ranganathan" <srangana@xxxxxxxxxx>, "Raghavendra Gowdappa"
> > <rgowdapp@xxxxxxxxxx>
> > Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > Sent: Thursday, February 4, 2016 9:55:04 AM
> > Subject: Non-blocking lock for renames
> >
> > DHT developers,
> >
> > With 3.6 we introduced a non-blocking lock that is taken before a rename
> > operation in dht, and we fail the rename if the lock cannot be acquired.
> > I ran into a user on IRC yesterday who is affected by this behavior change:
> >
> > "We're seeing a behavior in Gluster 3.7.x that we did not see in 3.4.x
> > and we're not sure how to fix it. When multiple processes are attempting
> > to rename a file to the same destination at once, we're now seeing
> > "Device or resource busy" and "Stale file handle" errors. Here's the
> > command to replicate it: cd /mnt/glustermount; while true; do
> > FILE=$RANDOM; touch $FILE; mv $FILE file-fv; done. The above command
> > would be run on two or three servers within the same gluster cluster. In
> > the output, one would always be successful in the rename, while the 2
> > other ones would fail with the above error."
> >
> > The use case for concurrent renames was described as:
> >
> > "we generate files and push them to the gluster cluster. Some are
> > generated multiple times and end up being pushed to the cluster at the
> > same time by different data generators; resulting in the 'rename
> > collision'. We use also the cluster.extra-hash-regex to make sure the
> > data is written in place. And this does the rename."
> >
> > Is a non-blocking lock essential? Can we not use a blocking lock instead
> > of a non-blocking lock, or fall back to a blocking lock if the original
> > non-blocking lock acquisition fails?
>
> This lock synchronizes:
> 1. a rename from the application with file migration by the rebalance
>    process [1].
> 2. multiple renames from the application on the same file.
>
> I think the lock is still required for 1. However, since migration can
> potentially take a long time, we chose a non-blocking lock to make sure the
> application is not blocked for a long period.
>
> Case 2 is what is causing the issue mentioned in this thread. We did see
> some files being removed with parallel renames on the same file. But by the
> time we had identified that it's a bug in 'mv' (mv issues an unlink on src if
> src and dst happen to be hardlinks [2], but the hardlink check and the
> unlink are not atomic, and dht breaks a rename into a series of links and
> unlinks), we had already introduced synchronization between renames. So, we
> have two options:
>
> 1. Use different domains for use cases 1 and 2 above. With different domains,
>    use case 2 can be changed to use blocking locks. It might not be
>    advisable to use blocking locks for use case 1.
> 2. Since we identified that the issue is with mv (I couldn't find another
>    bug we filed on mv, but [2] is close to it), we probably don't need
>    locking in 2 at all.
>
> Suggestions?
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=969298#c8
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=438076

Found the bug we had filed on mv:
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1141368

>
> regards,
> Raghavendra
>
> > Thanks,
> > Vijay
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-devel
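
To make option 1 above concrete, here is a rough sketch of what "different domains" could look like. It is plain Python, not GlusterFS code: the two lock objects, the `Busy` exception, the `rename` helper and the timeout value are all made up for illustration. The idea is only that the migration domain stays non-blocking while the rename domain is allowed to wait.

```python
import threading

# Hypothetical lock domains, one per use case from the thread.
migration_lock = threading.Lock()  # use case 1: rename vs. rebalance migration
rename_lock = threading.Lock()     # use case 2: rename vs. rename


class Busy(Exception):
    """Stands in for the 'Device or resource busy' error seen by the user."""


def rename(do_rename, rename_timeout=5.0):
    """Run do_rename() while holding both (hypothetical) lock domains."""
    # Use case 1: never wait for a migration, which may run for a long time.
    if not migration_lock.acquire(blocking=False):
        raise Busy("file is being migrated")
    try:
        # Use case 2: wait for a competing rename, bounded by a timeout
        # (the timeout is an assumption, not something the thread proposes).
        if not rename_lock.acquire(timeout=rename_timeout):
            raise Busy("timed out waiting for a concurrent rename")
        try:
            return do_rename()
        finally:
            rename_lock.release()
    finally:
        migration_lock.release()
```

With this split, two concurrent renames of the same file would be serialized instead of one of them failing, while a rename that collides with a migration would still fail fast. Whether the real lock translator can express this, and whether holding the migration domain while waiting on the rename domain is acceptable, is left open here.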