To put the problem simply: the posix-locks xlator grants a lock even after the directory has been deleted on the backend.

----- Original Message -----
> From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> Cc: "Sakshi Bansal" <sabansal@xxxxxxxxxx>
> Sent: Thursday, August 20, 2015 10:31:55 AM
> Subject: Re: Locking behavior vs rmdir/unlink of a directory/file
>
> ----- Original Message -----
> > From: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>
> > To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> > Cc: "Sakshi Bansal" <sabansal@xxxxxxxxxx>
> > Sent: Thursday, August 20, 2015 10:24:46 AM
> > Subject: Locking behavior vs rmdir/unlink of a directory/file
> >
> > Hi all,
> >
> > Most of our code treats the inode table (and the dentry structure
> > associated with it) as an accurate representation of the underlying
> > backend file-system. While this holds in most cases, the
> > representation can be out of sync for short time-windows (e.g., a
> > file has been deleted on disk, but its dentry and inode have not yet
> > been removed from our inode table). While working on locking
> > directories in dht for better consistency, we ran into one such
> > issue. The goal is to make rmdir and directory creation during
> > dht-selfheal mutually exclusive, by taking a blocking inodelk on the
> > inode before proceeding with either rmdir or directory self-heal.
> > However, consider the following scenario:
> >
> > 1. (dht_)rmdir acquires a lock.
> > 2. lookup-selfheal tries to acquire a lock, but is blocked on the
> >    lock held by rmdir.
> > 3. rmdir deletes the directory and releases its lock. An inode can
> >    remain in the inode table, searchable through its gfid, for as
> >    long as it has a positive reference count. Here the pending lock
> >    request (from lookup) and the granted lock (held by rmdir) each
> >    hold a reference, so the inode stays in the inode table even
> >    after rmdir.
> > 4.
> >    The lock request issued by lookup is granted.
> >
> > Note that at step 4 the rmdir might still be in progress from dht's
> > perspective (it has completed on only one node). This is precisely
> > the situation we wanted to avoid: we wanted dht-selfheal to block and
> > then fail, instead of being allowed to proceed.
> >
> > In this scenario, at step 4 the directory has been removed on the
> > backend file-system, but its representation is still present in the
> > inode table. We tried to solve this by doing a lookup on the gfid
> > before granting a lock [1]. However, with [1]:
> >
> > 1. we no longer treat the inode table as the source of truth, unlike
> >    the rest of the non-lookup code.
> > 2. there is a performance hit: a lookup on the backend file-system
> >    for _every_ granted lock. This may not be that big, considering
> >    that no network call is involved.
> >
> > There are other ways dht could have avoided the above scenario
> > altogether, with different trade-offs we didn't want to make. A few
> > alternatives would have been:
> >
> > 1. Use entrylk during lookup-selfheal and rmdir. This fits naturally,
> >    as both are entry operations. However, dht-selfheal also sets
> >    layouts, which must be synchronized with other operations where we
> >    don't have name information. tl;dr: we wanted to avoid entrylk for
> >    reasons that are out of scope for this problem.
> > 2. Use a non-blocking inodelk in dht during lookup-selfheal. This
> >    solves the problem in most practical cases, but in theory the race
> >    can still occur.
> >
> > To summarize, the problem of granted locks vs. unlink/rmdir still
> > remains, and I am not sure what exactly the behavior of posix-locks
> > should be in that scenario. Inputs by way of review on [1] are
> > greatly appreciated.
> >
> > [1] http://review.gluster.org/#/c/11916/
> >
> > regards,
> > Raghavendra.
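The refcount interplay in steps 1-4 above can be made concrete with a toy model (plain Python, not GlusterFS code; every class and name below is invented purely for illustration):

```python
# Toy model of the race: a lock granted to rmdir and a blocked lock
# request from lookup each hold a reference on the inode, so the inode
# outlives the directory it represents on the backend.

class Inode:
    def __init__(self, gfid):
        self.gfid = gfid
        self.ref = 0            # a positive refcount keeps the inode in the table
        self.on_backend = True  # does the directory still exist on disk?

class InodeTable:
    def __init__(self):
        self._by_gfid = {}

    def ref(self, inode):
        inode.ref += 1
        self._by_gfid[inode.gfid] = inode

    def unref(self, inode):
        inode.ref -= 1
        if inode.ref == 0:      # inode is destroyed only at refcount zero
            del self._by_gfid[inode.gfid]

    def find(self, gfid):
        return self._by_gfid.get(gfid)

table = InodeTable()
d = Inode("gfid-1")

# 1. rmdir acquires a lock: the granted lock holds a ref on the inode.
table.ref(d)
# 2. lookup-selfheal's lock request blocks, but it also holds a ref.
table.ref(d)
# 3. rmdir deletes the directory on the backend and releases its lock.
d.on_backend = False
table.unref(d)
# 4. The inode is still searchable by gfid (lookup's ref keeps it alive),
#    so the blocked lock request is granted on a deleted directory.
stale = table.find("gfid-1")
```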
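The approach taken in [1] can be sketched along the same lines (again hedged: this is not the actual patch; `grant_lock` and `check_backend` are invented names). The idea is that before a blocked lock request is granted, the object's continued existence on the backend is re-verified, and the request fails if it is gone:

```python
import errno
import os

def grant_lock(path, check_backend=os.path.isdir):
    """Grant a queued lock only if `path` still exists on the backend.

    In posix-locks the check would be a gfid-based lookup on the
    backend file-system; os.path.isdir merely stands in for it here.
    """
    if not check_backend(path):
        return -errno.ESTALE  # the blocked requester sees a failure, not a stale grant
    return 0                  # safe to grant: the directory is still there

# The pending lock from step 2 now fails once rmdir (step 3) has run,
# while a lock on a directory that still exists is granted normally.
after_rmdir = grant_lock("/no/such/dir")
still_there = grant_lock(".")
```

This is also where the trade-off listed above shows up: the check runs on every grant, so each granted lock pays for one backend lookup.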
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel@xxxxxxxxxxx
> > http://www.gluster.org/mailman/listinfo/gluster-devel