Hi,

We have pushed a patch for fop serialization on the server side [1]. If
you have some time, please take a look at the patch. Your reviews are
most welcome :)

If I can address all the comments by the end of the week, we are planning
to get this in before the coming Friday.

Note: In the meantime, I will be collecting performance numbers to see
how much of a performance drop it causes.

[1] http://review.gluster.org/13451

Regards
Rafi KC

On 08/19/2015 02:55 PM, Pranith Kumar Karampuri wrote:
> + Ravi, Anuradha
>
> On 08/17/2015 10:39 AM, Raghavendra Gowdappa wrote:
>> All,
>>
>> Pranith and I were discussing the implementation of compound
>> operations like "create + lock", "mkdir + lock", "open + lock" etc.
>> These operations are useful in situations like:
>>
>> 1. To avoid locking all subvols during directory creation as
>>    part of self-heal in dht. Currently we follow the approach of
>>    locking _all_ subvols in both rmdir and lookup-heal [1].
>> 2. To lock a file in advance so that there is less of a performance
>>    hit during transactions in afr.
>>
>> While thinking about implementing such compound operations, it
>> occurred to me that one of the problems would be how to handle a
>> racing mkdir/create and a (named lookup - simply referred to as lookup
>> from now on - followed by lock). This is because
>> 1. creation of the directory/file on the backend, and
>> 2. linking of the inode with the gfid corresponding to that
>>    file/directory
>>
>> are not atomic. It is not guaranteed that the inode passed down during
>> the mkdir/create call is the one that survives in the inode table.
>> Since the posix-locks xlator maintains all the lock state in the inode,
>> it would be a problem if a different inode is linked in the inode table
>> than the one passed during mkdir/create. One way to solve this problem
>> is to serialize fops (like mkdir/create, lookup, rename, rmdir, unlink)
>> that are happening on a particular dentry.
>> This serialization would also solve other bugs like:
>>
>> 1. Issues addressed by [2] and [3], and possibly many similar issues.
>> 2. Stale dentries left behind in the bricks' inode table because of a
>>    racing lookup and dentry modification ops (like rmdir, unlink,
>>    rename etc).
>>
>> My initial idea is to track the fops in progress on a dentry in the
>> parent inode (maybe in the resolver code in protocol/server). Based on
>> this we can serialize the operations. Since we need to serialize _only_
>> operations on a dentry (we don't serialize nameless lookups), it is
>> guaranteed that we always have a parent inode. Any
>> comments/discussion on this would be appreciated.
>>
>> [1] http://review.gluster.org/11725
>> [2] http://review.gluster.org/9913
>> [3] http://review.gluster.org/5240
>>
>> regards,
>> Raghavendra.
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-devel
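
For readers following the thread, the idea in the quoted proposal (tracking in-progress fops per dentry in the parent inode and serializing on that) can be illustrated with a minimal sketch. This is not GlusterFS code and not the patch under review: `struct parent_ctx`, `dentry_fop_begin`, `dentry_fop_try_begin`, `dentry_fop_end`, and the fixed-size table are all hypothetical names and simplifications chosen for illustration. It assumes POSIX threads and blocks the calling thread, whereas a real server would more likely queue the request and resume it later.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_BUSY 16   /* hypothetical cap on busy names per parent */
#define NAME_LEN 256  /* assumes basenames of at most 255 bytes */

/* Hypothetical per-parent-inode state: the basenames that currently
 * have a dentry fop (mkdir, create, lookup, rename, ...) in progress. */
struct parent_ctx {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    char busy[MAX_BUSY][NAME_LEN];
    int  nbusy;
};

void parent_ctx_init(struct parent_ctx *p)
{
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->changed, NULL);
    p->nbusy = 0;
}

/* Return the slot holding name, or -1. Caller must hold p->lock. */
static int find_busy(const struct parent_ctx *p, const char *name)
{
    for (int i = 0; i < p->nbusy; i++)
        if (strncmp(p->busy[i], name, NAME_LEN - 1) == 0)
            return i;
    return -1;
}

/* Record name as busy. Caller must hold p->lock and have checked space. */
static void claim(struct parent_ctx *p, const char *name)
{
    strncpy(p->busy[p->nbusy], name, NAME_LEN - 1);
    p->busy[p->nbusy][NAME_LEN - 1] = '\0';
    p->nbusy++;
}

/* Non-blocking variant: 1 if the dentry was claimed, 0 if another fop
 * is already in progress on it (or the table is full). */
int dentry_fop_try_begin(struct parent_ctx *p, const char *name)
{
    int ok = 0;
    pthread_mutex_lock(&p->lock);
    if (find_busy(p, name) < 0 && p->nbusy < MAX_BUSY) {
        claim(p, name);
        ok = 1;
    }
    pthread_mutex_unlock(&p->lock);
    return ok;
}

/* Blocking variant: wait until no other fop is in progress on
 * (parent, name), then claim it. */
void dentry_fop_begin(struct parent_ctx *p, const char *name)
{
    pthread_mutex_lock(&p->lock);
    while (find_busy(p, name) >= 0 || p->nbusy == MAX_BUSY)
        pthread_cond_wait(&p->changed, &p->lock);
    claim(p, name);
    pthread_mutex_unlock(&p->lock);
}

/* Release the dentry and wake any fops waiting on this parent. */
void dentry_fop_end(struct parent_ctx *p, const char *name)
{
    pthread_mutex_lock(&p->lock);
    int i = find_busy(p, name);
    assert(i >= 0);                 /* must pair with a begin */
    p->nbusy--;
    if (i != p->nbusy)              /* fill the hole with the last entry */
        memcpy(p->busy[i], p->busy[p->nbusy], NAME_LEN);
    pthread_cond_broadcast(&p->changed);
    pthread_mutex_unlock(&p->lock);
}
```

In this sketch, a mkdir and a named lookup on the same (parent, basename) pair would each be wrapped in `dentry_fop_begin` / `dentry_fop_end`, so the second one only proceeds after the first has finished, including its inode linking. Fops on different names under the same parent do not wait on each other, and nameless lookups would bypass the table entirely since they have no dentry.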