Re: Locking behavior vs rmdir/unlink of a directory/file

----- Original Message -----
> From: "Vijay Bellur" <vbellur@xxxxxxxxxx>
> To: "Raghavendra Gowdappa" <rgowdapp@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> Cc: "Sakshi Bansal" <sabansal@xxxxxxxxxx>
> Sent: Monday, August 24, 2015 3:52:09 PM
> Subject: Re:  Locking behavior vs rmdir/unlink of a directory/file
> 
> On Thursday 20 August 2015 10:24 AM, Raghavendra Gowdappa wrote:
> > Hi all,
> >
> > Most of the code currently treats the inode table (and the dentry
> > structure associated with it) as the correct representation of the
> > underlying backend file-system. While this is correct in most cases,
> > the representation can be out of sync for small time-windows (e.g.,
> > a file is deleted on disk, but its dentry and inode have not yet
> > been removed from our inode table). While working on locking
> > directories in dht for better consistency, we ran into one such
> > issue. The issue is basically to make rmdir and directory creation
> > during dht-selfheal mutually exclusive. The idea is to take a
> > blocking inodelk on the inode before proceeding with rmdir or
> > directory self-heal. However, consider the following scenario:
> >
> > 1. (dht_)rmdir acquires a lock.
> > 2. lookup-selfheal tries to acquire a lock, but is blocked on the
> > lock held by rmdir.
> > 3. rmdir deletes the directory and releases the lock. The inode can
> > remain in the inode table, searchable through its gfid, as long as
> > there is a positive reference count on it. Here, the pending lock
> > request (by lookup) and the granted lock (held by rmdir) keep the
> > inode in the inode table even after the rmdir.
> > 4. The lock request issued by lookup is granted.
> >
> > Note that at step 4 it is still possible that rmdir is in progress
> > from dht's perspective (it has only completed on one node). However,
> > this is precisely the situation we wanted to avoid, i.e., we wanted
> > to block and fail dht-selfheal instead of allowing it to proceed.
> >
> > In this scenario, at step 4 the directory has been removed on the
> > backend file-system, but its representation is still present in the
> > inode table. We tried to solve this by doing a lookup on the gfid
> > before granting a lock [1]. However, because of [1]:
> >
> > 1. we no longer treat the inode table as the source of truth, unlike
> > the rest of the (non-lookup) code
> > 2. there is a performance hit from a lookup on the backend
> > file-system for _every_ granted lock. This may not be that big,
> > considering that no network call is involved.
> >
> 
> Can we not mark the in-memory inode as having been unlinked in
> posix_rmdir() and use this information to determine whether a lock
> request can be processed?

Yes, Nithya suggested the same, but this seems like a hacky fix. The reason is:

1. Currently we don't differentiate inode management based on inode type; the code (dentry management, inode management) is agnostic to type. With this fix we would be introducing such an explicit differentiation. Note that only for directories can we keep a flag indicating that the inode has been removed on the backend. Since files can have hardlinks, doing the same for them would be difficult (if not impossible, as unlink_cbk doesn't carry an iatt with which to figure out whether the current unlink is on the last link), as the sketch below illustrates. This makes the solution applicable only to directories. For lock requests on files we would still need a lookup on the backend (for our use-case this is fine, since we are not locking on files). Not a show-stopper, but something in terms of aesthetics.
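
For illustration, a rough sketch of the flag idea (the structures and
function names below are made up for this mail, not existing gluster
code):

#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical in-memory inode; the 'unlinked' flag is the proposed
 * addition. */
struct my_inode {
        ino_t ino;
        int   refcount;  /* held by dentries, locks, in-flight fops */
        bool  unlinked;  /* true once removed on the backend */
};

/* Directory case: rmdir removes the one and only link, so we can
 * safely mark the in-memory inode as gone from the backend. */
static void my_rmdir_done (struct my_inode *inode)
{
        inode->unlinked = true;
}

/* File case: unlink_cbk carries no iatt, hence no st_nlink, so we
 * cannot tell whether this unlink removed the last hardlink. Setting
 * 'unlinked' here would be wrong for a file with remaining links. */
static void my_unlink_done (struct my_inode *inode)
{
        (void)inode;  /* cannot set inode->unlinked safely */
}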

The behaviour of locks during/after the removal of a file/directory seems not well defined as of now, IMHO. There is a circular dependency (sketched below):
     a. We can acquire a lock because the inode exists in the inode table.
     b. The inode exists in the inode table because there are locks on it holding references.
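
To make the circularity explicit, a toy model (again, made-up names):

/* A granted lock holds a reference (b.), and a positive refcount is
 * exactly what keeps the inode findable in the table (a.), so the
 * next lock request still succeeds after the backend rmdir. */
struct toy_inode {
        int refcount;
};

static void toy_lock_granted (struct toy_inode *inode)
{
        inode->refcount++;  /* b.: the granted lock pins the inode */
}

static int toy_inode_findable (struct toy_inode *inode)
{
        return (inode->refcount > 0);  /* a.: findable => lockable */
}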
   
   Of course, this situation can be fixed by maintaining a flag indicating file/directory removal (or by doing a lookup). But some clarification is needed: if we are going to keep information indicating that a file/directory has been removed, what should the behaviour of future lock/unlock calls be? Should we fail them? Lock calls we can fail. But for unlock there are two choices (sketched below):
     a. Let the consumer send an unlock even after the remove, or
     b. clear out the locks during unlink/rmdir.
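
For concreteness, a sketch with approach a., i.e. new lock requests
fail after removal but unlock is still honoured (hypothetical names;
failing with ESTALE is my assumption, ENOENT would do as well):

#include <errno.h>
#include <stdbool.h>

struct lk_inode {
        bool unlinked;    /* set at rmdir/unlink, as discussed above */
        int  lock_count;  /* granted locks on this inode */
};

/* New lock requests on a removed inode are refused outright. */
static int lk_inodelk_lock (struct lk_inode *inode)
{
        if (inode->unlinked)
                return -ESTALE;
        inode->lock_count++;
        return 0;
}

/* Approach a.: unlock is still honoured after removal, so the holder
 * of a granted lock can always release it and drop its reference.
 * (Approach b. would instead flush all locks inside rmdir/unlink.) */
static int lk_inodelk_unlock (struct lk_inode *inode)
{
        if (inode->lock_count <= 0)
                return -EINVAL;
        inode->lock_count--;
        return 0;
}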

    I prefer approach a. here. Comments?
> 
> stat() calls can be quite expensive if the disk seek times happen to
> be high. It would be better if we could avoid an additional stat()
> for every granted lock.
> 
> Regards,
> Vijay
> 
> 
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel


