On Fri, 2013-07-05 at 12:25 -0400, Bruce Fields wrote: > > > On Fri, 2013-07-05 at 10:04 +0100, Al Viro wrote: > > > > generic_add_lease() with F_WRLCK checks for other openers > > > > in a very crude way - it wants no extra references to dentry (thus > > > > excluding other struct file pointing to it) *and* no extra references > > > > to in-core inode, excluding openers of other links. It fails with > > > > EAGAIN if those conditions are not met. > > > > > > > > The way it deals with another open(2) racing with it (i.e. > > > > managing to squeeze between the check and locks_insert_lock()) is > > > > theoretically racy; do_dentry_open() would spin on ->i_lock, all > > > > right, but... only if there already is something in inode->i_flock. > > > > If this is the first lease/lock being set, break_lease() will do > > > > nothing, rather than call __break_lease() and spin there. > > > > > > > > It's _very_ hard to hit; we are holding ->i_lock and thus can't > > > > be preempted, so open(2) would have to get *everything* (pathname > > > > lookup, etc.) done in a very narrow window. So I don't believe it's > > > > exploitable, but it really smells bad. The check is extremely crude > > > > and if nothing else it's a DoS fodder - a luser that keeps hitting that > > > > file with stat(2) can prevent F_SETLEASE from succeeding, even though > > > > he wouldn't be able to open the damn thing at all... > > nfsd isn't using write leases yet (I want to get read delegations sorted > out first), and I don't understand Samba's requirements for write > leases. > > In the future, when nfsd does write delegations: they're an optional > optimization, and if we're concerned about such a DOS one solution might > be just to change everything to a trylock when acquiring delegations: > it's always acceptable to just fail the delegation. > I think that's the case for Samba too. Oplocks are very similar to delegations, though there are of course some semantic differences. > On Fri, Jul 05, 2013 at 08:08:44AM -0400, Jeff Layton wrote: > > On Fri, 2013-07-05 at 06:51 -0400, Jeff Layton wrote: > > > We already hold an extra reference to the dentry via the path_get in > > > do_dentry_open. So is this race possible if the two tasks are working on > > > the same dentry? > > Well, as Al says, the race is roughly: > > take i_lock > generic_add_lease checks d_count, i_count > > Conflicting opener does "*everything* (pathname lookup, > etc.)" (including that path_get, and breake_lease() > (which sees no lock, so doesn't try to get i_lock.)) > > ... > locks_insert_lock() adds new lock. > Right, I realized that while I was driving earlier. This race is definitely possible in either case... > > > Or does it require a hardlinked inode? > > > > > > If it's not possible to race on the same dentry, then one possible fix > > > would be to go ahead and do an extra igrab(inode) > > I think the only reason for this race is the attempt to optimize out an > i_lock acquisition in break_lease. If we're willing to call igrab > (which also takes the i_lock), then we may as well just take the i_lock. > I had misremembered the function name -- I had meant ihold/iput (which would commonly be lockless). The point is moot though since we have a similar problem in the "same dentry" case anyway. Also the path_get on the other dentry would ensure that you had a i_count > 1... So...disregard my silly rumbling about it. :) > > > in do_dentry_open > > > before calling break_lease. I'm not particularly fond of that since it > > > means taking the i_lock an extra time, but it looks like it would close > > > the race. > > > > Hrm. I think we'd also need to couple that with an extra check for a > > high refcount after doing locks_insert_lock in generic_add_lease, and > > then call locks_delete_lock and return -EAGAIN if the counts have > > changed. > > ... but checking the counts again afterwards might work. (Dumb > question: in the absence of a lock on the opener's side, are the memory > accesses ordered such that a lease-setter is guaranteed to see the new > counts from an opener that didn't see the new i_flock value?) > I'm not sure. We'd certainly need to audit that and ensure that the lease setter does see them before we could rely on such a scheme. > How important is the optimization that skips the i_lock in break_lease? Again, not sure...but I do shy away a bit from anything that would slow down opens in the case where leases aren't involved. I'd prefer any penalty be paid by the setlease caller. Opens are much more common than leases in almost every workload I can think of. I think the bigger issue though is that looking at refcounts in order to determine when we have a conflicting open is just plain wrong. There are all sorts of reasons one might see a raised refcount that don't involve conflicting opens (Al's stat() example for instance). It seems like we ought to shoot for a solution that doesn't rely (solely) on inode and dentry refcounts. -- Jeff Layton <jlayton@xxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html