Re: [PATCH] locks: try to catch potential deadlock between file-private and classic locks from same process

Jeff Layton <jlayton@xxxxxxxxxx> · Tue, 4 Mar 2014 16:21:52 -0500

On Tue, 4 Mar 2014 15:52:47 -0500
Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> wrote:

> 
> On Mar 4, 2014, at 15:37, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> 
> > On Tue, 4 Mar 2014 12:19:44 -0800
> > Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> > 
> >> On Tue, Mar 4, 2014 at 12:14 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >>> On Tue, 4 Mar 2014 14:35:51 -0500
> >>> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> >>> 
> >>>> On Tue, Mar 04, 2014 at 02:10:49PM -0500, Jeff Layton wrote:
> >>>>> My expectation is that programs shouldn't mix classic and file-private
> >>>>> locks, but Glenn Skinner pointed out to me that that may occur at times
> >>>>> even if the programmer isn't aware.
> >>>>> 
> >>>>> Suppose we have a program that uses file-private locks. That program
> >>>>> then links in a library that uses classic POSIX locks. If those locks
> >>>>> end up conflicting and one is using blocking locks, then the program
> >>>>> could end up deadlocked.
> >>>>> 
> >>>>> Try to catch this situation in posix_locks_deadlock by looking for the
> >>>>> case where the blocking lock was set by the same process but has a
> >>>>> different type, and have the kernel return EDEADLK if that occurs.
> >>>>> 
> >>>>> This check is not perfect. You could (in principle) have a threaded
> >>>>> process that is using classic locks in one thread and file-private locks
> >>>>> in another. That's not necessarily a deadlockable situation but this
> >>>>> check would cause an EDEADLK return in that case.
> >>>>> 
> >>>>> By the same token, you could also have a file-private lock that was
> >>>>> inherited across a fork(). If the inheriting process ends up blocking on
> >>>>> that while trying to set a classic POSIX lock then this check would miss
> >>>>> it and the program would deadlock.
> >>>> 
> >>>> If the caller's not prepared for the library to use classic posix locks,
> >>>> then it's not going to know how to recover from this EDEADLK either, is
> >>>> it?
> >>>> 
> >>> 
> >>> Well, callers should be aware of that if we take this change. The
> >>> semantics aren't yet set in stone...
> >>> 
> >>>> I guess I don't understand how this helps anyone.
> >>>> 
> >>>> Has it ever made sense for a library function and its caller to both use
> >>>> classic posix locking on the same file without any coordination?
> >>>> 
> >>> 
> >>> Not really, but that doesn't mean that it isn't done... ;)
> >>> 
> >>>> Besides the first-close problem there's the problem that locks merge, so
> >>>> for example you can't hold your own lock across a call to a function
> >>>> that grabs and drops a lock on the same file.
> >>>> 
> >>> 
> >>> It depends, but you're basically correct...
> >>> 
> >>> It's likely that if the above situation occurred with a program using
> >>> classic locks, then those locks were silently lost at times. It's also
> >>> plausible that when it occurs no one is aware of it, due to the way
> >>> POSIX locks work.
> >>> 
> >>> If the program switched to using file-private locks and the library
> >>> stays using classic locks (or vice versa), you then potentially trade
> >>> that silent loss of locks for a deadlock (since classic and
> >>> file-private locks always conflict).
> >>> 
> >>> So, the idea would be to try to catch that situation explicitly and
> >>> return a hard error instead of deadlocking. Unfortunately, it's a
> >>> little tough to do that in all cases so all this does is try to catch a
> >>> subset of them.
> >>> 
> >>> Will it be helpful in the long run? I'm not sure. It seems unlikely to
> >>> harm legit use cases though, and might catch some problematic
> >>> situations. I can drop this if that's the consensus however.
> >> 
> >> I don't think I like it except in the case where there are no threads
> >> (number of tasks sharing the fd table is 1) and where the struct file
> >> only has one fd.  Otherwise I think it can have false positives.  Or
> >> am I missing something?
> >> 
> > 
> > The only case where I think this would hit a false positive is if you
> > have a threaded program that's doing something weird like having one
> > thread that's setting classic POSIX locks on a file, and one thread
> > that isn't. Once you hit a conflict between the two, you'd get back
> > EDEADLK on one of them, even though that situation might not actually
> > be a deadlock.
> > 
> > That doesn't really seem like a real-world use-case though, so I'm
> > generally OK with that potential false-positive.
> > 
> 
> How do these locks interact with locks_mandatory_area(), and mandatory
> locking in general? Unless I missed something, it looks to me as if
> there is a nasty potential for a self-DoS if you set a file-private
> lock on a file with the mandatory lock bits set and the filesystem is
> mounted '-omand'.
> 

Good catch. I hadn't considered that case properly...

Looks like I'll have to fix up locks_mandatory_area() to handle the
file-private case. The fact that we'll now have to check for two
different lock types makes that a bit more convoluted, but I'll see
what can be done.
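
For anyone following along, here is a rough userspace sketch of the
same-process conflict discussed upthread. It is only an illustration, not
part of the patch. It assumes a Linux kernel and libc that expose the
file-private lock commands as F_OFD_SETLK; the helper name
mixed_lock_errno() and the fallback value for F_OFD_SETLK are assumptions
made for this example. It uses the non-blocking command so that the
conflict shows up as an error instead of a hang.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Fallback for libc headers that predate this command. 37 is the value
 * in Linux's asm-generic fcntl.h (assumption: Linux target).
 */
#ifndef F_OFD_SETLK
#define F_OFD_SETLK 37
#endif

/*
 * Hypothetical helper: take a classic POSIX write lock on the whole
 * file, then try a non-blocking file-private write lock on the same
 * range from the same process and the same fd.
 *
 * Returns the errno of the second attempt, 0 if it unexpectedly
 * succeeded, or -1 if the setup (open or classic lock) failed.
 */
int mixed_lock_errno(const char *path)
{
	struct flock fl;
	int fd, err = 0;

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "to end of file" */

	/* Classic POSIX lock, owned by the process. */
	if (fcntl(fd, F_SETLK, &fl) < 0) {
		close(fd);
		return -1;
	}

	/* File-private lock, owned by the open file; l_pid must be 0. */
	fl.l_pid = 0;
	if (fcntl(fd, F_OFD_SETLK, &fl) < 0)
		err = errno;

	close(fd);		/* releases whatever locks were taken */
	return err;
}
```

The second fcntl() should fail with EAGAIN (or EACCES) because the two
lock types conflict even within a single process. With the blocking
variant of the command the same sequence would simply hang, which is the
case this patch tries to turn into EDEADLK.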

Thanks,
-- 
Jeff Layton <jlayton@xxxxxxxxxx>