Re: [PATCH] locks: try to catch potential deadlock between file-private and classic locks from same process

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Tue, 4 Mar 2014 12:19:44 -0800

On Tue, Mar 4, 2014 at 12:14 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> On Tue, 4 Mar 2014 14:35:51 -0500
> "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
>
>> On Tue, Mar 04, 2014 at 02:10:49PM -0500, Jeff Layton wrote:
>> > My expectation is that programs shouldn't mix classic and file-private
>> > locks, but Glenn Skinner pointed out to me that that may occur at times
>> > even if the programmer isn't aware.
>> >
>> > Suppose we have a program that uses file-private locks. That program
>> > then links in a library that uses classic POSIX locks. If those locks
>> > end up conflicting and one is using blocking locks, then the program
>> > could end up deadlocked.
>> >
>> > Try to catch this situation in posix_locks_deadlock by looking for the
>> > case where the blocking lock was set by the same process but has a
>> > different type, and have the kernel return EDEADLK if that occurs.
>> >
>> > This check is not perfect. You could (in principle) have a threaded
>> > process that is using classic locks in one thread and file-private locks
>> > in another. That's not necessarily a deadlockable situation but this
>> > check would cause an EDEADLK return in that case.
>> >
>> > By the same token, you could also have a file-private lock that was
>> > inherited across a fork(). If the inheriting process ends up blocking on
>> > that while trying to set a classic POSIX lock then this check would miss
>> > it and the program would deadlock.
>>
>> If the caller's not prepared for the library to use classic posix locks,
>> then it's not going to know how to recover from this EDEADLK either, is
>> it?
>>
>
> Well, callers should be aware of that if we take this change. The
> semantics aren't yet set in stone...
>
>> I guess I don't understand how this helps anyone.
>>
>> Has it ever made sense for a library function and its caller to both use
>> classic posix locking on the same file without any coordination?
>>
>
> Not really, but that doesn't mean that it isn't done... ;)
>
>> Besides the first-close problem there's the problem that locks merge, so
>> for example you can't hold your own lock across a call to a function
>> that grabs and drops a lock on the same file.
>>
>
> It depends, but you're basically correct...
>
> It's likely that if the above situation occurred with a program using
> classic locks, then those locks were silently lost at times. It's also
> plausible that, when it occurs, no one is aware of it due to the way
> POSIX locks work.
>
> If the program switched to using file-private locks and the library
> stays using classic locks (or vice versa), you then potentially trade
> that silent loss of locks for a deadlock (since classic and
> file-private locks always conflict).
>
> So, the idea would be to try to catch that situation explicitly and
> return a hard error instead of deadlocking. Unfortunately, it's a
> little tough to do that in all cases so all this does is try to catch a
> subset of them.
>
> Will it be helpful in the long run? I'm not sure. It seems unlikely to
> harm legit use cases though, and might catch some problematic
> situations. I can drop this if that's the consensus however.

I don't think I like it except in the case where there are no threads
(number of tasks sharing the fd table is 1) and where the struct file
only has one fd.  Otherwise I think it can have false positives.  Or
am I missing something?
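
To make that condition concrete, here is a minimal userspace sketch of
the restricted check. It is an illustration only: struct lock_ctx, its
fields, and may_report_edeadlk() are hypothetical stand-ins, not real
kernel structures or anything taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical summary of the caller's state. None of these fields
 * exist under these names in the kernel; they only model the
 * conditions discussed in this thread.
 */
struct lock_ctx {
    int  fdtable_users;  /* tasks sharing the fd table */
    int  fds_for_file;   /* fds in that table referring to the struct file */
    bool same_process;   /* blocking lock was set by the same process */
    bool types_differ;   /* one lock is classic, the other file-private */
};

/*
 * Return true only when the mixed-lock conflict comes from a process
 * with no other threads and a single fd for the file, which is the
 * case where a false positive seems unlikely.
 */
static bool may_report_edeadlk(const struct lock_ctx *ctx)
{
    assert(ctx != NULL);

    /* The patch's basic test: same process, different lock types. */
    if (!ctx->same_process || !ctx->types_differ)
        return false;

    /* Extra restriction: no threads and only one fd for the file. */
    return ctx->fdtable_users == 1 && ctx->fds_for_file == 1;
}
```

With this restriction, a threaded process or one holding a dup()'d
descriptor would simply block as before rather than get EDEADLK.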

--Andy