On Tue, 4 Mar 2014 15:40:40 -0500
"J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:

> On Tue, Mar 04, 2014 at 03:37:23PM -0500, Jeff Layton wrote:
> > On Tue, 4 Mar 2014 12:19:44 -0800
> > Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> >
> > > On Tue, Mar 4, 2014 at 12:14 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > > > On Tue, 4 Mar 2014 14:35:51 -0500
> > > > "J. Bruce Fields" <bfields@xxxxxxxxxxxx> wrote:
> > > >
> > > >> On Tue, Mar 04, 2014 at 02:10:49PM -0500, Jeff Layton wrote:
> > > >> > My expectation is that programs shouldn't mix classic and file-private
> > > >> > locks, but Glenn Skinner pointed out to me that that may occur at times
> > > >> > even if the programmer isn't aware.
> > > >> >
> > > >> > Suppose we have a program that uses file-private locks. That program
> > > >> > then links in a library that uses classic POSIX locks. If those locks
> > > >> > end up conflicting and one is using blocking locks, then the program
> > > >> > could end up deadlocked.
> > > >> >
> > > >> > Try to catch this situation in posix_locks_deadlock by looking for the
> > > >> > case where the blocking lock was set by the same process but has a
> > > >> > different type, and have the kernel return EDEADLK if that occurs.
> > > >> >
> > > >> > This check is not perfect. You could (in principle) have a threaded
> > > >> > process that is using classic locks in one thread and file-private locks
> > > >> > in another. That's not necessarily a deadlockable situation but this
> > > >> > check would cause an EDEADLK return in that case.
> > > >> >
> > > >> > By the same token, you could also have a file-private lock that was
> > > >> > inherited across a fork(). If the inheriting process ends up blocking on
> > > >> > that while trying to set a classic POSIX lock then this check would miss
> > > >> > it and the program would deadlock.
> > > >>
> > > >> If the caller's not prepared for the library to use classic posix locks,
> > > >> then it's not going to know how to recover from this EDEADLK either, is
> > > >> it?
> > > >>
> > > >
> > > > Well, callers should be aware of that if we take this change. The
> > > > semantics aren't yet set in stone...
> > > >
> > > >> I guess I don't understand how this helps anyone.
> > > >>
> > > >> Has it ever made sense for a library function and its caller to both use
> > > >> classic posix locking on the same file without any coordination?
> > > >>
> > > >
> > > > Not really, but that doesn't mean that it isn't done... ;)
> > > >
> > > >> Besides the first-close problem there's the problem that locks merge, so
> > > >> for example you can't hold your own lock across a call to a function
> > > >> that grabs and drops a lock on the same file.
> > > >>
> > > >
> > > > It depends, but you're basically correct...
> > > >
> > > > It's likely that if the above situation occurred with a program using
> > > > classic locks, then those locks were silently lost at times. It's also
> > > > plausible that when it occurs that no one is aware of it due to the way
> > > > POSIX locks work.
> > > >
> > > > If the program switched to using file-private locks and the library
> > > > stays using classic locks (or vice versa), you then potentially trade
> > > > that silent loss of locks for a deadlock (since classic and
> > > > file-private locks always conflict).
> > > >
> > > > So, the idea would be to try to catch that situation explicitly and
> > > > return a hard error instead of deadlocking. Unfortunately, it's a
> > > > little tough to do that in all cases so all this does is try to catch a
> > > > subset of them.
> > > >
> > > > Will it be helpful in the long run? I'm not sure. It seems unlikely to
> > > > harm legit use cases though, and might catch some problematic
> > > > situations. I can drop this if that's the consensus however.
> > >
> > > I don't think I like it except in the case where there are no threads
> > > (number of tasks sharing the fd table is 1) and where the struct file
> > > only has one fd. Otherwise I think it can have false positives. Or
> > > am I missing something?
> > >
> >
> > The only case where I think this would hit a false positive is if you
> > have a threaded program that's doing something weird like having one
> > thread that's setting classic POSIX locks on a file, and one thread
> > that isn't. Once you hit a conflict between the two, you'd get back
> > EDEADLK on one of them, even though that situation might not actually
> > be a deadlock.
> >
> > That doesn't really seem like a real-world use-case though, so I'm
> > generally OK with that potential false-positive.
>
> Yes, you may be correct that those are almost certainly abuses of the
> interface, but I think Andy's point is that EDEADLK doesn't mean "you're
> doing something wrong", it has a stricter definition, and you're
> catching cases that are "false positives" in the sense that they don't
> necessarily identify actual deadlocks.
>
> --b.

Fair enough -- you've convinced me. I'll plan to just drop this patch.

Thanks!
-- 
Jeff Layton <jlayton@xxxxxxxxxx>
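
The "silent loss" of classic POSIX locks discussed in the thread (a caller cannot hold its own lock across a call to a function that grabs and drops a lock on the same file) can be reproduced from user space. What follows is only a minimal sketch, assuming Linux and a local filesystem. The names library_call, other_process_can_lock and demo are hypothetical and exist purely for illustration; fcntl.lockf is used because it takes classic, process-owned POSIX locks.

```python
import fcntl
import os
import tempfile


def other_process_can_lock(path):
    """Fork a child and report whether it can take an exclusive lock on path."""
    pid = os.fork()
    if pid == 0:
        # Child: a separate process, so classic locks held by the parent conflict.
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            status = 0  # lock acquired: nobody else was holding it
        except OSError:
            status = 1  # lock refused: the parent still holds it
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0


def library_call(path):
    """Hypothetical library routine that takes and drops its own classic lock."""
    fd = os.open(path, os.O_RDWR)
    # Same process, so this does not conflict with the caller's lock.
    fcntl.lockf(fd, fcntl.LOCK_EX)
    # Unlocking the range drops the caller's lock on it as well.
    fcntl.lockf(fd, fcntl.LOCK_UN)
    # Closing any descriptor for the file would also drop the process's locks.
    os.close(fd)


def demo():
    """Return (held_before, held_after) around the library call."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd = os.open(tmp.name, os.O_RDWR)
        fcntl.lockf(fd, fcntl.LOCK_EX)  # the caller's own classic lock
        held_before = not other_process_can_lock(tmp.name)
        library_call(tmp.name)
        held_after = not other_process_can_lock(tmp.name)
        os.close(fd)
        return held_before, held_after


if __name__ == "__main__":
    before, after = demo()
    print("lock held before library call:", before)
    print("lock held after library call:", after)
```

Under those assumptions, the caller's lock should be reported as held before the library call and as gone afterwards, without any error being raised. If the caller used file-private locks instead, the thread's point is that the two locks would conflict rather than merge, so a blocking request in library_call would wait on a lock held by its own process.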