On Thu, Jan 09, 2014 at 04:58:59PM -0800, Andy Lutomirski wrote: > On Thu, Jan 9, 2014 at 4:49 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Thu, 09 Jan 2014 12:25:25 -0800 > > Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote: > >> When I think of deadlocks caused by r/w locks (which these are), I think > >> of two kinds. First is what the current code tries to detect: two > >> processes that are each waiting for each other. I don't know whether > >> POSIX enshrines the idea of detecting that, but I wouldn't be surprised, > >> considering how awful the old POSIX locks are. ... > >> The sensible kind of detectable deadlock involves just one lock, and it > >> happens when two processes both hold read locks and try to upgrade to > >> write locks. This should be efficiently detectable and makes upgrading > >> locks safe(r). This also involves two processes waiting on each other, and the current code should detect either case equally well. ... > For this kind of deadlock detection, nothing global is needed -- I'm > only talking about detecting deadlocks due to two tasks upgrading > locks on the same file (with overlapping ranges) at the same time. > > This is actually useful for SQL-like things. Imagine this scenario: > > Program 1: > > Open a file > BEGIN; > SELECT whatever; -- acquires a read lock > > Program 2: > > Open the same file > BEGIN; > SELECT whatever; -- acquires a read lock > > Program 1: > UPDATE something; -- upgrades to write > > Now program 1 is waiting for program 2 to release its lock. But if > program 2 tries to UPDATE, then it deadlocks. A friendly MySQL > implementation (which, sadly, does not include sqlite) will fail the > abort the transaction instead. And then I suppose you'd need to get an exclusive lock when you retry, to guarantee forward progress in the face of multiple processes retrying at once. I don't know, is this so useful? > It would be nice if the kernel > supported this. > > Note that unlocking and then re-locking for write is incorrect -- it > would allow program 2 to write inconsistent data. > > I think that implementing this could be as simple as having some way > to check if a struct file_lock is currently trying to upgrade from > read to write and, if you try to upgrade and end up waiting for such a > lock, aborting. You have to be clear what you mean by "such a lock". What you really want to know is whether you'd be waiting on a lock that might be waiting on a lock you hold. To a first approximation, the current works with a graph with tasks as nodes and an arrow from node X to node Y if X is waiting on a lock held by node Y. And it follows arrows in that graph looking for cycles. And sure I guess it would be a bit nicer if you only bothered checking for cycles that touch this one file. But I'd really rather avoid the complication of deadlock detection unless somebody can make a really strong case that they need it. > The nasty case, though, is if you try to write-lock a > range while holding a read-lock on only part of the range -- you could > end up acquiring part of the range and deadlocking on the rest. Now > you need to remember enough state to be able to abort. We wait until the entire lock can be applied, and then apply it all atomically (under i_lock). > (Actually, what happens if you receive a signal which waiting on a file lock?) Return -EINTR. > I would personally be okay with removing the existing deadlock > detector entirely. I wouldn't be surprised if no one relies on it. I'd be in favor. --b. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html