On Thu, Jul 07, 2016 at 06:30:03PM +0900, Byungchul Park wrote: > Crossrelease feature introduces new concept and data structure. Thus > the document is necessary. So added it. Any opinions about this suggestion? > > Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx> > --- > Documentation/locking/crossrelease.txt | 457 +++++++++++++++++++++++++++++++++ > 1 file changed, 457 insertions(+) > create mode 100644 Documentation/locking/crossrelease.txt > > diff --git a/Documentation/locking/crossrelease.txt b/Documentation/locking/crossrelease.txt > new file mode 100644 > index 0000000..51b583b > --- /dev/null > +++ b/Documentation/locking/crossrelease.txt > @@ -0,0 +1,457 @@ > +Crossrelease dependency check > +============================= > + > +Started by Byungchul Park <byungchul.park@xxxxxxx> > + > +Contents: > + > + (*) Introduction. > + > + - What the lockdep checks. > + > + (*) What is a problem? > + > + - Examples. > + - Lockdep's assumption. > + - Lockdep's limitation. > + > + (*) How to solve the problem. > + > + - What causes deadlock? > + - Relax the assumption. > + - Relax the limitation. > + > + (*) Implementation. > + > + - Introduce crosslock. > + - Introduce commit stage. > + - Acquire vs commit vs release. > + - Data structures. > + > + (*) Optimizations. > + > + > +============ > +Introduction > +============ > + > +What the lockdep checks > +----------------------- > + > +Lockdep checks dependency between locks and reports it if a deadlock > +possibility is detected in advance or if an actual deadlock occures at > +the time checking it. > + > +The former is more valuable because the possibility detection without > +lockdep is much harder. When an actual deadlock occures, we can identify > +what happens in the system by some means or other, even without lockdep. > + > +It checks dependency between locks (exactly lock classes, see > +Documentation/locking/lockdep-design.txt for more) and builds the > +relationship between them when the lock is acquired. Eventually it forms > +the global dependency graph which connects between related locks based > +on dependency they have, like e.g. > + > +A - B - - F - G > + \ / > + - E - - L > + / \ / > +C - D - - H - > + \ > + - I - K > + / > + J - > + > +where A, B, C,..., L are different lock classes. > + > +Lockdep basically works based on this global dependency graph. The more > +nodes it has, the stronger check can be performed. > + > +For example, lockdep can detect a deadlock when acquiring a C class lock > +while it already held a K class lock, because it forms a cycle which > +means deadlock, like > + > +A - B - - F - G > + \ / > + - E - - L > + / \ / > +C - D - - H - > +| \ > +| - I - K > +| / | > +| J - | > ++-------------------------+ > + > +where A, B, C,..., L are different lock classes. > + > +If any of nodes making up the cycle was not built into the graph, we > +cannot detect this kind of deadlock because there's no cycle. For > +example, it might be > + > +A - B - - F - G > + \ / > + - E - L > + / > +C - D - > +| > +| - I - K > +| / | > +| J - | > ++-------------------------+ > + > +where A, B, C,..., L are different lock classes. > + > +Adding nodes and edges gives us additional opportunity to check if it > +causes deadlock or not, so makes the detection stronger. > + > + > +================= > +What is a problem > +================= > + > +Examples > +-------- > + > +Can we detect deadlocks descriped below, with lockdep? No. > + > +Example 1) > + > + PROCESS X PROCESS Y > + -------------- -------------- > + mutext_lock A > + lock_page B > + lock_page B > + mutext_lock A // DEADLOCK > + unlock_page B > + mutext_unlock A > + mutex_unlock A > + unlock_page B > + > +We are not checking the dependency for lock_page() at all now. > + > +Example 2) > + > + PROCESS X PROCESS Y PROCESS Z > + -------------- -------------- -------------- > + mutex_lock A > + lock_page B > + lock_page B > + mutext_lock A // DEADLOCK > + mutext_unlock A > + unlock_page B > + (B was held by PROCESS X) > + unlock_page B > + mutex_unlock A > + > +We cannot detect this kind of deadlock with lockdep, even though we > +apply the dependency check using lockdep on lock_page(). > + > +Example 3) > + > + PROCESS X PROCESS Y > + -------------- -------------- > + mutex_lock A > + mutex_lock A > + mutex_unlock A > + wait_for_complete B // DEADLOCK > + complete B > + mutex_unlock A > + > +wait_for_complete() and complete() also can cause a deadlock, however > +we cannot detect it with lockdep, either. > + > + > +lockdep's assumption > +-------------------- > + > +Lockdep (not crossrelease featured one) assumes that, > + > + A lock will be unlocked within the context holding the lock. > + > +Thus we can say that a lock has dependency with all locks being already > +in held_locks. So we can check dependency and build the dependency graph > +when acquiring it. > + > + > +lockdep's limitation > +-------------------- > + > +It can be applied only to typical lock operations, e.g. spin_lock, > +mutex and so on. However, e.g. lock_page() cannot play with the lockdep > +thingy because it violates assumptions of lockdep, even though it's > +considered as a lock for the page access. > + > +In the view point of lockdep, it must be released within the context > +which held the lock, however, the page lock can be released by different > +context from the context which held it. Another example, > +wait_for_complete() and complete() are also the case by nature, which > +the lockdep cannot deal with. > + > + > +======================== > +How to solve the problem > +======================== > + > +What causes deadlock > +-------------------- > + > +Not only lock operations, but also any operations causing to wait or > +spin for something can cause deadlock unless it's eventually *released* > +by someone. The important point here is that the waiting or spinning > +must be *released* by someone. In other words, we have to focus whether > +the waiting or spinning can be *released* or not to check a deadlock > +possibility, rather than the waiting or spinning itself. > + > +In this point of view, typical lock is a special case where the acquire > +context is same as the release context, so no matter in which context > +the checking is performed for typical lock. > + > +Of course, in order to be able to report deadlock imediately at the time > +real deadlock actually occures, the checking must be performed before > +actual blocking or spinning happens when acquiring it. However, deadlock > +*possibility* can be detected and reported even the checking is done > +when releasing it, which means the time we can identify the release > +context. > + > + > +Relax the assumption > +-------------------- > + > +Since whether waiting or spinning can be released or not is more > +important to check deadlock possibility as decribed above, we can relax > +the assumtion the lockdep has. > + > + A lock can be unlocked in any context, unless the context > + itself causes a deadlock e.g. acquiring a lock in irq-safe > + context before releasing the lock in irq-unsafe context. > + > +A lock has dependency with all locks in the release context, having been > +held since the lock was held. Thus basically we can check the dependency > +only after we identify the release context at first. Of course, we can > +identify the release context at the time acquiring a lock for typical > +lock because the acquire context is same as the release context. > + > +However, generally if we cannot identify the release context at the time > +acquiring a lock, we have to wait until the lock having been held will > +be eventually released to identify the release context. > + > + > +Relax the limitation > +-------------------- > + > +Given that the assumption is relaxed, we can check dependency and detect > +deadlock possibility not only for typical lock, but also for lock_page() > +using PG_locked, wait_for_xxx() and so on which might be released by > +different context from the context which held the lock. > + > + > +============== > +Implementation > +============== > + > +Introduce crosslock > +------------------- > + > +Crossrelease feature names a lock "crosslock" if it is releasable by a > +different context from the context which held the lock. All locks > +having been held in the context unlocking the crosslock, since the > +crosslock was held until eventually it's unlocked, have dependency with > +the crosslock. That's the key idea to implement crossrelease feature. > + > + > +Introduce commit stage > +---------------------- > + > +Crossrelease feature names it "commit", to check dependency and build > +the dependency graph and chain in batches. Lockdep already performs what > +the commit does, when acquiring a lock. This way it works for typical > +lock, since it's guarrented that the acquire context is same as the > +release context for that. However, that way must be changed for > +crosslock so that it identify the release context when releasing the > +crosslock and then performs the commit. > + > +Let's demonstrate it though several examples. > + > +The below is how the current lockdep works for typical lock. Note that > +context 1 == context 2 for typical lock. > + > + CONTEXT 1 CONTEXT 2 > + acquiring A releasing A > + ------------- ------------- > + acquire A acquire A > + > + acquire B acquire B -> build "A - B" > + > + acquire C acquire C -> build "A - C" > + > + release C release C > + > + release B release B > + > + release A release A > + > +After building "A - B", the dependency graph forms like, > + > +A - B > + > +And after building "A - C", the dependency graph forms like, > + > + - B > + / > +A - > + \ > + - C > + > +What if we apply the commit to lockdep for typical lock? Of course, it's > +not necessary for typical lock. Just a example for what the commit does. > + > + CONTEXT 1 CONTEXT 2 > + acquiring A releasing A > + ------------- ------------- > + acquire A acquire A -> mark A started > + > + acquire B acquire B -> queue B > + > + acquire C acquire C -> queue C > + > + release C release C > + > + release B release B > + > + release A release A -> commit A (build "A - B", "A - C") > + > +After commiting A, the dependency graph forms like, at a time > + > + - B > + / > +A - > + \ > + - C > + > +Here we can see both the former and the latter end in building a same > +dependency graph consisting of "A - B" and "A - C". Of course, the > +former can build the graph earlier than the latter, which means the > +former can detect a deadlock sooner, maybe, as soon as possible. So the > +former would be prefered if possible. > + > +Let's look at the way the commit works for crosslock. > + > + CONTEXT 1 CONTEXT 2 > + acquiring A releasing A > + ------------- ------------- > + lock A > + acquire A -> mark A started > + > + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <- serialized by some means > + > + acquire X -> queue X > + lock X > + acquire B > + release X > + acquire C > + acquire Y -> queue Y > + release C > + release Y > + release B > + release A -> commit A (build "A - X", "A - Y") > + > +where A is a crosslock and the others are typical locks. > + > +Since a crosslock is held, it starts to queue all candidates whenever > +acquiring typical lock, and keep it until finally it will be released. > +Then it can commit all proper candidates queued until now. In other > +words, it checks dependency and builds the dependency graph and chain > +at the commit stage. > + > +And it is serialized so that the sequence, A -> X -> Y can be seen, > +using atomic operations and proper barriers. > + > + > +Acquire vs commit vs release > +---------------------------- > + > +What the lockdep should do in each stage to make it work even for > +crosslock is like, (Note that it does not change any current logic > +by which lockdep works, but only adds additional detection capability.) > + > +1. Acquire > + > + 1) For typical lock > + > + It's queued into additional data structure so that the > + commit stage can check depedndency and build the > + dependency graph and chain with this later. > + > + 2) For crosslock > + > + It's added to the global linked list so that the commit > + stage can check dependency and build the dependency > + graph and chain with this later. > + > +2. Commit > + > + 1) For typical lock > + > + N/A. > + > + 2) For crosslock > + > + It checks dependency and builds the dependency graph and > + chain with data saved in the acquire stage. Here, we > + establish dependency between the crosslock we are > + unlocking now and all locks in that context, having been > + held since the lock was held. Of course, it avoids > + unnecessary checking and building as far as possible. > + > +3. Release > + > + 1) For typical lock > + > + No change. > + > + 2) For crosslock > + > + Just remove this crosslock from the global linked list, > + to which the crosslock was added at acquire stage. > + Release operation should be used with commit operation > + together for crosslock. > + > + > +Data structures > +--------------- > + > +Crossrelease feature introduces two new data structures. > + > +1. pend_lock (or plock) > + > + This is for keeping locks waiting to be committed so that the > + actual dependency graph and chain can be built in the commit > + stage. Every task_struct has a pend_lock array to keep these > + locks. pend_lock entry will be consumed and filled whenever > + lock_acquire() is called for typical lock and will be flushed, > + namely committed at proper time. And this array is managed by > + circular buffer mean. > + > +2. cross_lock (or xlock) > + > + This keeps some additional data only for crosslock. One > + instance exists per one crosslock's lockdep_map. > + lockdep_init_map_crosslock() should be used instead of > + lockdep_init_map() to use a lock as a crosslock. > + > + > +============= > +Optimizations > +============= > + > +Adding a pend_lock is an operation which very frequently happened > +because it happens whenever a typical lock is acquired. So the operation > +is implemented locklessly using rcu mechanism if possible. Unfortunitly, > +we cannot apply this optimization if any object managed by rcu e.g. > +xlock is on stack or somewhere else where it can be freed or destroyed > +unpredictably. > + > +And chain cache for crosslock is also used to avoid unnecessary checking > +and building dependency, like how the lockdep is already doing for that > +purpose for typical lock. > + > -- > 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>