On Mon, Jul 18, 2016 at 10:39:19AM +0900, Byungchul Park wrote: > On Thu, Jul 07, 2016 at 06:30:03PM +0900, Byungchul Park wrote: > > Crossrelease feature introduces new concept and data structure. Thus > > the document is necessary. So added it. > > Any opinions about this suggestion? Just to be sure, I have to enhance this document more yet, however I'm sure I already introduced the general outline through this document. It was the purpose of this premature document which I sent early even though it's not perfect yet. > > > > > Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx> > > --- > > Documentation/locking/crossrelease.txt | 457 +++++++++++++++++++++++++++++++++ > > 1 file changed, 457 insertions(+) > > create mode 100644 Documentation/locking/crossrelease.txt > > > > diff --git a/Documentation/locking/crossrelease.txt b/Documentation/locking/crossrelease.txt > > new file mode 100644 > > index 0000000..51b583b > > --- /dev/null > > +++ b/Documentation/locking/crossrelease.txt > > @@ -0,0 +1,457 @@ > > +Crossrelease dependency check > > +============================= > > + > > +Started by Byungchul Park <byungchul.park@xxxxxxx> > > + > > +Contents: > > + > > + (*) Introduction. > > + > > + - What the lockdep checks. > > + > > + (*) What is a problem? > > + > > + - Examples. > > + - Lockdep's assumption. > > + - Lockdep's limitation. > > + > > + (*) How to solve the problem. > > + > > + - What causes deadlock? > > + - Relax the assumption. > > + - Relax the limitation. > > + > > + (*) Implementation. > > + > > + - Introduce crosslock. > > + - Introduce commit stage. > > + - Acquire vs commit vs release. > > + - Data structures. > > + > > + (*) Optimizations. > > + > > + > > +============ > > +Introduction > > +============ > > + > > +What the lockdep checks > > +----------------------- > > + > > +Lockdep checks dependency between locks and reports it if a deadlock > > +possibility is detected in advance or if an actual deadlock occures at > > +the time checking it. > > + > > +The former is more valuable because the possibility detection without > > +lockdep is much harder. When an actual deadlock occures, we can identify > > +what happens in the system by some means or other, even without lockdep. > > + > > +It checks dependency between locks (exactly lock classes, see > > +Documentation/locking/lockdep-design.txt for more) and builds the > > +relationship between them when the lock is acquired. Eventually it forms > > +the global dependency graph which connects between related locks based > > +on dependency they have, like e.g. > > + > > +A - B - - F - G > > + \ / > > + - E - - L > > + / \ / > > +C - D - - H - > > + \ > > + - I - K > > + / > > + J - > > + > > +where A, B, C,..., L are different lock classes. > > + > > +Lockdep basically works based on this global dependency graph. The more > > +nodes it has, the stronger check can be performed. > > + > > +For example, lockdep can detect a deadlock when acquiring a C class lock > > +while it already held a K class lock, because it forms a cycle which > > +means deadlock, like > > + > > +A - B - - F - G > > + \ / > > + - E - - L > > + / \ / > > +C - D - - H - > > +| \ > > +| - I - K > > +| / | > > +| J - | > > ++-------------------------+ > > + > > +where A, B, C,..., L are different lock classes. > > + > > +If any of nodes making up the cycle was not built into the graph, we > > +cannot detect this kind of deadlock because there's no cycle. For > > +example, it might be > > + > > +A - B - - F - G > > + \ / > > + - E - L > > + / > > +C - D - > > +| > > +| - I - K > > +| / | > > +| J - | > > ++-------------------------+ > > + > > +where A, B, C,..., L are different lock classes. > > + > > +Adding nodes and edges gives us additional opportunity to check if it > > +causes deadlock or not, so makes the detection stronger. > > + > > + > > +================= > > +What is a problem > > +================= > > + > > +Examples > > +-------- > > + > > +Can we detect deadlocks descriped below, with lockdep? No. > > + > > +Example 1) > > + > > + PROCESS X PROCESS Y > > + -------------- -------------- > > + mutext_lock A > > + lock_page B > > + lock_page B > > + mutext_lock A // DEADLOCK > > + unlock_page B > > + mutext_unlock A > > + mutex_unlock A > > + unlock_page B > > + > > +We are not checking the dependency for lock_page() at all now. > > + > > +Example 2) > > + > > + PROCESS X PROCESS Y PROCESS Z > > + -------------- -------------- -------------- > > + mutex_lock A > > + lock_page B > > + lock_page B > > + mutext_lock A // DEADLOCK > > + mutext_unlock A > > + unlock_page B > > + (B was held by PROCESS X) > > + unlock_page B > > + mutex_unlock A > > + > > +We cannot detect this kind of deadlock with lockdep, even though we > > +apply the dependency check using lockdep on lock_page(). > > + > > +Example 3) > > + > > + PROCESS X PROCESS Y > > + -------------- -------------- > > + mutex_lock A > > + mutex_lock A > > + mutex_unlock A > > + wait_for_complete B // DEADLOCK > > + complete B > > + mutex_unlock A > > + > > +wait_for_complete() and complete() also can cause a deadlock, however > > +we cannot detect it with lockdep, either. > > + > > + > > +lockdep's assumption > > +-------------------- > > + > > +Lockdep (not crossrelease featured one) assumes that, > > + > > + A lock will be unlocked within the context holding the lock. > > + > > +Thus we can say that a lock has dependency with all locks being already > > +in held_locks. So we can check dependency and build the dependency graph > > +when acquiring it. > > + > > + > > +lockdep's limitation > > +-------------------- > > + > > +It can be applied only to typical lock operations, e.g. spin_lock, > > +mutex and so on. However, e.g. lock_page() cannot play with the lockdep > > +thingy because it violates assumptions of lockdep, even though it's > > +considered as a lock for the page access. > > + > > +In the view point of lockdep, it must be released within the context > > +which held the lock, however, the page lock can be released by different > > +context from the context which held it. Another example, > > +wait_for_complete() and complete() are also the case by nature, which > > +the lockdep cannot deal with. > > + > > + > > +======================== > > +How to solve the problem > > +======================== > > + > > +What causes deadlock > > +-------------------- > > + > > +Not only lock operations, but also any operations causing to wait or > > +spin for something can cause deadlock unless it's eventually *released* > > +by someone. The important point here is that the waiting or spinning > > +must be *released* by someone. In other words, we have to focus whether > > +the waiting or spinning can be *released* or not to check a deadlock > > +possibility, rather than the waiting or spinning itself. > > + > > +In this point of view, typical lock is a special case where the acquire > > +context is same as the release context, so no matter in which context > > +the checking is performed for typical lock. > > + > > +Of course, in order to be able to report deadlock imediately at the time > > +real deadlock actually occures, the checking must be performed before > > +actual blocking or spinning happens when acquiring it. However, deadlock > > +*possibility* can be detected and reported even the checking is done > > +when releasing it, which means the time we can identify the release > > +context. > > + > > + > > +Relax the assumption > > +-------------------- > > + > > +Since whether waiting or spinning can be released or not is more > > +important to check deadlock possibility as decribed above, we can relax > > +the assumtion the lockdep has. > > + > > + A lock can be unlocked in any context, unless the context > > + itself causes a deadlock e.g. acquiring a lock in irq-safe > > + context before releasing the lock in irq-unsafe context. > > + > > +A lock has dependency with all locks in the release context, having been > > +held since the lock was held. Thus basically we can check the dependency > > +only after we identify the release context at first. Of course, we can > > +identify the release context at the time acquiring a lock for typical > > +lock because the acquire context is same as the release context. > > + > > +However, generally if we cannot identify the release context at the time > > +acquiring a lock, we have to wait until the lock having been held will > > +be eventually released to identify the release context. > > + > > + > > +Relax the limitation > > +-------------------- > > + > > +Given that the assumption is relaxed, we can check dependency and detect > > +deadlock possibility not only for typical lock, but also for lock_page() > > +using PG_locked, wait_for_xxx() and so on which might be released by > > +different context from the context which held the lock. > > + > > + > > +============== > > +Implementation > > +============== > > + > > +Introduce crosslock > > +------------------- > > + > > +Crossrelease feature names a lock "crosslock" if it is releasable by a > > +different context from the context which held the lock. All locks > > +having been held in the context unlocking the crosslock, since the > > +crosslock was held until eventually it's unlocked, have dependency with > > +the crosslock. That's the key idea to implement crossrelease feature. > > + > > + > > +Introduce commit stage > > +---------------------- > > + > > +Crossrelease feature names it "commit", to check dependency and build > > +the dependency graph and chain in batches. Lockdep already performs what > > +the commit does, when acquiring a lock. This way it works for typical > > +lock, since it's guarrented that the acquire context is same as the > > +release context for that. However, that way must be changed for > > +crosslock so that it identify the release context when releasing the > > +crosslock and then performs the commit. > > + > > +Let's demonstrate it though several examples. > > + > > +The below is how the current lockdep works for typical lock. Note that > > +context 1 == context 2 for typical lock. > > + > > + CONTEXT 1 CONTEXT 2 > > + acquiring A releasing A > > + ------------- ------------- > > + acquire A acquire A > > + > > + acquire B acquire B -> build "A - B" > > + > > + acquire C acquire C -> build "A - C" > > + > > + release C release C > > + > > + release B release B > > + > > + release A release A > > + > > +After building "A - B", the dependency graph forms like, > > + > > +A - B > > + > > +And after building "A - C", the dependency graph forms like, > > + > > + - B > > + / > > +A - > > + \ > > + - C > > + > > +What if we apply the commit to lockdep for typical lock? Of course, it's > > +not necessary for typical lock. Just a example for what the commit does. > > + > > + CONTEXT 1 CONTEXT 2 > > + acquiring A releasing A > > + ------------- ------------- > > + acquire A acquire A -> mark A started > > + > > + acquire B acquire B -> queue B > > + > > + acquire C acquire C -> queue C > > + > > + release C release C > > + > > + release B release B > > + > > + release A release A -> commit A (build "A - B", "A - C") > > + > > +After commiting A, the dependency graph forms like, at a time > > + > > + - B > > + / > > +A - > > + \ > > + - C > > + > > +Here we can see both the former and the latter end in building a same > > +dependency graph consisting of "A - B" and "A - C". Of course, the > > +former can build the graph earlier than the latter, which means the > > +former can detect a deadlock sooner, maybe, as soon as possible. So the > > +former would be prefered if possible. > > + > > +Let's look at the way the commit works for crosslock. > > + > > + CONTEXT 1 CONTEXT 2 > > + acquiring A releasing A > > + ------------- ------------- > > + lock A > > + acquire A -> mark A started > > + > > + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <- serialized by some means > > + > > + acquire X -> queue X > > + lock X > > + acquire B > > + release X > > + acquire C > > + acquire Y -> queue Y > > + release C > > + release Y > > + release B > > + release A -> commit A (build "A - X", "A - Y") > > + > > +where A is a crosslock and the others are typical locks. > > + > > +Since a crosslock is held, it starts to queue all candidates whenever > > +acquiring typical lock, and keep it until finally it will be released. > > +Then it can commit all proper candidates queued until now. In other > > +words, it checks dependency and builds the dependency graph and chain > > +at the commit stage. > > + > > +And it is serialized so that the sequence, A -> X -> Y can be seen, > > +using atomic operations and proper barriers. > > + > > + > > +Acquire vs commit vs release > > +---------------------------- > > + > > +What the lockdep should do in each stage to make it work even for > > +crosslock is like, (Note that it does not change any current logic > > +by which lockdep works, but only adds additional detection capability.) > > + > > +1. Acquire > > + > > + 1) For typical lock > > + > > + It's queued into additional data structure so that the > > + commit stage can check depedndency and build the > > + dependency graph and chain with this later. > > + > > + 2) For crosslock > > + > > + It's added to the global linked list so that the commit > > + stage can check dependency and build the dependency > > + graph and chain with this later. > > + > > +2. Commit > > + > > + 1) For typical lock > > + > > + N/A. > > + > > + 2) For crosslock > > + > > + It checks dependency and builds the dependency graph and > > + chain with data saved in the acquire stage. Here, we > > + establish dependency between the crosslock we are > > + unlocking now and all locks in that context, having been > > + held since the lock was held. Of course, it avoids > > + unnecessary checking and building as far as possible. > > + > > +3. Release > > + > > + 1) For typical lock > > + > > + No change. > > + > > + 2) For crosslock > > + > > + Just remove this crosslock from the global linked list, > > + to which the crosslock was added at acquire stage. > > + Release operation should be used with commit operation > > + together for crosslock. > > + > > + > > +Data structures > > +--------------- > > + > > +Crossrelease feature introduces two new data structures. > > + > > +1. pend_lock (or plock) > > + > > + This is for keeping locks waiting to be committed so that the > > + actual dependency graph and chain can be built in the commit > > + stage. Every task_struct has a pend_lock array to keep these > > + locks. pend_lock entry will be consumed and filled whenever > > + lock_acquire() is called for typical lock and will be flushed, > > + namely committed at proper time. And this array is managed by > > + circular buffer mean. > > + > > +2. cross_lock (or xlock) > > + > > + This keeps some additional data only for crosslock. One > > + instance exists per one crosslock's lockdep_map. > > + lockdep_init_map_crosslock() should be used instead of > > + lockdep_init_map() to use a lock as a crosslock. > > + > > + > > +============= > > +Optimizations > > +============= > > + > > +Adding a pend_lock is an operation which very frequently happened > > +because it happens whenever a typical lock is acquired. So the operation > > +is implemented locklessly using rcu mechanism if possible. Unfortunitly, > > +we cannot apply this optimization if any object managed by rcu e.g. > > +xlock is on stack or somewhere else where it can be freed or destroyed > > +unpredictably. > > + > > +And chain cache for crosslock is also used to avoid unnecessary checking > > +and building dependency, like how the lockdep is already doing for that > > +purpose for typical lock. > > + > > -- > > 1.9.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>