[RFC v2 13/13] lockdep: Add a document describing crossrelease feature

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Crossrelease feature introduces new concept and data structure. Thus
the document is necessary. So added it.

Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx>
---
 Documentation/locking/crossrelease.txt | 457 +++++++++++++++++++++++++++++++++
 1 file changed, 457 insertions(+)
 create mode 100644 Documentation/locking/crossrelease.txt

diff --git a/Documentation/locking/crossrelease.txt b/Documentation/locking/crossrelease.txt
new file mode 100644
index 0000000..51b583b
--- /dev/null
+++ b/Documentation/locking/crossrelease.txt
@@ -0,0 +1,457 @@
+Crossrelease dependency check
+=============================
+
+Started by Byungchul Park <byungchul.park@xxxxxxx>
+
+Contents:
+
+ (*) Introduction.
+
+     - What the lockdep checks.
+
+ (*) What is a problem?
+
+     - Examples.
+     - Lockdep's assumption.
+     - Lockdep's limitation.
+
+ (*) How to solve the problem.
+
+     - What causes deadlock?
+     - Relax the assumption.
+     - Relax the limitation.
+
+ (*) Implementation.
+
+     - Introduce crosslock.
+     - Introduce commit stage.
+     - Acquire vs commit vs release.
+     - Data structures.
+
+ (*) Optimizations.
+
+
+============
+Introduction
+============
+
+What the lockdep checks
+-----------------------
+
+Lockdep checks dependency between locks and reports it if a deadlock
+possibility is detected in advance or if an actual deadlock occures at
+the time checking it.
+
+The former is more valuable because the possibility detection without
+lockdep is much harder. When an actual deadlock occures, we can identify
+what happens in the system by some means or other, even without lockdep.
+
+It checks dependency between locks (exactly lock classes, see
+Documentation/locking/lockdep-design.txt for more) and builds the
+relationship between them when the lock is acquired. Eventually it forms
+the global dependency graph which connects between related locks based
+on dependency they have, like e.g.
+
+A - B -       - F - G
+       \     /
+        - E -       - L
+       /     \     /
+C - D -       - H -
+                   \
+                    - I - K
+                   /
+                J -
+
+where A, B, C,..., L are different lock classes.
+
+Lockdep basically works based on this global dependency graph. The more
+nodes it has, the stronger check can be performed.
+
+For example, lockdep can detect a deadlock when acquiring a C class lock
+while it already held a K class lock, because it forms a cycle which
+means deadlock, like
+
+A - B -       - F - G
+       \     /
+        - E -       - L
+       /     \     /
+C - D -       - H -
+|                  \
+|                   - I - K
+|                  /      |
+|               J -       |
++-------------------------+
+
+where A, B, C,..., L are different lock classes.
+
+If any of nodes making up the cycle was not built into the graph, we
+cannot detect this kind of deadlock because there's no cycle. For
+example, it might be
+
+A - B -       - F - G
+       \     /
+        - E -       L
+       /
+C - D -
+|
+|                   - I - K
+|                  /      |
+|               J -       |
++-------------------------+
+
+where A, B, C,..., L are different lock classes.
+
+Adding nodes and edges gives us additional opportunity to check if it
+causes deadlock or not, so makes the detection stronger.
+
+
+=================
+What is a problem
+=================
+
+Examples
+--------
+
+Can we detect deadlocks descriped below, with lockdep? No.
+
+Example 1)
+
+	PROCESS X	PROCESS Y
+	--------------	--------------
+	mutext_lock A
+			lock_page B
+	lock_page B
+			mutext_lock A // DEADLOCK
+	unlock_page B
+			mutext_unlock A
+	mutex_unlock A
+			unlock_page B
+
+We are not checking the dependency for lock_page() at all now.
+
+Example 2)
+
+	PROCESS X	PROCESS Y	PROCESS Z
+	--------------	--------------	--------------
+			mutex_lock A
+	lock_page B
+			lock_page B
+					mutext_lock A // DEADLOCK
+					mutext_unlock A
+					unlock_page B
+					(B was held by PROCESS X)
+			unlock_page B
+			mutex_unlock A
+
+We cannot detect this kind of deadlock with lockdep, even though we
+apply the dependency check using lockdep on lock_page().
+
+Example 3)
+
+	PROCESS X	PROCESS Y
+	--------------	--------------
+			mutex_lock A
+	mutex_lock A
+	mutex_unlock A
+			wait_for_complete B // DEADLOCK
+	complete B
+			mutex_unlock A
+
+wait_for_complete() and complete() also can cause a deadlock, however
+we cannot detect it with lockdep, either.
+
+
+lockdep's assumption
+--------------------
+
+Lockdep (not crossrelease featured one) assumes that,
+
+	A lock will be unlocked within the context holding the lock.
+
+Thus we can say that a lock has dependency with all locks being already
+in held_locks. So we can check dependency and build the dependency graph
+when acquiring it.
+
+
+lockdep's limitation
+--------------------
+
+It can be applied only to typical lock operations, e.g. spin_lock,
+mutex and so on. However, e.g. lock_page() cannot play with the lockdep
+thingy because it violates assumptions of lockdep, even though it's
+considered as a lock for the page access.
+
+In the view point of lockdep, it must be released within the context
+which held the lock, however, the page lock can be released by different
+context from the context which held it. Another example,
+wait_for_complete() and complete() are also the case by nature, which
+the lockdep cannot deal with.
+
+
+========================
+How to solve the problem
+========================
+
+What causes deadlock
+--------------------
+
+Not only lock operations, but also any operations causing to wait or
+spin for something can cause deadlock unless it's eventually *released*
+by someone. The important point here is that the waiting or spinning
+must be *released* by someone. In other words, we have to focus whether
+the waiting or spinning can be *released* or not to check a deadlock
+possibility, rather than the waiting or spinning itself.
+
+In this point of view, typical lock is a special case where the acquire
+context is same as the release context, so no matter in which context
+the checking is performed for typical lock.
+
+Of course, in order to be able to report deadlock imediately at the time
+real deadlock actually occures, the checking must be performed before
+actual blocking or spinning happens when acquiring it. However, deadlock
+*possibility* can be detected and reported even the checking is done
+when releasing it, which means the time we can identify the release
+context.
+
+
+Relax the assumption
+--------------------
+
+Since whether waiting or spinning can be released or not is more
+important to check deadlock possibility as decribed above, we can relax
+the assumtion the lockdep has.
+
+	A lock can be unlocked in any context, unless the context
+	itself causes a deadlock e.g. acquiring a lock in irq-safe
+	context before releasing the lock in irq-unsafe context.
+
+A lock has dependency with all locks in the release context, having been
+held since the lock was held. Thus basically we can check the dependency
+only after we identify the release context at first. Of course, we can
+identify the release context at the time acquiring a lock for typical
+lock because the acquire context is same as the release context.
+
+However, generally if we cannot identify the release context at the time
+acquiring a lock, we have to wait until the lock having been held will
+be eventually released to identify the release context.
+
+
+Relax the limitation
+--------------------
+
+Given that the assumption is relaxed, we can check dependency and detect
+deadlock possibility not only for typical lock, but also for lock_page()
+using PG_locked, wait_for_xxx() and so on which might be released by
+different context from the context which held the lock.
+
+
+==============
+Implementation
+==============
+
+Introduce crosslock
+-------------------
+
+Crossrelease feature names a lock "crosslock" if it is releasable by a
+different context from the context which held the lock. All locks
+having been held in the context unlocking the crosslock, since the
+crosslock was held until eventually it's unlocked, have dependency with
+the crosslock. That's the key idea to implement crossrelease feature.
+
+
+Introduce commit stage
+----------------------
+
+Crossrelease feature names it "commit", to check dependency and build
+the dependency graph and chain in batches. Lockdep already performs what
+the commit does, when acquiring a lock. This way it works for typical
+lock, since it's guarrented that the acquire context is same as the
+release context for that. However, that way must be changed for
+crosslock so that it identify the release context when releasing the
+crosslock and then performs the commit.
+
+Let's demonstrate it though several examples.
+
+The below is how the current lockdep works for typical lock. Note that
+context 1 == context 2 for typical lock.
+
+	CONTEXT 1	CONTEXT 2
+	acquiring A	releasing A
+	-------------	-------------
+	acquire A	acquire A
+
+	acquire B	acquire B -> build "A - B"
+
+	acquire C	acquire C -> build "A - C"
+
+	release C	release C
+
+	release B	release B
+
+	release A	release A
+
+After building "A - B", the dependency graph forms like,
+
+A - B
+
+And after building "A - C", the dependency graph forms like,
+
+    - B
+   /
+A -
+   \
+    - C
+
+What if we apply the commit to lockdep for typical lock? Of course, it's
+not necessary for typical lock. Just a example for what the commit does.
+
+	CONTEXT 1	CONTEXT 2
+	acquiring A	releasing A
+	-------------	-------------
+	acquire A	acquire A -> mark A started
+
+	acquire B	acquire B -> queue B
+
+	acquire C	acquire C -> queue C
+
+	release C	release C
+
+	release B	release B
+
+	release A	release A -> commit A (build "A - B", "A - C")
+
+After commiting A, the dependency graph forms like, at a time
+
+    - B
+   /
+A -
+   \
+    - C
+
+Here we can see both the former and the latter end in building a same
+dependency graph consisting of "A - B" and "A - C". Of course, the
+former can build the graph earlier than the latter, which means the
+former can detect a deadlock sooner, maybe, as soon as possible. So the
+former would be prefered if possible.
+
+Let's look at the way the commit works for crosslock.
+
+	CONTEXT 1	CONTEXT 2
+	acquiring A	releasing A
+	-------------	-------------
+	lock A
+	acquire A -> mark A started
+
+	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <- serialized by some means
+
+			acquire X -> queue X
+			lock X
+	acquire B
+			release X
+	acquire C
+			acquire Y -> queue Y
+	release C
+			release Y
+	release B
+			release A -> commit A (build "A - X", "A - Y")
+
+where A is a crosslock and the others are typical locks.
+
+Since a crosslock is held, it starts to queue all candidates whenever
+acquiring typical lock, and keep it until finally it will be released.
+Then it can commit all proper candidates queued until now. In other
+words, it checks dependency and builds the dependency graph and chain
+at the commit stage.
+
+And it is serialized so that the sequence, A -> X -> Y can be seen,
+using atomic operations and proper barriers.
+
+
+Acquire vs commit vs release
+----------------------------
+
+What the lockdep should do in each stage to make it work even for
+crosslock is like, (Note that it does not change any current logic
+by which lockdep works, but only adds additional detection capability.)
+
+1. Acquire
+
+	1) For typical lock
+
+		It's queued into additional data structure so that the
+		commit stage can check depedndency and build the
+		dependency graph and chain with this later.
+
+	2) For crosslock
+
+		It's added to the global linked list so that the commit
+		stage can check dependency and build the dependency
+		graph and chain with this later.
+
+2. Commit
+
+	1) For typical lock
+
+		N/A.
+
+	2) For crosslock
+
+		It checks dependency and builds the dependency graph and
+		chain with data saved in the acquire stage. Here, we
+		establish dependency between the crosslock we are
+		unlocking now and all locks in that context, having been
+		held since the lock was held. Of course, it avoids
+		unnecessary checking and building as far as possible.
+
+3. Release
+
+	1) For typical lock
+
+		No change.
+
+	2) For crosslock
+
+		Just remove this crosslock from the global linked list,
+		to which the crosslock was added at acquire stage.
+		Release operation should be used with commit operation
+		together for crosslock.
+
+
+Data structures
+---------------
+
+Crossrelease feature introduces two new data structures.
+
+1. pend_lock (or plock)
+
+	This is for keeping locks waiting to be committed so that the
+	actual dependency graph and chain can be built in the commit
+	stage. Every task_struct has a pend_lock array to keep these
+	locks. pend_lock entry will be consumed and filled whenever
+	lock_acquire() is called for typical lock and will be flushed,
+	namely committed at proper time. And this array is managed by
+	circular buffer mean.
+
+2. cross_lock (or xlock)
+
+	This keeps some additional data only for crosslock. One
+	instance exists per one crosslock's lockdep_map.
+	lockdep_init_map_crosslock() should be used instead of
+	lockdep_init_map() to use a lock as a crosslock.
+
+
+=============
+Optimizations
+=============
+
+Adding a pend_lock is an operation which very frequently happened
+because it happens whenever a typical lock is acquired. So the operation
+is implemented locklessly using rcu mechanism if possible. Unfortunitly,
+we cannot apply this optimization if any object managed by rcu e.g.
+xlock is on stack or somewhere else where it can be freed or destroyed
+unpredictably.
+
+And chain cache for crosslock is also used to avoid unnecessary checking
+and building dependency, like how the lockdep is already doing for that
+purpose for typical lock.
+
-- 
1.9.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]