Re: [PATCH] rtmutex: enable deadlock detection in ww_mutex_lock functions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I'm not sure if those patches are connected, so I am posting my findings for anyone.

ABOUT NOUVEAU AND WW_MUTEX:
The nouveau driver intensely uses the ww_mutex. According the ww-mutex-design.txt documentation, the ww_mutex makes the following steps when try to lock a resource:

1) It calls the ww_mutex_lock (or ww_mutex_lock_interruptible) function for normal lock acquisition. 2) If it returns -EDEADLK, the wounded task must drop all already acquired locks and call the ww_mutex_lock_slow (or ww_mutex_lock_slow_interruptible) function.

In the nouveau driver, those steps happen in the validate_init function (drivers/gpu/drm/nouveau_gem.c) when it calls the functions ttm_bo_reserve (step 1) and ttm_bo_reserve_slowpath (step 2):

validate_init(...)
{
...
		ret = ttm_bo_reserve(&nvbo->bo, true, false, true, &op->ticket);
		if (ret) {
			validate_fini_no_ticket(op, NULL);
			if (unlikely(ret == -EDEADLK)) {
				ret = ttm_bo_reserve_slowpath(&nvbo->bo, true,
							      &op->ticket);
...
}

THE PROBLEM:
When I run the PREEMPT_RT in my computer, it freezes as soon I start some big graphical application (e.g. Firefox or Chrome). The following execution path happens until reaching an infinite loop:

a) nouveau_gem_ioctl_pushbuf()
b) nouveau_gem_pushbuf_validate()
c) validate_init()
d) ttm_bo_reserve()
e) ttm_bo_reserve_nolru()
f) ww_mutex_lock_interruptible()
g) __ww_mutex_lock_interruptible()
h) rt_mutex_slowlock()
i) rt_mutex_handle_deadlock() -> infinite loop

The infinite loop that I mentioned above is the while(1){...} inside the rt_mutex_handle_deadlock function:

static void rt_mutex_handle_deadlock(int res, int detect_deadlock,
                     struct rt_mutex_waiter *w)
{
    /*
     * If the result is not -EDEADLOCK or the caller requested
     * deadlock detection, nothing to do here.
     */
    if (res != -EDEADLOCK || detect_deadlock)
        return;

    /*
     * Yell lowdly and stop the task right here.
     */
    rt_mutex_print_deadlock(w);
    while (1) {
        set_current_state(TASK_INTERRUPTIBLE);
        schedule();
    }
}

CONCLUSION:
My conclusion was the detect_deadlock should be enabled to the rt_mutex_handle_deadlock function returns -EDEADLK and hence the ww_mutex_lock_interruptible returns -EDEADLK too. BTW, since I have applied my patch, I've not had anymore freezing issue.

PS: I am a kernelnewbie so it is possible that I am writing some BS here.

On 01/25/2015 06:55 PM, Paul Gortmaker wrote:
On Tue, Jan 20, 2015 at 3:02 PM, Gustavo Bittencourt <gbitten@xxxxxxxxx> wrote:
The functions ww_mutex_lock_interruptible and ww_mutex_lock should return -EDEADLK when faced with
a deadlock. To do so, the paramenter detect_deadlock in rt_mutex_slowlock must be TRUE.
This patch corrects potential deadlocks when running PREEMPT_RT with nouveau driver.

Kernel version: 3.14.25-rt22

You might want to have a look at this:

http://www.spinics.net/lists/linux-rt-users/msg12259.html

as it also impacts what the deadlock detection flag does.

Paul.
--


Signed-off-by: Gustavo Bittencourt <gbitten@xxxxxxxxx>
---
  kernel/locking/rtmutex.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 6c40660..3f6ef91 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1965,7 +1965,7 @@ __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ww_c
         might_sleep();

         mutex_acquire(&lock->base.dep_map, 0, 0, _RET_IP_);
-       ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL, 0, ww_ctx);
+       ret = rt_mutex_slowlock(&lock->base.lock, TASK_INTERRUPTIBLE, NULL, 1, ww_ctx);
         if (ret)
                 mutex_release(&lock->base.dep_map, 1, _RET_IP_);
         else if (!ret && ww_ctx->acquired > 1)
@@ -1984,7 +1984,7 @@ __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx)

         mutex_acquire_nest(&lock->base.dep_map, 0, 0, &ww_ctx->dep_map,
                         _RET_IP_);
-       ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL, 0, ww_ctx);
+       ret = rt_mutex_slowlock(&lock->base.lock, TASK_UNINTERRUPTIBLE, NULL, 1, ww_ctx);
         if (ret)
                 mutex_release(&lock->base.dep_map, 1, _RET_IP_);
         else if (!ret && ww_ctx->acquired > 1)
--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [RT Stable]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]

  Powered by Linux