Re: [PATCH] locking/mutexes: Revert "locking/mutexes: Add extra reschedule point"

On Thu, Jul 31, 2014 at 02:16:37PM +0400, Ilya Dryomov wrote:
> This reverts commit 34c6bc2c919a55e5ad4e698510a2f35ee13ab900.
> 
> This commit can lead to deadlocks through what appears, at a high
> level, to be a missing wakeup on mutex_unlock() when
> CONFIG_MUTEX_SPIN_ON_OWNER is set, which is how most distributions ship
> their kernels.  In particular, it causes reproducible deadlocks in
> libceph/rbd code under higher-than-moderate load, with the evidence
> pointing to the bowels of mutex_lock().
> 
> kernel/locking/mutex.c, __mutex_lock_common():
> 476         osq_unlock(&lock->osq);
> 477 slowpath:
> 478         /*
> 479          * If we fell out of the spin path because of need_resched(),
> 480          * reschedule now, before we try-lock the mutex. This avoids getting
> 481          * scheduled out right after we obtained the mutex.
> 482          */
> 483         if (need_resched())
> 484                 schedule_preempt_disabled(); <-- never returns
> 485 #endif
> 486         spin_lock_mutex(&lock->wait_lock, flags);
> 
> We started hitting deadlocks in QA the day our branch was rebased
> onto 3.15 (the release this commit went into).  However, as part of the
> debugging effort I enabled all locking debug options, which also
> disabled CONFIG_MUTEX_SPIN_ON_OWNER and made the deadlocks disappear,
> which is why this hasn't been looked into until now.  Reverting the
> commit makes the problem go away, as confirmed by our users.

This doesn't make sense and you fail to explain how this can possibly
deadlock.
