Re: [PATCH v6 04/11] qspinlock: Optimized code path for 2 contending tasks

Waiman Long <waiman.long@xxxxxx> · Mon, 17 Mar 2014 13:23:55 -0400

On 03/13/2014 09:57 AM, Peter Zijlstra wrote:
On Wed, Mar 12, 2014 at 03:08:24PM -0400, Waiman Long wrote:
On 03/12/2014 02:54 PM, Waiman Long wrote:
+		/*
+		 * Set the lock bit & clear the waiting bit simultaneously
+		 * It is assumed that there is no lock stealing with this
+		 * quick path active.
+		 *
+		 * A direct memory store of _QSPINLOCK_LOCKED into the
+		 * lock_wait field causes problem with the lockref code, e.g.
+		 *   ACCESS_ONCE(qlock->lock_wait) = _QSPINLOCK_LOCKED;
+		 *
+		 * It is not currently clear why this happens. A workaround
+		 * is to use atomic instruction to store the new value.
+		 */
+		{
+			u16 lw = xchg(&qlock->lock_wait, _QSPINLOCK_LOCKED);
+			BUG_ON(lw != _QSPINLOCK_WAITING);
+		}
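The handoff in the hunk above can be modelled in user space with C11 atomics. This is only a rough sketch, not the kernel code: the 16-bit values and the function name are assumptions made up for the example, and assert() stands in for BUG_ON(). It shows one atomic exchange writing the locked value (which also clears the waiting bit) and returning the old value so it can be checked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical values for the combined 16-bit lock + waiting field. */
#define HANDOFF_LOCKED   ((uint16_t)0x0001)  /* lock byte set          */
#define HANDOFF_WAITING  ((uint16_t)0x0100)  /* waiting bit, 2nd byte  */

/*
 * The waiter takes over the lock: a single atomic exchange stores the
 * locked value, clearing the waiting bit in the same operation, and
 * returns what was in the field before.
 */
uint16_t sketch_waiter_takes_lock(_Atomic uint16_t *lock_wait)
{
	uint16_t old = atomic_exchange(lock_wait, HANDOFF_LOCKED);

	/* Stand-in for BUG_ON(): only the waiting bit should have been set. */
	assert(old == HANDOFF_WAITING);
	(void)old;	/* keep the variable used when NDEBUG is defined */
	return old;
}
```

Unlike a plain store, the default atomic_exchange() is sequentially consistent, so it also acts as a full barrier around the handoff.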
It was found that when I used a direct memory store instead of an atomic op,
the following kernel crash might happen at filesystem dismount time:

[ 1529.936714] Call Trace:
[ 1529.936714]  [<ffffffff811c2d03>] d_walk+0xc3/0x260
[ 1529.936714]  [<ffffffff811c1770>] ? check_and_collect+0x30/0x30
[ 1529.936714]  [<ffffffff811c3985>] shrink_dcache_for_umount+0x75/0x120
[ 1529.936714]  [<ffffffff811adf21>] generic_shutdown_super+0x21/0xf0
[ 1529.936714]  [<ffffffff811ae207>] kill_block_super+0x27/0x70
[ 1529.936714]  [<ffffffff811ae4ed>] deactivate_locked_super+0x3d/0x60
[ 1529.936714]  [<ffffffff811aea96>] deactivate_super+0x46/0x60
[ 1529.936714]  [<ffffffff811ca277>] mntput_no_expire+0xa7/0x140
[ 1529.936714]  [<ffffffff811cb6ce>] SyS_umount+0x8e/0x100
[ 1529.936714]  [<ffffffff815d2c29>] system_call_fastpath+0x16/0x1b
It was more readily reproducible in a KVM guest. It was harder to reproduce
on a bare-metal machine, but the kernel crash still happened after several
tries.

I am not sure what exactly causes this crash, but it probably has something
to do with the interaction between the lockref and the qspinlock code. I
would like more eyes on this to find the root cause.
I cannot reproduce with my series that has the one word write.

What I did was I made my swap partition (who needs that anyway on a
machine with 16G of memory) into an XFS partition.

Then I copied my linux.git onto it and unmounted.

I'll try a few more times; the above trace seems to suggest it happens
during dcache cleanup, so I suppose I should read the filesystem some
and unmount again.

Is there anything specific you did to make it go bang?

I have found the reason for the crash: it has to do with my original
definition of the queue_spin_value_unlocked() function. When I extended
it to cover the first 2 bytes (lock + wait bit), the problem went away.
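
The difference between the two definitions can be sketched in user-space C. This is only an illustration, not the code from the patch: the bit values and function names are assumptions made up for the example. lockref updates its reference count with a cmpxchg only when the embedded lock value looks unlocked, so a check that ignores the waiting bit can presumably report "unlocked" in the window where the lock byte has been cleared but a waiter is about to take the lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for illustration only. */
#define SKETCH_LOCKED   0x0001u  /* lock byte (bits 0-7)           */
#define SKETCH_WAITING  0x0100u  /* waiting bit in the second byte */

/* Narrow check: looks at the lock byte only. */
static inline bool sketch_value_unlocked_byte(uint32_t qlcode)
{
	return (qlcode & 0xffu) == 0;
}

/* Wide check: looks at the lock byte and the waiting byte together. */
static inline bool sketch_value_unlocked_word(uint32_t qlcode)
{
	return (qlcode & 0xffffu) == 0;
}

/* Compare both checks on the state "lock byte clear, waiter pending". */
void sketch_check_handoff_window(void)
{
	uint32_t handoff = SKETCH_WAITING;

	assert(sketch_value_unlocked_byte(handoff));   /* looks free       */
	assert(!sketch_value_unlocked_word(handoff));  /* treated as held  */
}
```

Both checks agree when the value is 0 or when the lock byte is set; they differ only while the waiting bit is set and the lock byte is clear.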

-Longman