Re: [PATCH v3 2/2] locking/qrwlock: Don't contend with readers when setting _QW_WAITING

On Mon, Jun 15, 2015 at 11:24:03PM +0100, Waiman Long wrote:
> The current cmpxchg() loop that sets the _QW_WAITING flag for writers
> in queue_write_lock_slowpath() contends with incoming readers, which
> can cause extra, wasteful cmpxchg() operations. This patch changes
> the code to do a byte cmpxchg() to eliminate contention with new
> readers.
> 
> A multithreaded microbenchmark running a 5M read_lock/write_lock loop
> on an 8-socket 80-core Westmere-EX machine running a 4.0-based kernel
> with the qspinlock patch has the following execution times (in ms)
> with and without the patch:
> 
> With R:W ratio = 5:1
> 
> 	Threads	   w/o patch	with patch	% change
> 	-------	   ---------	----------	--------
> 	   2	     990 	    895		  -9.6%
> 	   3	    2136 	   1912		 -10.5%
> 	   4	    3166	   2830		 -10.6%
> 	   5	    3953	   3629		  -8.2%
> 	   6	    4628	   4405		  -4.8%
> 	   7	    5344	   5197		  -2.8%
> 	   8	    6065	   6004		  -1.0%
> 	   9	    6826	   6811		  -0.2%
> 	  10	    7599	   7599		   0.0%
> 	  15	    9757	   9766		  +0.1%
> 	  20	   13767	  13817		  +0.4%
> 
> With a small number of contending threads, this patch can improve
> locking performance by up to 10%. With more contending threads,
> however, the gain diminishes.
> 
> Signed-off-by: Waiman Long <Waiman.Long@xxxxxx>
> ---
>  kernel/locking/qrwlock.c |   28 ++++++++++++++++++++++++----
>  1 files changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index d7d7557..559198a 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -22,6 +22,26 @@
>  #include <linux/hardirq.h>
>  #include <asm/qrwlock.h>
>  
> +/*
> + * This internal data structure is used for optimizing access to some of
> + * the subfields within the atomic_t cnts.
> + */
> +struct __qrwlock {
> +	union {
> +		atomic_t cnts;
> +		struct {
> +#ifdef __LITTLE_ENDIAN
> +			u8 wmode;	/* Writer mode   */
> +			u8 rcnts[3];	/* Reader counts */
> +#else
> +			u8 rcnts[3];	/* Reader counts */
> +			u8 wmode;	/* Writer mode   */
> +#endif
> +		};
> +	};
> +	arch_spinlock_t	lock;
> +};
> +
>  /**
>   * rspin_until_writer_unlock - inc reader count & spin until writer is gone
>   * @lock  : Pointer to queue rwlock structure
> @@ -109,10 +129,10 @@ void queue_write_lock_slowpath(struct qrwlock *lock)
>  	 * or wait for a previous writer to go away.
>  	 */
>  	for (;;) {
> -		cnts = atomic_read(&lock->cnts);
> -		if (!(cnts & _QW_WMASK) &&
> -		    (atomic_cmpxchg(&lock->cnts, cnts,
> -				    cnts | _QW_WAITING) == cnts))
> +		struct __qrwlock *l = (struct __qrwlock *)lock;
> +
> +		if (!READ_ONCE(l->wmode) &&
> +		   (cmpxchg(&l->wmode, 0, _QW_WAITING) == 0))
>  			break;

Maybe you could also update the x86 implementation of queue_write_unlock
to write the wmode field instead of casting to u8 *?
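
To illustrate what I mean, here is an untested user-space sketch, not the
actual kernel code. The demo_* functions and DEMO_* constants are
hypothetical stand-ins, the GCC/Clang __atomic builtins take the place of
the kernel's cmpxchg() and store primitives, and the constant values are
assumed to match the generic qrwlock header. It also assumes the hardware
tolerates mixed-size atomic accesses to the same word:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; values assumed to mirror the kernel's. */
#define DEMO_QW_WAITING	0x01u		/* a writer is waiting */
#define DEMO_QR_BIAS	(1u << 8)	/* one reader, above the writer byte */

/* Same overlay idea as struct __qrwlock in the patch, minus the spinlock. */
struct demo_qrwlock {
	union {
		uint32_t cnts;
		struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			uint8_t wmode;		/* Writer mode   */
			uint8_t rcnts[3];	/* Reader counts */
#else
			uint8_t rcnts[3];	/* Reader counts */
			uint8_t wmode;		/* Writer mode   */
#endif
		};
	};
};

static_assert(sizeof(struct demo_qrwlock) == sizeof(uint32_t),
	      "wmode must overlay the low byte of cnts");

/* A reader arriving only touches the upper 24 bits of cnts. */
static inline uint32_t demo_reader_enter(struct demo_qrwlock *l)
{
	return __atomic_add_fetch(&l->cnts, DEMO_QR_BIAS, __ATOMIC_ACQUIRE);
}

/*
 * Byte-wide cmpxchg on wmode: it succeeds regardless of how the reader
 * count changes underneath it. Returns 1 if _QW_WAITING was set.
 */
static inline int demo_set_waiting(struct demo_qrwlock *l)
{
	uint8_t expected = 0;

	if (__atomic_load_n(&l->wmode, __ATOMIC_RELAXED) != 0)
		return 0;
	return __atomic_compare_exchange_n(&l->wmode, &expected,
					   DEMO_QW_WAITING, 0,
					   __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* Unlock by storing to the named field, with no (u8 *) cast needed. */
static inline void demo_write_unlock(struct demo_qrwlock *l)
{
	__atomic_store_n(&l->wmode, 0, __ATOMIC_RELEASE);
}
```

The unlock then clears only the writer byte and leaves the reader count
alone, which is the same effect as the cast but reads more clearly.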

Will