Re: [PATCH 0/2] CodeSample: rwdeq

Dominik Dingel <dominik.dingel@xxxxxxxxx> · Wed, 19 Aug 2015 21:06:04 +0200

On Sun, 16 Aug 2015 23:22:19 -0700
"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> On Thu, Aug 13, 2015 at 09:43:01AM -0700, Paul E. McKenney wrote:
> > On Wed, Aug 12, 2015 at 09:55:22AM +0200, Dominik Dingel wrote:
> > > From: Dominik Dingel <dingel@xxxxxxxxxxxxxxxxxx>
> > > 
> > > Hi all,
> > > 
> > > here is a somehow naive implementation of a reader/writer lock based approch
> > > for the parallel deq problem.
> > > 
> > > The next step would be to use it as an introduction example in the text, as due
> > > its way of implementation (taking always a read lock, working always with an
> > > atomic counter) it is a less efficient way of doing things, compared with the
> > > tandem approch. Then the benefits of partitioning might be even more obvious.
> > > 
> > > Thanks
> > > 	Dominik
> > 
> > Queued, thank you, Dominik!
> > 
> > A quick unscientific run leads me to believe that this is slightly faster
> > than the hashed implementation (not bad!), but about twice as slow as
> > the tandem version.  Of course, considerable variation is to be expected
> > based on workload.

Sounds good, during sniff tests I saw even slightly less performance than your hashed implementation. 

> And, as discussed earlier, here the commit adding a Quick Quiz referring
> to your algorithm.  Thoughts?

Looks good! Thanks for putting it in.
My plan, as soon as I get to:
- Rework the section to start with my solution
- Show the disadvantages of my solution
  -> Atomic and read/writer lock (resulting in bouncing cache lines)
- Show your tandem solution which will avoid that overhead and will only
in the worst case really do synchronization of the complete data structure.

What do you think?

Thanks
	Dominik

> 							Thanx, Paul
> 
> ------------------------------------------------------------------------
> 
>     Add Quick Quiz for Dominik Dingel's solution
>     
>     Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> 
> diff --git a/SMPdesign/partexercises.tex b/SMPdesign/partexercises.tex
> index a44c285ce134..5d46e70028aa 100644
> --- a/SMPdesign/partexercises.tex
> +++ b/SMPdesign/partexercises.tex
> @@ -683,6 +683,15 @@ Michael~\cite{DBLP:conf/europar/Michael03}.\footnote{
>  	Instead, the common compare-and-swap (e.g., x86 cmpxchg)
>  	suffices.}
>  
> +\QuickQuiz{}
> +	Why are there not one but two solutions to the double-ended queue
> +	problem?
> +\QuickQuizAnswer{
> +	There are actually at least three.
> +	The third, by Dominik Dingel, makes interesting use of
> +	reader-writer locking, and may be found in \co{lockrwdeq.c}.
> +} \QuickQuizEnd
> +
>  In fact, as noted by Dice et al.~\cite{DavidDice:2010:SCA:HTM:deque},
>  an unsynchronized single-threaded double-ended queue significantly
>  outperforms any of the parallel implementations they studied.
> diff --git a/qqz.tex b/qqz.tex
> index 57cdc43f63d1..cfd0356854b7 100644
> --- a/qqz.tex
> +++ b/qqz.tex
> @@ -2561,6 +2561,14 @@ ret = ~ret & tmp;
>  	it is worthwhile) is left as an exercise for the reader.
>  
>  \QuickQ{}
> +	Why are there not one but two solutions to the double-ended queue
> +	problem?
> +\QuickA{}
> +	There are actually at least three.
> +	The third, by Dominik Dingel, makes interesting use of
> +	reader-writer locking, and may be found in \co{lockrwdeq.c}.
> +
> +\QuickQ{}
>  	The tandem double-ended queue runs about twice as fast as
>  	the hashed double-ended queue, even when I increase the
>  	size of the hash table to an insanely large number.
> 

--
To unsubscribe from this list: send the line "unsubscribe perfbook" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html