Re: [PATCH] advsync: Fix store-buffering sequence table

"Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> · Wed, 5 Jul 2017 08:40:24 -0700

On Wed, Jul 05, 2017 at 11:22:52PM +0900, Akira Yokosawa wrote:
> On 2017/07/04 15:21:38 -0700, Paul E. McKenney wrote:
> > On Wed, Jul 05, 2017 at 12:23:09AM +0900, Akira Yokosawa wrote:
> >> From 2845eb208a6e63493997de47293a47ef774a9d49 Mon Sep 17 00:00:00 2001
> >> From: Akira Yokosawa <akiyks@xxxxxxxxx>
> >> Date: Tue, 4 Jul 2017 23:18:30 +0900
> >> Subject: [PATCH] advsync: Fix store-buffering sequence table
> >>
> >> Row 6 of the table added in commit 2d5bf8d25a71 ("advsync: Add
> >> memory-barriered store-buffering example") needs some context
> >> adjustment.
> >>
> >> Also tweak horizontal spacing of wide tables for one-column layout.
> >> Also add a few words to the footnote giving definition of
> >> __atomic_thread_fence().
> >>
> >> Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>
> > 
> > Good catches!  Queued and pushed.  I reworded the footnote a bit, so
> > please let me know if I overdid it.
> 
> After your changes in commit 036372ac2573 ("advsync: Use gcc's C11-like
> intrinsics to avoid data races"), this footnote seems verbose, doesn't it?
> 
> But I'm not much of a fan of the changes in your commit.
> It becomes hard to see the relation between lines in the litmus tests
> and rows in the tables. Also, those intrinsics have fairly large overheads.

They certainly are ugly, no two ways about that!  ;-)

> How about using "volatile" in the thread argument declarations, as in the following?
> 
> ---
> C C-SB+o-o+o-o
> {
> }
> 
> P0(volatile int *x0, volatile int *x1)
> {
> 	int r2;
> 
> 	*x0 = 2;
> 	r2 = *x1;
> }
> 
> 
> P1(volatile int *x0, volatile int *x1)
> {
> 	int r2;
> 
> 	*x1 = 2;
> 	r2 = *x0;
> }
> 
> exists (1:r2=0 /\ 0:r2=0)
> ---
> 
> If all you need is to prevent memory accesses from being optimized away,
> the volatile qualifiers should suffice. But they might be unpopular in
> the kernel community.

To say nothing of their unpopularity among the C11 and C++11
communities!

> I checked the generated C code in cross-compiling mode of litmus7, and
> the volatile-ness is reflected there.

And it also works just fine without the volatile -- the litmus7 tool
does the translation so as to avoid destructive compiler optimizations.

I am checking with the litmus7 people to see if there is any way to
map identifiers.  Some of the other tools support a "-macros"
command-line argument, which would allow mapping from "smp_mb()" to
"__atomic_thread_fence(__ATOMIC_SEQ_CST)", but litmus7 does not.

So I cannot go with "volatile", but let's see if I can do something
better than the gcc intrinsics.
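
For reference, the identifier mapping mentioned above would amount to
something like the following sketch in plain C.  This is neither litmus7
output nor anything taken from perfbook: the pthreads harness, sb_run(),
the READ_RELAXED()/WRITE_RELAXED() helpers, and the result variables are
all hypothetical names made up for illustration.  Only smp_mb() and
__atomic_thread_fence(__ATOMIC_SEQ_CST) come from the discussion.

```c
#include <pthread.h>

/* Hypothetical mapping, in the spirit of a "-macros" file. */
#define smp_mb() __atomic_thread_fence(__ATOMIC_SEQ_CST)

/* Relaxed atomic accesses stand in for the litmus test's loads and stores. */
#define WRITE_RELAXED(p, v) __atomic_store_n((p), (v), __ATOMIC_RELAXED)
#define READ_RELAXED(p)     __atomic_load_n((p), __ATOMIC_RELAXED)

static int x0, x1;        /* shared variables, named as in the litmus test */
static int r2_p0, r2_p1;  /* each thread's r2 */

static void *P0(void *arg)
{
	(void)arg;
	WRITE_RELAXED(&x0, 2);
	smp_mb();
	r2_p0 = READ_RELAXED(&x1);
	return NULL;
}

static void *P1(void *arg)
{
	(void)arg;
	WRITE_RELAXED(&x1, 2);
	smp_mb();
	r2_p1 = READ_RELAXED(&x0);
	return NULL;
}

/* Run the test n times; return how often both threads read zero. */
int sb_run(int n)
{
	int hits = 0;

	for (int i = 0; i < n; i++) {
		pthread_t t0, t1;

		x0 = x1 = 0;
		pthread_create(&t0, NULL, P0, NULL);
		pthread_create(&t1, NULL, P1, NULL);
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);
		if (r2_p0 == 0 && r2_p1 == 0)
			hits++;
	}
	return hits;
}
```

Built with something like "gcc -pthread", sb_run(1000) should return 0,
because the two full fences forbid the outcome in which both loads see
zero.  Defining smp_mb() as empty would permit that outcome, although a
harness this crude is not good at provoking it, which is part of the
reason for using litmus7 in the first place.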

							Thanx, Paul

> Thoughts?
> 
>           Thanks, Akira
> 
> > 
> > 							Thanx, Paul
> > 
> >> ---
> >>  advsync/memorybarriers.tex | 10 ++++++----
> >>  1 file changed, 6 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/advsync/memorybarriers.tex b/advsync/memorybarriers.tex
> >> index 4ae3ca8..f26a7c5 100644
> >> --- a/advsync/memorybarriers.tex
> >> +++ b/advsync/memorybarriers.tex
> >> @@ -174,7 +174,7 @@ Figure~\ref{fig:advsync:Memory Misordering: Store-Buffering Litmus Test}.
> >>
> >>  \begin{table*}
> >>  \small
> >> -\centering
> >> +\centering\OneColumnHSpace{-.1in}
> >>  \begin{tabular}{r||l|l|l||l|l|l}
> >>  	& \multicolumn{3}{c||}{CPU 0} & \multicolumn{3}{c}{CPU 1} \\
> >>  	\cline{2-7}
> >> @@ -318,6 +318,8 @@ ordering and memory barriers work, read on!
> >>  The first stop is
> >>  Figure~\ref{fig:advsync:Memory Ordering: Store-Buffering Litmus Test},
> >>  which has \co{__atomic_thread_fence()} directives\footnote{
> >> +	One of GCC's atomic intrinsics briefly introduced in
> >> +	Section~\ref{sec:toolsoftrade:Atomic Operations (C11)}.
> >>  	Similar to the Linux kernel's \co{smp_mb()} full memory barrier.}
> >>  placed between
> >>  the store and load in both \co{P0()} and \co{P1()}, but is otherwise
> >> @@ -339,7 +341,7 @@ Figure~\ref{fig:advsync:Memory Misordering: Store-Buffering Litmus Test}.
> >>
> >>  \begin{table*}
> >>  \small
> >> -\centering
> >> +\centering\OneColumnHSpace{-0.75in}
> >>  \begin{tabular}{r||l|l|l||l|l|l}
> >>  	& \multicolumn{3}{c||}{CPU 0} & \multicolumn{3}{c}{CPU 1} \\
> >>  	\cline{2-7}
> >> @@ -362,8 +364,8 @@ Figure~\ref{fig:advsync:Memory Misordering: Store-Buffering Litmus Test}.
> >>  	5 & (Finish store) & & \tco{x0==2} &
> >>  		(Finish store) & & \tco{x1==2} \\
> >>  	\hline
> >> -	6 & \tco{r2 = *x1;} (2) & \tco{x0==2} & \tco{x1==0} &
> >> -		\tco{r2 = *x0;} (2) & \tco{x1==2} & \tco{x0==0} \\
> >> +	6 & \tco{r2 = *x1;} (2) & & \tco{x1==2} &
> >> +		\tco{r2 = *x0;} (2) & & \tco{x0==2} \\
> >>  \end{tabular}
> >>  \caption{Memory Ordering: Store-Buffering Sequence of Events}
> >>  \label{tab:advsync:Memory Ordering: Store-Buffering Sequence of Events}
> >> -- 
> >> 2.7.4
> >>
> > 
> > 
> 
