From 32e5297fa529da9947e2a5a6450387bacf8c3d93 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@xxxxxxxxx>
Date: Wed, 20 Dec 2017 19:50:31 +0900
Subject: [PATCH 2/4] count: Get rid of ACCESS_ONCE() in text

Also adjust context by:

  o promoting the definition in Section 4.2.5 to a listing,
  o referencing the listing from footnotes in QQA instead of
    citing LWN article "ACCESS_ONCE", and
  o moving QQ on atomic operations in Section 4.2.5 before the
    discussion on memory/compiler barriers.

NOTE: ACCESS_ONCE() remains in the definition of
{READ|WRITE}_ONCE().

Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>
---
 count/count.tex               | 17 ++++++++++++-----
 toolsoftrade/toolsoftrade.tex | 40 ++++++++++++++++++++++------------------
 2 files changed, 34 insertions(+), 23 deletions(-)

diff --git a/count/count.tex b/count/count.tex
index 277627d..752371f 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -551,8 +551,12 @@ normal loads suffice, and no special atomic instructions are required.
 	but until then, we depend on the kindness of the \GCC\ developers.
 
 	Alternatively, use of volatile accesses such as those provided
-	by \co{ACCESS_ONCE()}~\cite{JonCorbet2012ACCESS:ONCE}
-	can help constrain the compiler, at least
+	by \co{READ_ONCE()} and \co{WRITE_ONCE()} can help constrain
+	the compiler,\footnote{
+		Simple definitions of \co{READ_ONCE()} and
+		\co{WRITE_ONCE()} are shown in
+		Listing~\ref{lst:toolsoftrade:Compiler Barrier Primitive (for GCC)}.}
+	at least
 	in cases where the hardware is capable of accessing the value
 	with a single memory-reference instruction.
 } \QuickQuizEnd
@@ -842,10 +846,12 @@ comes at the cost of the additional thread running \co{eventual()}.
 	Because one of the two threads only reads, and because the
 	variable is aligned and machine-sized, non-atomic instructions
 	suffice.
-	That said, the \co{ACCESS_ONCE()} macro is used to prevent
+	That said, the \co{READ_ONCE()} macro is used to prevent
 	compiler optimizations that might otherwise prevent the
 	counter updates from becoming visible to
-	\co{eventual()}~\cite{JonCorbet2012ACCESS:ONCE}.
+	\co{eventual()}.\footnote{
+		A simple definition of \co{READ_ONCE()} is shown in
+		Listing~\ref{lst:toolsoftrade:Compiler Barrier Primitive (for GCC)}.}
 
 	An older version of this algorithm did in fact use atomic
 	instructions, kudos to Ersoy Bayramoglu for pointing out that
@@ -2798,7 +2804,8 @@ state to READY.
 \QuickQuiz{}
 	In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
 	function \co{flush_local_count_sig()}, why are there
-	\co{ACCESS_ONCE()} wrappers around the uses of the
+	\co{READ_ONCE()} and \co{WRITE_ONCE()} wrappers around
+	the uses of the
 	\co{theft} per-thread variable?
 \QuickQuizAnswer{
 	The first one (on line~11) can be argued to be unnecessary.
diff --git a/toolsoftrade/toolsoftrade.tex b/toolsoftrade/toolsoftrade.tex
index 771d8a9..855ab61 100644
--- a/toolsoftrade/toolsoftrade.tex
+++ b/toolsoftrade/toolsoftrade.tex
@@ -1236,6 +1236,16 @@ for a wider set of atomic operations, though the more elaborate of
 these often suffer from complexity, scalability, and performance
 problems~\cite{MauriceHerlihy90a}.
 
+\QuickQuiz{}
+	Given that these atomic operations will often be able to
+	generate single atomic instructions that are directly
+	supported by the underlying instruction set, shouldn't
+	they be the fastest possible way to get things done?
+\QuickQuizAnswer{
+	Unfortunately, no.
+	See Chapter~\ref{chp:Counting} for some stark counterexamples.
+} \QuickQuizEnd
+
 The \co{__sync_synchronize()} primitive issues a ``memory barrier'',
 which constrains both the compiler's and the CPU's ability to reorder
 operations, as discussed in
@@ -1252,30 +1262,24 @@ Listing~\ref{lst:toolsoftrade:Demonstration of Exclusive Locks}.
 Similarly, the \co{WRITE_ONCE()} primitive may be used to prevent the
 compiler from optimizing away a given memory write.
 These last three primitives are not provided directly by \GCC,
-but may be implemented straightforwardly as follows:
+but may be implemented straightforwardly as shown in
+Listing~\ref{lst:toolsoftrade:Compiler Barrier Primitive (for GCC)}.
 
-\vspace{5pt}
-\begin{minipage}[t]{\columnwidth}
-\scriptsize
-\begin{verbatim}
+\begin{listing}[tb]
+{ \scriptsize
+\begin{verbbox}
 #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
 #define READ_ONCE(x) \
                 ({ typeof(x) ___x = ACCESS_ONCE(x); ___x; })
 #define WRITE_ONCE(x, val) ({ ACCESS_ONCE(x) = (val); })
 #define barrier() __asm__ __volatile__("": : :"memory")
-\end{verbatim}
-\end{minipage}
-\vspace{5pt}
-
-\QuickQuiz{}
-	Given that these atomic operations will often be able to
-	generate single atomic instructions that are directly
-	supported by the underlying instruction set, shouldn't
-	they be the fastest possible way to get things done?
-\QuickQuizAnswer{
-	Unfortunately, no.
-	See Chapter~\ref{chp:Counting} for some stark counterexamples.
-} \QuickQuizEnd
+\end{verbbox}
+}
+\centering
+\theverbbox
+\caption{Compiler Barrier Primitive (for \GCC)}
+\label{lst:toolsoftrade:Compiler Barrier Primitive (for GCC)}
+\end{listing}
 
 \subsection{Atomic Operations (C11)}
 \label{sec:toolsoftrade:Atomic Operations (C11)}
-- 
2.7.4