>From a13120c31bf160a5996c06d66f5e23ed9f1939f0 Mon Sep 17 00:00:00 2001
From: Akira Yokosawa <akiyks@xxxxxxxxx>
Date: Mon, 6 Jan 2020 17:34:16 +0900
Subject: [PATCH 2/6] Prevent section headings from being orphaned

Putting "\NoIndentAfterThis" where the source of a float comes just
below a section heading has a failure mode of possibly orphaning the
heading (leaving it at the bottom of a page/column).
E.g.: Section 10.2.2's heading in perfbook-1c.pdf (before this update).

Moving such float sources to just after the first paragraph of the
section is the right way, as mentioned in Section 4.7 of [1].
The exception to this rule is at or near the beginning of a chapter,
where a float has no chance of causing an actual page break.

This commit therefore moves such floats and removes \NoIndentAfterThis.
Some of the floats not at the beginning of a section are also moved to
avoid congestion of floats.

[1]: https://www.latex-project.org/publications/2014-FMi-TUB-tb111mitt-float-placement.pdf

Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>
---
 SMPdesign/beyond.tex       |  13 ++-
 appendix/toyrcu/toyrcu.tex | 167 +++++++++++++++----------------
 count/count.tex            |  86 ++++++++--------
 datastruct/datastruct.tex  |  77 +++++++--------
 defer/rcuapi.tex           |  32 +++---
 defer/rcuusage.tex         |  15 ++-
 formal/axiomatic.tex       |  24 ++---
 formal/dyntickrcu.tex      | 194 ++++++++++++++++++-------------------
 formal/ppcmem.tex          |  14 +--
 formal/spinhint.tex        |  59 ++++++-----
 locking/locking.tex        |  79 ++++++++------
 memorder/memorder.tex      |  60 ++++++------
 together/applyrcu.tex      |  48 ++++-----
 13 files changed, 423 insertions(+), 445 deletions(-)

diff --git a/SMPdesign/beyond.tex b/SMPdesign/beyond.tex
index 6a74e613..cb0008c2 100644
--- a/SMPdesign/beyond.tex
+++ b/SMPdesign/beyond.tex
@@ -57,7 +57,12 @@ presents future directions and concluding remarks.
 \subsection{Work-Queue Parallel Maze Solver}
 \label{sec:SMPdesign:Work-Queue Parallel Maze Solver}
-\NoIndentAfterThis
+
+PWQ is based on SEQ, which is shown in
+Listing~\ref{lst:SMPdesign:SEQ Pseudocode}
+(pseudocode for \path{maze_seq.c}).
+The maze is represented by a 2D array of cells and
+a linear-array-based work queue named \co{->visited}.
 
 \begin{listing}[tbp]
 \begin{linelabel}[ln:SMPdesign:SEQ Pseudocode]
@@ -90,12 +95,6 @@ int maze_solve(maze *mp, cell sc, cell ec)
 \label{lst:SMPdesign:SEQ Pseudocode}
 \end{listing}
 
-PWQ is based on SEQ, which is shown in
-Listing~\ref{lst:SMPdesign:SEQ Pseudocode}
-(pseudocode for \path{maze_seq.c}).
-The maze is represented by a 2D array of cells and
-a linear-array-based work queue named \co{->visited}.
-
 \begin{lineref}[ln:SMPdesign:SEQ Pseudocode]
 Line~\lnref{initcell} visits the initial cell, and each iteration of
 the loop spanning \clnrefrange{loop:b}{loop:e}
 traverses passages headed by one cell.
diff --git a/appendix/toyrcu/toyrcu.tex b/appendix/toyrcu/toyrcu.tex
index 92bd5ae7..07801b57 100644
--- a/appendix/toyrcu/toyrcu.tex
+++ b/appendix/toyrcu/toyrcu.tex
@@ -157,13 +157,6 @@ in the next section.
 \section{Per-Thread Lock-Based RCU}
 \label{sec:app:toyrcu:Per-Thread Lock-Based RCU}
 
-\begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_lock_percpu@lock_unlock.fcv}\vspace*{-11pt}\fvset{firstnumber=last}
-\input{CodeSamples/defer/rcu_lock_percpu@xxxxxxxx}\fvset{firstnumber=auto}
-\caption{Per-Thread Lock-Based RCU Implementation}
-\label{lst:app:toyrcu:Per-Thread Lock-Based RCU Implementation}
-\end{listing}
-
 \cref{lst:app:toyrcu:Per-Thread Lock-Based RCU Implementation}
 (\path{rcu_lock_percpu.h} and \path{rcu_lock_percpu.c})
 shows an implementation based on one lock per thread.
@@ -175,6 +168,13 @@ Therefore, all RCU read-side critical sections running
 when \co{synchronize_rcu()} starts must have completed before
 \co{synchronize_rcu()} can return.
 
+\begin{listing}[tbp]
+\input{CodeSamples/defer/rcu_lock_percpu@lock_unlock.fcv}\vspace*{-11pt}\fvset{firstnumber=last}
+\input{CodeSamples/defer/rcu_lock_percpu@xxxxxxxx}\fvset{firstnumber=auto}
+\caption{Per-Thread Lock-Based RCU Implementation}
+\label{lst:app:toyrcu:Per-Thread Lock-Based RCU Implementation}
+\end{listing}
+
 This implementation does have the virtue of permitting concurrent
 RCU readers, and does avoid the deadlock condition that can arise
 with a single global lock.
@@ -256,13 +256,6 @@ the shortcomings of the lock-based implementation.
 \section{Simple Counter-Based RCU}
 \label{sec:app:toyrcu:Simple Counter-Based RCU}
 
-\begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_rcg@lock_unlock.fcv}\vspace*{-11pt}\fvset{firstnumber=last}
-\input{CodeSamples/defer/rcu_rcg@xxxxxxxx}\fvset{firstnumber=auto}
-\caption{RCU Implementation Using Single Global Reference Counter}
-\label{lst:app:toyrcu:RCU Implementation Using Single Global Reference Counter}
-\end{listing}
-
 A slightly more sophisticated RCU implementation is shown in
 \cref{lst:app:toyrcu:RCU Implementation Using Single Global Reference Counter}
 (\path{rcu_rcg.h} and \path{rcu_rcg.c}).
@@ -284,6 +277,13 @@ Again, once \co{synchronize_rcu()} returns, all prior RCU read-side
 critical sections are guaranteed to have completed.
 \end{lineref}
 
+\begin{listing}[tbp]
+\input{CodeSamples/defer/rcu_rcg@lock_unlock.fcv}\vspace*{-11pt}\fvset{firstnumber=last}
+\input{CodeSamples/defer/rcu_rcg@xxxxxxxx}\fvset{firstnumber=auto}
+\caption{RCU Implementation Using Single Global Reference Counter}
+\label{lst:app:toyrcu:RCU Implementation Using Single Global Reference Counter}
+\end{listing}
+
 In happy contrast to the lock-based implementation shown in
 \cref{sec:app:toyrcu:Lock-Based RCU}, this implementation
 allows parallel execution of RCU read-side critical sections.
@@ -378,18 +378,6 @@ scheme that is more favorable to writers.
 \section{Starvation-Free Counter-Based RCU}
 \label{sec:app:toyrcu:Starvation-Free Counter-Based RCU}
 
-\begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_rcpg@xxxxxxxxxx}
-\caption{RCU Global Reference-Count Pair Data}
-\label{lst:app:toyrcu:RCU Global Reference-Count Pair Data}
-\end{listing}
-
-\begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_rcpg@xxxxx}
-\caption{RCU Read-Side Using Global Reference-Count Pair}
-\label{lst:app:toyrcu:RCU Read-Side Using Global Reference-Count Pair}
-\end{listing}
-
 \Cref{lst:app:toyrcu:RCU Read-Side Using Global Reference-Count Pair}
 (\path{rcu_rcpg.h})
 shows the read-side primitives of an RCU implementation that uses a pair
@@ -402,6 +390,18 @@ and a global lock (\co{rcu_gp_lock}),
 which are themselves shown in
 \cref{lst:app:toyrcu:RCU Global Reference-Count Pair Data}.
 
+\begin{listing}[tbp]
+\input{CodeSamples/defer/rcu_rcpg@xxxxxxxxxx}
+\caption{RCU Global Reference-Count Pair Data}
+\label{lst:app:toyrcu:RCU Global Reference-Count Pair Data}
+\end{listing}
+
+\begin{listing}[tbp]
+\input{CodeSamples/defer/rcu_rcpg@xxxxx}
+\caption{RCU Read-Side Using Global Reference-Count Pair}
+\label{lst:app:toyrcu:RCU Read-Side Using Global Reference-Count Pair}
+\end{listing}
+
 \paragraph{Design}
 It is the two-element \co{rcu_refcnt[]} array that provides the freedom
@@ -659,18 +659,6 @@ scheme that provides greatly improved read-side performance and
 scalability.
 \section{Scalable Counter-Based RCU}
 \label{sec:app:toyrcu:Scalable Counter-Based RCU}
 
-\begin{listing}[tb]
-\input{CodeSamples/defer/rcu_rcpl@xxxxxxxxxx}
-\caption{RCU Per-Thread Reference-Count Pair Data}
-\label{lst:app:toyrcu:RCU Per-Thread Reference-Count Pair Data}
-\end{listing}
-
-\begin{listing}[tb]
-\input{CodeSamples/defer/rcu_rcpl@xxxxx}
-\caption{RCU Read-Side Using Per-Thread Reference-Count Pair}
-\label{lst:app:toyrcu:RCU Read-Side Using Per-Thread Reference-Count Pair}
-\end{listing}
-
 \Cref{lst:app:toyrcu:RCU Read-Side Using Per-Thread Reference-Count Pair}
 (\path{rcu_rcpl.h})
 shows the read-side primitives of an RCU implementation that uses per-thread
@@ -686,6 +674,18 @@ One benefit of per-thread \co{rcu_refcnt[]} array is that the
 \co{rcu_read_lock()} and \co{rcu_read_unlock()} primitives no longer
 perform atomic operations.
 
+\begin{listing}[tb]
+\input{CodeSamples/defer/rcu_rcpl@xxxxxxxxxx}
+\caption{RCU Per-Thread Reference-Count Pair Data}
+\label{lst:app:toyrcu:RCU Per-Thread Reference-Count Pair Data}
+\end{listing}
+
+\begin{listing}[tb]
+\input{CodeSamples/defer/rcu_rcpl@xxxxx}
+\caption{RCU Read-Side Using Per-Thread Reference-Count Pair}
+\label{lst:app:toyrcu:RCU Read-Side Using Per-Thread Reference-Count Pair}
+\end{listing}
+
 \QuickQuiz{}
 	Come off it!
 	We can see the \co{atomic_read()} primitive in
@@ -798,18 +798,6 @@ concurrent RCU updates.
 \section{Scalable Counter-Based RCU With Shared Grace Periods}
 \label{sec:app:toyrcu:Scalable Counter-Based RCU With Shared Grace Periods}
 
-\begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_rcpls@xxxxxxxxxx}
-\caption{RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update Data}
-\label{lst:app:toyrcu:RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update Data}
-\end{listing}
-
-\begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_rcpls@xxxxx}
-\caption{RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update}
-\label{lst:app:toyrcu:RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update}
-\end{listing}
-
 \Cref{lst:app:toyrcu:RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update}
 (\path{rcu_rcpls.h})
 shows the read-side primitives for an RCU implementation using per-thread
@@ -831,9 +819,15 @@ with \co{rcu_idx} now being a \co{long} instead of an
 \end{lineref}
 
 \begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_rcpls@xxxxx}
-\caption{RCU Shared Update Using Per-Thread Reference-Count Pair}
-\label{lst:app:toyrcu:RCU Shared Update Using Per-Thread Reference-Count Pair}
+\input{CodeSamples/defer/rcu_rcpls@xxxxxxxxxx}
+\caption{RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update Data}
+\label{lst:app:toyrcu:RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update Data}
+\end{listing}
+
+\begin{listing}[tbp]
+\input{CodeSamples/defer/rcu_rcpls@xxxxx}
+\caption{RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update}
+\label{lst:app:toyrcu:RCU Read-Side Using Per-Thread Reference-Count Pair and Shared Update}
 \end{listing}
 
 \Cref{lst:app:toyrcu:RCU Shared Update Using Per-Thread Reference-Count Pair}
@@ -851,6 +845,12 @@ The differences in \co{flip_counter_and_wait()} include:
 \end{enumerate}
 \end{lineref}
 
+\begin{listing}[tbp]
+\input{CodeSamples/defer/rcu_rcpls@xxxxx}
+\caption{RCU Shared Update Using Per-Thread Reference-Count Pair}
+\label{lst:app:toyrcu:RCU Shared Update Using Per-Thread Reference-Count Pair}
+\end{listing}
+
 \begin{lineref}[ln:defer:rcu_rcpls:u:sync]
 The changes to \co{synchronize_rcu()} are more pervasive:
 \begin{enumerate}
@@ -945,6 +945,12 @@ thread-local accesses to one, as is done in the next section.
 \section{RCU Based on Free-Running Counter}
 \label{sec:app:toyrcu:RCU Based on Free-Running Counter}
 
+\Cref{lst:app:toyrcu:Free-Running Counter Using RCU}
+(\path{rcu.h} and \path{rcu.c})
+shows an RCU implementation based on a single global free-running counter
+that takes on only even-numbered values, with data shown in
+\cref{lst:app:toyrcu:Data for Free-Running Counter Using RCU}.
+
 \begin{listing}[tbp]
 \input{CodeSamples/defer/rcu@xxxxxxxxxx}
 \caption{Data for Free-Running Counter Using RCU}
@@ -958,11 +964,6 @@
 \label{lst:app:toyrcu:Free-Running Counter Using RCU}
 \end{listing}
 
-\Cref{lst:app:toyrcu:Free-Running Counter Using RCU}
-(\path{rcu.h} and \path{rcu.c})
-shows an RCU implementation based on a single global free-running counter
-that takes on only even-numbered values, with data shown in
-\cref{lst:app:toyrcu:Data for Free-Running Counter Using RCU}.
 The resulting \co{rcu_read_lock()} implementation is extremely
 straightforward.
 \begin{lineref}[ln:defer:rcu:read_lock_unlock:lock]
@@ -1109,19 +1110,6 @@ variables.
 \section{Nestable RCU Based on Free-Running Counter}
 \label{sec:app:toyrcu:Nestable RCU Based on Free-Running Counter}
 
-\begin{listing}[tb]
-\input{CodeSamples/defer/rcu_nest@xxxxxxxxxx}
-\caption{Data for Nestable RCU Using a Free-Running Counter}
-\label{lst:app:toyrcu:Data for Nestable RCU Using a Free-Running Counter}
-\end{listing}
-
-\begin{listing}[tb]
-\input{CodeSamples/defer/rcu_nest@read_lock_unlock.fcv}\vspace*{-11pt}\fvset{firstnumber=last}
-\input{CodeSamples/defer/rcu_nest@xxxxxxxxxxxxxxx}\fvset{firstnumber=auto}
-\caption{Nestable RCU Using a Free-Running Counter}
-\label{lst:app:toyrcu:Nestable RCU Using a Free-Running Counter}
-\end{listing}
-
 \Cref{lst:app:toyrcu:Nestable RCU Using a Free-Running Counter}
 (\path{rcu_nest.h} and \path{rcu_nest.c})
 shows an RCU implementation based on a single global free-running counter,
@@ -1148,6 +1136,19 @@ reserves seven bits, for a maximum RCU read-side
 critical-section nesting depth of 127, which should be
 well in excess of that needed by most applications.
 
+\begin{listing}[tb]
+\input{CodeSamples/defer/rcu_nest@xxxxxxxxxx}
+\caption{Data for Nestable RCU Using a Free-Running Counter}
+\label{lst:app:toyrcu:Data for Nestable RCU Using a Free-Running Counter}
+\end{listing}
+
+\begin{listing}[tb]
+\input{CodeSamples/defer/rcu_nest@read_lock_unlock.fcv}\vspace*{-11pt}\fvset{firstnumber=last}
+\input{CodeSamples/defer/rcu_nest@xxxxxxxxxxxxxxx}\fvset{firstnumber=auto}
+\caption{Nestable RCU Using a Free-Running Counter}
+\label{lst:app:toyrcu:Nestable RCU Using a Free-Running Counter}
+\end{listing}
+
 \begin{lineref}[ln:defer:rcu_nest:read_lock_unlock:lock]
 The resulting \co{rcu_read_lock()} implementation is still reasonably
 straightforward.
@@ -1349,18 +1350,6 @@ overhead.
 \section{RCU Based on Quiescent States}
 \label{sec:app:toyrcu:RCU Based on Quiescent States}
 
-\begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_qs@xxxxxxxxxx}
-\caption{Data for Quiescent-State-Based RCU}
-\label{lst:app:toyrcu:Data for Quiescent-State-Based RCU}
-\end{listing}
-
-\begin{listing}[tbp]
-\input{CodeSamples/defer/rcu_qs@read_lock_unlock.fcv}
-\caption{Quiescent-State-Based RCU Read Side}
-\label{lst:app:toyrcu:Quiescent-State-Based RCU Read Side}
-\end{listing}
-
 \begin{lineref}[ln:defer:rcu_qs:read_lock_unlock]
 \Cref{lst:app:toyrcu:Quiescent-State-Based RCU Read Side}
 (\path{rcu_qs.h})
 shows the read-side primitives of an RCU implementation
@@ -1398,6 +1387,18 @@ It is illegal to invoke \co{rcu_quiescent_state()}, \co{rcu_thread_offline()},
 or \co{rcu_thread_online()} from an RCU read-side critical section.
 \end{lineref}
 
+\begin{listing}[tbp]
+\input{CodeSamples/defer/rcu_qs@xxxxxxxxxx}
+\caption{Data for Quiescent-State-Based RCU}
+\label{lst:app:toyrcu:Data for Quiescent-State-Based RCU}
+\end{listing}
+
+\begin{listing}[tbp]
+\input{CodeSamples/defer/rcu_qs@read_lock_unlock.fcv}
+\caption{Quiescent-State-Based RCU Read Side}
+\label{lst:app:toyrcu:Quiescent-State-Based RCU Read Side}
+\end{listing}
+
 \begin{lineref}[ln:defer:rcu_qs:read_lock_unlock:qs]
 In \co{rcu_quiescent_state()}, \clnref{mb1} executes a memory barrier
 to prevent any code prior to the quiescent state (including possible
diff --git a/count/count.tex b/count/count.tex
index 67d74bc5..7cd8eceb 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -929,13 +929,6 @@ comes at the cost of the additional thread running \co{eventual()}.
 \subsection{Per-Thread-Variable-Based Implementation}
 \label{sec:count:Per-Thread-Variable-Based Implementation}
-\NoIndentAfterThis
-
-\begin{listing}[tb]
-\input{CodeSamples/count/count_end@xxxxxxxxx}
-\caption{Per-Thread Statistical Counters}
-\label{lst:count:Per-Thread Statistical Counters}
-\end{listing}
 
 Fortunately, \GCC\ provides an \co{__thread} storage class that provides
 per-thread storage.
@@ -946,6 +939,12 @@
 a statistical counter that not only scales, but that also incurs
 little or no performance penalty to incrementers compared to simple
 non-atomic increment.
 
+\begin{listing}[tb]
+\input{CodeSamples/count/count_end@xxxxxxxxx}
+\caption{Per-Thread Statistical Counters}
+\label{lst:count:Per-Thread Statistical Counters}
+\end{listing}
+
 \begin{lineref}[ln:count:count_end:whole]
 \Clnrefrange{var:b}{var:e} define needed variables:
 \co{counter} is the per-thread counter
@@ -1316,20 +1315,6 @@ Section~\ref{sec:SMPdesign:Parallel Fastpath}.
 \subsection{Simple Limit Counter Implementation}
 \label{sec:count:Simple Limit Counter Implementation}
-\NoIndentAfterThis
-
-\begin{listing}[tbp]
-\input{CodeSamples/count/count_lim@xxxxxxxxxxxx}
-\caption{Simple Limit Counter Variables}
-\label{lst:count:Simple Limit Counter Variables}
-\end{listing}
-
-\begin{figure}[tb]
-\centering
-\resizebox{2.5in}{!}{\includegraphics{count/count_lim}}
-\caption{Simple Limit Counter Variable Relationships}
-\label{fig:count:Simple Limit Counter Variable Relationships}
-\end{figure}
 
 \begin{lineref}[ln:count:count_lim:variable]
 Listing~\ref{lst:count:Simple Limit Counter Variables}
@@ -1359,6 +1344,19 @@ Figure~\ref{fig:count:Simple Limit Counter Variable Relationships}:
 	that thread's \co{countermax}.
 \end{enumerate}
 
+\begin{listing}[tbp]
+\input{CodeSamples/count/count_lim@xxxxxxxxxxxx}
+\caption{Simple Limit Counter Variables}
+\label{lst:count:Simple Limit Counter Variables}
+\end{listing}
+
+\begin{figure}[tb]
+\centering
+\resizebox{2.5in}{!}{\includegraphics{count/count_lim}}
+\caption{Simple Limit Counter Variable Relationships}
+\label{fig:count:Simple Limit Counter Variable Relationships}
+\end{figure}
+
 Each element of the \co{counterp[]} array references the corresponding
 thread's \co{counter} variable, and, finally, the \co{gblcnt_mutex}
 spinlock guards all of the global variables, in other words, no thread
@@ -1375,7 +1373,6 @@ Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}
 shows the \co{add_count()}, \co{sub_count()}, and \co{read_count()}
 functions
 (\path{count_lim.c}).
-
 \QuickQuiz{}
 	Why does
 	Listing~\ref{lst:count:Simple Limit Counter Add, Subtract, and Read}
@@ -1766,19 +1763,6 @@ This task is undertaken in the next section.
 \subsection{Approximate Limit Counter Implementation}
 \label{sec:count:Approximate Limit Counter Implementation}
-\NoIndentAfterThis
-
-\begin{listing}[tbp]
-\input{CodeSamples/count/count_lim_app@xxxxxxxxxxxx}
-\caption{Approximate Limit Counter Variables}
-\label{lst:count:Approximate Limit Counter Variables}
-\end{listing}
-
-\begin{listing}[tbp]
-\input{CodeSamples/count/count_lim_app@xxxxxxxxxxx}
-\caption{Approximate Limit Counter Balancing}
-\label{lst:count:Approximate Limit Counter Balancing}
-\end{listing}
 
 Because this implementation (\path{count_lim_app.c}) is quite similar to
 that in the previous section
@@ -1792,6 +1776,18 @@
 Listing~\ref{lst:count:Simple Limit Counter Variables},
 with the addition of \co{MAX_COUNTERMAX}, which sets the maximum
 permissible value of the per-thread \co{countermax} variable.
+\begin{listing}[tbp]
+\input{CodeSamples/count/count_lim_app@xxxxxxxxxxxx}
+\caption{Approximate Limit Counter Variables}
+\label{lst:count:Approximate Limit Counter Variables}
+\end{listing}
+
+\begin{listing}[tbp]
+\input{CodeSamples/count/count_lim_app@xxxxxxxxxxx}
+\caption{Approximate Limit Counter Balancing}
+\label{lst:count:Approximate Limit Counter Balancing}
+\end{listing}
+
 \begin{lineref}[ln:count:count_lim_app:balance]
 Similarly,
 Listing~\ref{lst:count:Approximate Limit Counter Balancing}
@@ -2352,15 +2348,7 @@ The slowpath then sets that thread's \co{theft} state to IDLE.
 \subsection{Signal-Theft Limit Counter Implementation}
 \label{sec:count:Signal-Theft Limit Counter Implementation}
-%\NoIndentAfterThis % @@@ Does not work as expected. @@@
-
-\begin{listing}[tbp]
-\input{CodeSamples/count/count_lim_sig@xxxxxxxx}
-\caption{Signal-Theft Limit Counter Data}
-\label{lst:count:Signal-Theft Limit Counter Data}
-\end{listing}
 
-\noindent% @@@ \NoIndentAfterThis above has side-effect @@@
 \begin{lineref}[ln:count:count_lim_sig:data]
 Listing~\ref{lst:count:Signal-Theft Limit Counter Data}
 (\path{count_lim_sig.c})
@@ -2377,9 +2365,9 @@ and \co{theft} variables, respectively.
 \end{lineref}
 
 \begin{listing}[tbp]
-\input{CodeSamples/count/count_lim_sig@xxxxxxxxxxxxx}
-\caption{Signal-Theft Limit Counter Value-Migration Functions}
-\label{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
+\input{CodeSamples/count/count_lim_sig@xxxxxxxx}
+\caption{Signal-Theft Limit Counter Data}
+\label{lst:count:Signal-Theft Limit Counter Data}
 \end{listing}
 
 \begin{lineref}[ln:count:count_lim_sig:migration:globalize]
@@ -2405,6 +2393,12 @@ this thread's fastpaths are not running, line~\lnref{set:READY} sets the \co{the
 state to READY.
 \end{lineref}
 
+\begin{listing}[tbp]
+\input{CodeSamples/count/count_lim_sig@xxxxxxxxxxxxx}
+\caption{Signal-Theft Limit Counter Value-Migration Functions}
+\label{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
+\end{listing}
+
 \QuickQuiz{}
 	In Listing~\ref{lst:count:Signal-Theft Limit Counter Value-Migration Functions}
 	function \co{flush_local_count_sig()}, why are there
diff --git a/datastruct/datastruct.tex b/datastruct/datastruct.tex
index 6aa982eb..04998ee3 100644
--- a/datastruct/datastruct.tex
+++ b/datastruct/datastruct.tex
@@ -149,20 +149,6 @@ offers excellent scalability.
 \subsection{Hash-Table Implementation}
 \label{sec:datastruct:Hash-Table Implementation}
-\NoIndentAfterThis
-
-\begin{listing}[tb]
-\input{CodeSamples/datastruct/hash/hash_bkt@xxxxxxxxxx}
-\caption{Hash-Table Data Structures}
-\label{lst:datastruct:Hash-Table Data Structures}
-\end{listing}
-
-\begin{figure}[tb]
-\centering
-\resizebox{3in}{!}{\includegraphics{datastruct/hashdiagram}}
-\caption{Hash-Table Data-Structure Diagram}
-\label{fig:datastruct:Hash-Table Data-Structure Diagram}
-\end{figure}
 
 \begin{lineref}[ln:datastruct:hash_bkt:struct]
 Listing~\ref{lst:datastruct:Hash-Table Data Structures}
@@ -192,16 +178,23 @@ being placed in the hash table,
 and this larger structure might contain a complex key.
 \end{lineref}
 
+\begin{listing}[tb]
+\input{CodeSamples/datastruct/hash/hash_bkt@xxxxxxxxxx}
+\caption{Hash-Table Data Structures}
+\label{lst:datastruct:Hash-Table Data Structures}
+\end{listing}
+
+\begin{figure}[tb]
+\centering
+\resizebox{3in}{!}{\includegraphics{datastruct/hashdiagram}}
+\caption{Hash-Table Data-Structure Diagram}
+\label{fig:datastruct:Hash-Table Data-Structure Diagram}
+\end{figure}
+
 The diagram shown in
 Figure~\ref{fig:datastruct:Hash-Table Data-Structure Diagram}
 has bucket~0 with two elements and bucket~2 with one.
 
-\begin{listing}[tb]
-\input{CodeSamples/datastruct/hash/hash_bkt@map_lock.fcv}
-\caption{Hash-Table Mapping and Locking}
-\label{lst:datastruct:Hash-Table Mapping and Locking}
-\end{listing}
-
 \begin{lineref}[ln:datastruct:hash_bkt:map_lock:map]
 Listing~\ref{lst:datastruct:Hash-Table Mapping and Locking}
 shows mapping and locking functions.
@@ -215,9 +208,9 @@ corresponding to the specified hash value.
 \end{lineref}
 
 \begin{listing}[tb]
-\input{CodeSamples/datastruct/hash/hash_bkt@xxxxxxxxxx}
-\caption{Hash-Table Lookup}
-\label{lst:datastruct:Hash-Table Lookup}
+\input{CodeSamples/datastruct/hash/hash_bkt@map_lock.fcv}
+\caption{Hash-Table Mapping and Locking}
+\label{lst:datastruct:Hash-Table Mapping and Locking}
 \end{listing}
 
 \begin{lineref}[ln:datastruct:hash_bkt:lookup]
@@ -241,6 +234,12 @@ line~\lnref{return} returns a pointer to the matching element.
 If no element matches, line~\lnref{ret_NULL} returns \co{NULL}.
 \end{lineref}
 
+\begin{listing}[tb]
+\input{CodeSamples/datastruct/hash/hash_bkt@xxxxxxxxxx}
+\caption{Hash-Table Lookup}
+\label{lst:datastruct:Hash-Table Lookup}
+\end{listing}
+
 \QuickQuiz{}
 	\begin{lineref}[ln:datastruct:hash_bkt:lookup]
 	But isn't the double comparison on
@@ -482,13 +481,6 @@ section~\cite{McKenney:2013:SDS:2483852.2483867}.
 \subsection{RCU-Protected Hash Table Implementation}
 \label{sec:datastruct:RCU-Protected Hash Table Implementation}
-\NoIndentAfterThis
-
-\begin{listing}[tb]
-\input{CodeSamples/datastruct/hash/hash_bkt_rcu@lock_unlock.fcv}
-\caption{RCU-Protected Hash-Table Read-Side Concurrency Control}
-\label{lst:datastruct:RCU-Protected Hash-Table Read-Side Concurrency Control}
-\end{listing}
 
 For an RCU-protected hash table with per-bucket locking,
 updaters use locking exactly as described in
@@ -505,9 +497,9 @@ shown in
 Listing~\ref{lst:datastruct:RCU-Protected Hash-Table Read-Side Concurrency Control}.
 
 \begin{listing}[tb]
-\input{CodeSamples/datastruct/hash/hash_bkt_rcu@xxxxxxxxxx}
-\caption{RCU-Protected Hash-Table Lookup}
-\label{lst:datastruct:RCU-Protected Hash-Table Lookup}
+\input{CodeSamples/datastruct/hash/hash_bkt_rcu@lock_unlock.fcv}
+\caption{RCU-Protected Hash-Table Read-Side Concurrency Control}
+\label{lst:datastruct:RCU-Protected Hash-Table Read-Side Concurrency Control}
 \end{listing}
 
 Listing~\ref{lst:datastruct:RCU-Protected Hash-Table Lookup}
@@ -531,6 +523,12 @@ RCU read-side critical section, for example, the caller must invoke
 \co{hashtab_lock_lookup()} before invoking \co{hashtab_lookup()}
 (and of course invoke \co{hashtab_unlock_lookup()} some time afterwards).
 
+\begin{listing}[tb]
+\input{CodeSamples/datastruct/hash/hash_bkt_rcu@xxxxxxxxxx}
+\caption{RCU-Protected Hash-Table Lookup}
+\label{lst:datastruct:RCU-Protected Hash-Table Lookup}
+\end{listing}
+
 \QuickQuiz{}
 	But if elements in a hash table can be deleted concurrently
 	with lookups, doesn't that mean that a lookup could return
@@ -893,13 +891,6 @@ which is the subject of the next section.
 \subsection{Resizable Hash Table Implementation}
 \label{sec:datastruct:Resizable Hash Table Implementation}
-\NoIndentAfterThis
-
-\begin{listing}[tb]
-\input{CodeSamples/datastruct/hash/hash_resize@xxxxxxxx}
-\caption{Resizable Hash-Table Data Structures}
-\label{lst:datastruct:Resizable Hash-Table Data Structures}
-\end{listing}
 
 \begin{lineref}[ln:datastruct:hash_resize:data]
 Resizing is accomplished by the classic approach of inserting a level
@@ -917,6 +908,12 @@ from both performance and scalability viewpoints.
 However, because
 resize operations should be relatively infrequent,
 we should be able to make good use of RCU.
 
+\begin{listing}[tb]
+\input{CodeSamples/datastruct/hash/hash_resize@xxxxxxxx}
+\caption{Resizable Hash-Table Data Structures}
+\label{lst:datastruct:Resizable Hash-Table Data Structures}
+\end{listing}
+
 The \co{ht} structure represents a specific size of the hash table,
 as specified by the \co{->ht_nbuckets} field on line~\lnref{ht:nbuckets}.
 The size is stored in the same structure containing the array of
diff --git a/defer/rcuapi.tex b/defer/rcuapi.tex
index 9065a1a1..12bfd9b6 100644
--- a/defer/rcuapi.tex
+++ b/defer/rcuapi.tex
@@ -26,7 +26,22 @@ presents concluding remarks.
 \subsubsection{RCU has a Family of Wait-to-Finish APIs}
 \label{sec:defer:RCU has a Family of Wait-to-Finish APIs}
 
-\begin{table*}[htbp]
+The most straightforward answer to ``what is RCU'' is that RCU is
+an API.
+For example, the RCU implementation used in the Linux kernel is
+summarized by
+Table~\ref{tab:defer:RCU Wait-to-Finish APIs},
+which shows the wait-for-readers portions of the RCU, ``sleepable'' RCU
+(SRCU), Tasks RCU, and generic APIs, respectively,
+and by
+Table~\ref{tab:defer:RCU Publish-Subscribe and Version Maintenance APIs},
+which shows the publish-subscribe portions of the
+API~\cite{PaulEMcKenney2019RCUAPI}.\footnote{
+	This citation covers v4.20 and later.
+	Documentation for earlier versions of the Linux-kernel RCU API may
+	be found elsewhere~\cite{PaulEMcKenney2008WhatIsRCUAPI,PaulEMcKenney2014RCUAPI}.}
+
+\begin{table*}[tbp]
 \rowcolors{1}{}{lightgray}
 \renewcommand*{\arraystretch}{1.3}
 \centering
@@ -101,21 +116,6 @@ presents concluding remarks.
 \end{tabularx}
 \end{table*}
 
-The most straightforward answer to ``what is RCU'' is that RCU is
-an API.
-For example, the RCU implementation used in the Linux kernel is
-summarized by
-Table~\ref{tab:defer:RCU Wait-to-Finish APIs},
-which shows the wait-for-readers portions of the RCU, ``sleepable'' RCU
-(SRCU), Tasks RCU, and generic APIs, respectively,
-and by
-Table~\ref{tab:defer:RCU Publish-Subscribe and Version Maintenance APIs},
-which shows the publish-subscribe portions of the
-API~\cite{PaulEMcKenney2019RCUAPI}.\footnote{
-	This citation covers v4.20 and later.
-	Documetation for earlier versions of the Linux-kernel RCU API may
-	be found elsewhere~\cite{PaulEMcKenney2008WhatIsRCUAPI,PaulEMcKenney2014RCUAPI}.}
-
 If you are new to RCU, you might consider focusing on just one
 of the columns in
 Table~\ref{tab:defer:RCU Wait-to-Finish APIs},
diff --git a/defer/rcuusage.tex b/defer/rcuusage.tex
index 34ae77d0..8082a26c 100644
--- a/defer/rcuusage.tex
+++ b/defer/rcuusage.tex
@@ -44,7 +44,13 @@ Section~\ref{sec:defer:RCU Usage Summary} provides a summary.
 \subsubsection{RCU for Pre-BSD Routing}
 \label{sec:defer:RCU for Pre-BSD Routing}
-\NoIndentAfterThis
+
+Listings~\ref{lst:defer:RCU Pre-BSD Routing Table Lookup}
+and~\ref{lst:defer:RCU Pre-BSD Routing Table Add/Delete}
+show code for an RCU-protected Pre-BSD routing table
+(\path{route_rcu.c}).
+The former shows data structures and \co{route_lookup()},
+and the latter shows \co{route_add()} and \co{route_del()}.
 
 \begin{listing}[tbp]
 \input{CodeSamples/defer/route_rcu@xxxxxxxxxx}
@@ -58,13 +64,6 @@
 \label{lst:defer:RCU Pre-BSD Routing Table Add/Delete}
 \end{listing}
 
-Listings~\ref{lst:defer:RCU Pre-BSD Routing Table Lookup}
-and~\ref{lst:defer:RCU Pre-BSD Routing Table Add/Delete}
-show code for an RCU-protected Pre-BSD routing table
-(\path{route_rcu.c}).
-The former shows data structures and \co{route_lookup()},
-and the latter shows \co{route_add()} and \co{route_del()}.
-
 \begin{lineref}[ln:defer:route_rcu:lookup]
 In Listing~\ref{lst:defer:RCU Pre-BSD Routing Table Lookup},
 line~\lnref{rh} adds the \co{->rh} field used by RCU reclamation,
diff --git a/formal/axiomatic.tex b/formal/axiomatic.tex
index 9d7bb772..368e257f 100644
--- a/formal/axiomatic.tex
+++ b/formal/axiomatic.tex
@@ -139,12 +139,6 @@ This is now possible, as will be described in the following sections.
 \subsection{Axiomatic Approaches and Locking}
 \label{sec:formal:Axiomatic Approaches and Locking}
 
-\begin{listing}[tb]
-\input{CodeSamples/formal/herd/C-Lock1@xxxxxxxxx}
-\caption{Locking Example}
-\label{lst:formal:Locking Example}
-\end{listing}
-
 Axiomatic approaches may also be applied to
 higher-level languages and also to
 higher-level synchronization primitives, as exemplified by the
 lock-based litmus test shown in
@@ -160,6 +154,12 @@ of one.\footnote{
 	and~\ref{lst:formal:PPCMEM on Repaired Litmus Test}
 	for examples showing the output format.}
 
+\begin{listing}[tb]
+\input{CodeSamples/formal/herd/C-Lock1@xxxxxxxxx}
+\caption{Locking Example}
+\label{lst:formal:Locking Example}
+\end{listing}
+
 \QuickQuiz{}
 	What do you have to do to run \co{herd} on litmus tests like
 	that shown in Listing~\ref{lst:formal:Locking Example}?
@@ -254,12 +254,6 @@ The next section looks at RCU.
 \subsection{Axiomatic Approaches and RCU}
 \label{sec:formal:Axiomatic Approaches and RCU}
 
-\begin{listing}[tb]
-\input{CodeSamples/formal/herd/C-RCU-remove@xxxxxxxxx}
-\caption{Canonical RCU Removal Litmus Test}
-\label{lst:formal:Canonical RCU Removal Litmus Test}
-\end{listing}
-
 \begin{lineref}[ln:formal:C-RCU-remove:whole]
 Axiomatic approaches can also analyze litmus tests involving
 RCU~\cite{Alglave:2018:FSC:3173162.3177156}.
@@ -275,6 +269,12 @@ Line~\lnref{head} shows \co{x} as the list head, initially
 referencing \co{y}, which in turn is initialized to the value
 \co{2} on line~\lnref{tail:1}.
 
+\begin{listing}[tb]
+\input{CodeSamples/formal/herd/C-RCU-remove@xxxxxxxxx}
+\caption{Canonical RCU Removal Litmus Test}
+\label{lst:formal:Canonical RCU Removal Litmus Test}
+\end{listing}
+
 \co{P0()} on \clnrefrange{P0start}{P0end}
 removes element \co{y} from the list by replacing it with element \co{z}
 (line~\lnref{assignnewtail}),
diff --git a/formal/dyntickrcu.tex b/formal/dyntickrcu.tex
index bc2e7ee8..41ca46e9 100644
--- a/formal/dyntickrcu.tex
+++ b/formal/dyntickrcu.tex
@@ -321,19 +321,19 @@ preemptible RCU's grace-period machinery.
 \subsubsection{Grace-Period Interface}
 \label{sec:formal:Grace-Period Interface}
 
-\begin{figure}[htb]
-\centering
-\resizebox{3in}{!}{\includegraphics{formal/RCUpreemptStates}}
-\caption{Preemptible RCU State Machine}
-\label{fig:formal:Preemptible RCU State Machine}
-\end{figure}
-
 Of the four preemptible RCU grace-period states shown in
 \cref{fig:formal:Preemptible RCU State Machine},
 only the \co{rcu_try_flip_waitack_state} and
 \co{rcu_try_flip_waitmb_state} states need to wait for other
 CPUs to respond.
 
+\begin{figure}[tb]
+\centering
+\resizebox{3in}{!}{\includegraphics{formal/RCUpreemptStates}}
+\caption{Preemptible RCU State Machine}
+\label{fig:formal:Preemptible RCU State Machine}
+\end{figure}
+
 Of course, if a given CPU is in dynticks-idle state, we shouldn't
 wait for it.
 Therefore, just before entering one of these two states,
@@ -1193,7 +1193,13 @@ development cycle~\cite{PaulEMcKenney2008commit:64db4cfff99c}.
 \subsubsection{State Variables for Simplified Dynticks Interface}
 \label{sec:formal:State Variables for Simplified Dynticks Interface}
-\NoIndentAfterThis
+
+\Cref{lst:formal:Variables for Simple Dynticks Interface}
+shows the new per-CPU state variables.
+These variables are grouped into structs to allow multiple independent
+RCU implementations (e.g., \co{rcu} and \co{rcu_bh}) to conveniently
+and efficiently share dynticks state.
+In what follows, they can be thought of as independent per-CPU variables.
 
\begin{listing}[tbp]
\begin{VerbatimL}
@@ -1214,13 +1220,6 @@ struct rcu_data {
\label{lst:formal:Variables for Simple Dynticks Interface}
\end{listing}

-\Cref{lst:formal:Variables for Simple Dynticks Interface}
-shows the new per-CPU state variables.
-These variables are grouped into structs to allow multiple independent
-RCU implementations (e.g., \co{rcu} and \co{rcu_bh}) to conveniently
-and efficiently share dynticks state.
-In what follows, they can be thought of as independent per-CPU variables.
-
The \co{dynticks_nesting}, \co{dynticks}, and \co{dynticks_snap} variables
are for the \IRQ\ code paths, and the \co{dynticks_nmi} and
\co{dynticks_nmi_snap} variables are for the NMI code paths, although
@@ -1276,7 +1275,11 @@ passed through a quiescent state during that interval.

\subsubsection{Entering and Leaving Dynticks-Idle Mode}
\label{sec:formal:Entering and Leaving Dynticks-Idle Mode}
-\NoIndentAfterThis
+
+\Cref{lst:formal:Entering and Exiting Dynticks-Idle Mode}
+shows the \co{rcu_enter_nohz()} and \co{rcu_exit_nohz()} functions,
+which enter and exit dynticks-idle mode, also known as ``nohz'' mode.
+These two functions are invoked from process context.

\begin{listing}[tbp]
\begin{linelabel}[ln:formal:Entering and Exiting Dynticks-Idle Mode]
@@ -1314,11 +1317,6 @@ void rcu_exit_nohz(void)
\label{lst:formal:Entering and Exiting Dynticks-Idle Mode}
\end{listing}

-\Cref{lst:formal:Entering and Exiting Dynticks-Idle Mode}
-shows the \co{rcu_enter_nohz()} and \co{rcu_exit_nohz()},
-which enter and exit dynticks-idle mode, also known as ``nohz'' mode.
-These two functions are invoked from process context.
-
\begin{lineref}[ln:formal:Entering and Exiting Dynticks-Idle Mode]
\Clnref{mb} ensures that any prior memory accesses (which might
include accesses from RCU read-side critical sections) are seen
@@ -1339,7 +1337,23 @@ the opposite \co{dynticks} polarity.
\subsubsection{NMIs From Dynticks-Idle Mode} \label{sec:formal:NMIs From Dynticks-Idle Mode} -\NoIndentAfterThis + +\begin{lineref}[ln:formal:NMIs From Dynticks-Idle Mode] +\Cref{lst:formal:NMIs From Dynticks-Idle Mode} +shows the \co{rcu_nmi_enter()} and \co{rcu_nmi_exit()} functions, +which inform RCU of NMI entry and exit, respectively, from dynticks-idle +mode. +However, if the NMI arrives during an \IRQ\ handler, then RCU will already +be on the lookout for RCU read-side critical sections from this CPU, +so \clnref{chk_ext1,ret1} of \co{rcu_nmi_enter()} and \clnref{chk_ext2,ret2} +of \co{rcu_nmi_exit()} silently return if \co{dynticks} is odd. +Otherwise, the two functions increment \co{dynticks_nmi}, with +\co{rcu_nmi_enter()} leaving it with an odd value and \co{rcu_nmi_exit()} +leaving it with an even value. +Both functions execute memory barriers between this increment +and possible RCU read-side critical sections on \clnref{mb1,mb2}, +respectively. +\end{lineref} \begin{listing}[tbp] \begin{linelabel}[ln:formal:NMIs From Dynticks-Idle Mode] @@ -1373,26 +1387,23 @@ void rcu_nmi_exit(void) \label{lst:formal:NMIs From Dynticks-Idle Mode} \end{listing} -\begin{lineref}[ln:formal:NMIs From Dynticks-Idle Mode] -\Cref{lst:formal:NMIs From Dynticks-Idle Mode} -shows the \co{rcu_nmi_enter()} and \co{rcu_nmi_exit()} functions, -which inform RCU of NMI entry and exit, respectively, from dynticks-idle -mode. -However, if the NMI arrives during an \IRQ\ handler, then RCU will already -be on the lookout for RCU read-side critical sections from this CPU, -so \clnref{chk_ext1,ret1} of \co{rcu_nmi_enter()} and \clnref{chk_ext2,ret2} -of \co{rcu_nmi_exit()} silently return if \co{dynticks} is odd. -Otherwise, the two functions increment \co{dynticks_nmi}, with -\co{rcu_nmi_enter()} leaving it with an odd value and \co{rcu_nmi_exit()} -leaving it with an even value. 
-Both functions execute memory barriers between this increment -and possible RCU read-side critical sections on \clnref{mb1,mb2}, -respectively. -\end{lineref} - \subsubsection{Interrupts From Dynticks-Idle Mode} \label{sec:formal:Interrupts From Dynticks-Idle Mode} -\NoIndentAfterThis + +\begin{lineref}[ln:formal:Interrupts From Dynticks-Idle Mode] +\Cref{lst:formal:Interrupts From Dynticks-Idle Mode} +shows \co{rcu_irq_enter()} and \co{rcu_irq_exit()}, which +inform RCU of entry to and exit from, respectively, \IRQ\ context. +\Clnref{inc_nst} of \co{rcu_irq_enter()} increments \co{dynticks_nesting}, +and if this variable was already non-zero, \clnref{ret1} silently returns. +Otherwise, \clnref{inc_dynt1} increments \co{dynticks}, +which will then have +an odd value, consistent with the fact that this CPU can now +execute RCU read-side critical sections. +\Clnref{mb1} therefore executes a memory barrier to ensure that +the increment of \co{dynticks} is seen before any +RCU read-side critical sections that the subsequent \IRQ\ handler +might execute. \begin{listing}[tbp] \begin{linelabel}[ln:formal:Interrupts From Dynticks-Idle Mode] @@ -1429,21 +1440,6 @@ void rcu_irq_exit(void) \label{lst:formal:Interrupts From Dynticks-Idle Mode} \end{listing} -\begin{lineref}[ln:formal:Interrupts From Dynticks-Idle Mode] -\Cref{lst:formal:Interrupts From Dynticks-Idle Mode} -shows \co{rcu_irq_enter()} and \co{rcu_irq_exit()}, which -inform RCU of entry to and exit from, respectively, \IRQ\ context. -\Clnref{inc_nst} of \co{rcu_irq_enter()} increments \co{dynticks_nesting}, -and if this variable was already non-zero, \clnref{ret1} silently returns. -Otherwise, \clnref{inc_dynt1} increments \co{dynticks}, -which will then have -an odd value, consistent with the fact that this CPU can now -execute RCU read-side critical sections. 
-\Clnref{mb1} therefore executes a memory barrier to ensure that -the increment of \co{dynticks} is seen before any -RCU read-side critical sections that the subsequent \IRQ\ handler -might execute. - \Clnref{dec_nst} of \co{rcu_irq_exit()} decrements \co{dynticks_nesting}, and if the result is non-zero, \clnref{ret2} silently returns. @@ -1461,7 +1457,25 @@ a reschedule API if so. \subsubsection{Checking For Dynticks Quiescent States} \label{sec:formal:Checking For Dynticks Quiescent States} -%\NoIndentAfterThis % @@@ This doesn't work as expected @@@ + +\begin{lineref}[ln:formal:Saving Dyntick Progress Counters] +\Cref{lst:formal:Saving Dyntick Progress Counters} +shows \co{dyntick_save_progress_counter()}, which takes a snapshot +of the specified CPU's \co{dynticks} and \co{dynticks_nmi} +counters. +\Clnref{snap,snapn} snapshot these two variables to locals, \clnref{mb} +executes a memory barrier to pair with the memory barriers in the functions in +\cref{lst:formal:Entering and Exiting Dynticks-Idle Mode,% +lst:formal:NMIs From Dynticks-Idle Mode,% +lst:formal:Interrupts From Dynticks-Idle Mode}. +\Clnref{rec_snap,rec_snapn} record the snapshots for later calls to +\co{rcu_implicit_dynticks_qs()}, +and \clnref{chk_prgs} checks to see if the CPU is in dynticks-idle mode with +neither \IRQ s nor NMIs in progress (in other words, both snapshots +have even values), hence in an extended quiescent state. +If so, \clnref{cnt:b,cnt:e} count this event, and \clnref{ret} returns +true if the CPU was in a quiescent state. +\end{lineref} \begin{listing}[tbp] \begin{linelabel}[ln:formal:Saving Dyntick Progress Counters] @@ -1489,24 +1503,33 @@ dyntick_save_progress_counter(struct rcu_data *rdp) \label{lst:formal:Saving Dyntick Progress Counters} \end{listing} -\noindent% @@@ \NoIndentAfterThis commented out above has side effect. 
@@@
-\begin{lineref}[ln:formal:Saving Dyntick Progress Counters]
-\Cref{lst:formal:Saving Dyntick Progress Counters}
-shows \co{dyntick_save_progress_counter()}, which takes a snapshot
-of the specified CPU's \co{dynticks} and \co{dynticks_nmi}
-counters.
-\Clnref{snap,snapn} snapshot these two variables to locals, \clnref{mb}
-executes a memory barrier to pair with the memory barriers in the functions in
+\begin{lineref}[ln:formal:Checking Dyntick Progress Counters]
+\Cref{lst:formal:Checking Dyntick Progress Counters}
+shows \co{rcu_implicit_dynticks_qs()}, which is called to check
+whether a CPU has entered dyntick-idle mode subsequent to a call
+to \co{dyntick_save_progress_counter()}.
+\Clnref{curr,currn} take new snapshots of the corresponding CPU's
+\co{dynticks} and \co{dynticks_nmi} variables, while
+\clnref{snap,snapn} retrieve the snapshots saved earlier by
+\co{dyntick_save_progress_counter()}.
+\Clnref{mb} then
+executes a memory barrier to pair with the memory barriers in
+the functions in
\cref{lst:formal:Entering and Exiting Dynticks-Idle Mode,%
lst:formal:NMIs From Dynticks-Idle Mode,%
lst:formal:Interrupts From Dynticks-Idle Mode}.
-\Clnref{rec_snap,rec_snapn} record the snapshots for later calls to
-\co{rcu_implicit_dynticks_qs()},
-and \clnref{chk_prgs} checks to see if the CPU is in dynticks-idle mode with
-neither \IRQ s nor NMIs in progress (in other words, both snapshots
-have even values), hence in an extended quiescent state.
-If so, \clnref{cnt:b,cnt:e} count this event, and \clnref{ret} returns
-true if the CPU was in a quiescent state.
+\Clnrefrange{chk_q:b}{chk_q:e}
+then check to see if the CPU is either currently in
+a quiescent state (\co{curr} and \co{curr_nmi} having even values) or
+has passed through a quiescent state since the last call to
+\co{dyntick_save_progress_counter()} (the values of
+\co{dynticks} and \co{dynticks_nmi} having changed).
+If these checks confirm that the CPU has passed through a dyntick-idle +quiescent state, then \clnref{cnt} counts that fact and +\clnref{ret_1} returns an indication of this fact. +Either way, \clnref{chk_race} +checks for race conditions that can result in RCU +waiting for a CPU that is offline. \end{lineref} \begin{listing}[tbp] @@ -1538,35 +1561,6 @@ rcu_implicit_dynticks_qs(struct rcu_data *rdp) \label{lst:formal:Checking Dyntick Progress Counters} \end{listing} -\begin{lineref}[ln:formal:Checking Dyntick Progress Counters] -\Cref{lst:formal:Checking Dyntick Progress Counters} -shows \co{rcu_implicit_dynticks_qs()}, which is called to check -whether a CPU has entered dyntick-idle mode subsequent to a call -to \co{dynticks_save_progress_counter()}. -\Clnref{curr,currn} take new snapshots of the corresponding CPU's -\co{dynticks} and \co{dynticks_nmi} variables, while -\clnref{snap,snapn} retrieve the snapshots saved earlier by -\co{dynticks_save_progress_counter()}. -\Clnref{mb} then -executes a memory barrier to pair with the memory barriers in -the functions in -\cref{lst:formal:Entering and Exiting Dynticks-Idle Mode,% -lst:formal:NMIs From Dynticks-Idle Mode,% -lst:formal:Interrupts From Dynticks-Idle Mode}. -\Clnrefrange{chk_q:b}{chk_q:e} -then check to see if the CPU is either currently in -a quiescent state (\co{curr} and \co{curr_nmi} having even values) or -has passed through a quiescent state since the last call to -\co{dynticks_save_progress_counter()} (the values of -\co{dynticks} and \co{dynticks_nmi} having changed). -If these checks confirm that the CPU has passed through a dyntick-idle -quiescent state, then \clnref{cnt} counts that fact and -\clnref{ret_1} returns an indication of this fact. -Either way, \clnref{chk_race} -checks for race conditions that can result in RCU -waiting for a CPU that is offline. -\end{lineref} - \QuickQuiz{} This is still pretty complicated. 
Why not just have a \co{cpumask_t} that has a bit set for diff --git a/formal/ppcmem.tex b/formal/ppcmem.tex index 2e9264fc..722bc4d1 100644 --- a/formal/ppcmem.tex +++ b/formal/ppcmem.tex @@ -58,6 +58,13 @@ discusses the implications. \subsection{Anatomy of a Litmus Test} \label{sec:formal:Anatomy of a Litmus Test} +An example PowerPC litmus test for PPCMEM is shown in +\cref{lst:formal:PPCMEM Litmus Test}. +The ARM interface works exactly the same way, but with ARM instructions +substituted for the Power instructions and with the initial ``PPC'' +replaced by ``ARM''. You can select the ARM interface by clicking on +``Change to ARM Model'' at the web page called out above. + \begin{listing}[tbp] \begin{linelabel}[ln:formal:PPCMEM Litmus Test] \begin{VerbatimL}[commandchars=\@\[\]] @@ -87,13 +94,6 @@ exists @lnlbl[assert:b] \label{lst:formal:PPCMEM Litmus Test} \end{listing} -An example PowerPC litmus test for PPCMEM is shown in -\cref{lst:formal:PPCMEM Litmus Test}. -The ARM interface works exactly the same way, but with ARM instructions -substituted for the Power instructions and with the initial ``PPC'' -replaced by ``ARM''. You can select the ARM interface by clicking on -``Change to ARM Model'' at the web page called out above. - \begin{lineref}[ln:formal:PPCMEM Litmus Test] In the example, \clnref{type} identifies the type of system (``ARM'' or ``PPC'') and contains the title for the model. \Clnref{altname} diff --git a/formal/spinhint.tex b/formal/spinhint.tex index cc555dce..1828ea3f 100644 --- a/formal/spinhint.tex +++ b/formal/spinhint.tex @@ -62,13 +62,6 @@ more complex uses. 
\subsubsection{Promela Warm-Up: Non-Atomic Increment} \label{sec:formal:Promela Warm-Up: Non-Atomic Increment} -\NoIndentAfterThis - -\begin{listing}[tbp] -\input{CodeSamples/formal/promela/increment@xxxxxxxxx} -\caption{Promela Code for Non-Atomic Increment} -\label{lst:formal:Promela Code for Non-Atomic Increment} -\end{listing} \begin{lineref}[ln:formal:promela:increment:whole] Listing~\ref{lst:formal:Promela Code for Non-Atomic Increment} @@ -79,6 +72,12 @@ to see the effect on state space), line~\lnref{count} defines the counter, and line~\lnref{prog} is used to implement the assertion that appears on \clnrefrange{assert:b}{assert:e}. +\begin{listing}[tbp] +\input{CodeSamples/formal/promela/increment@xxxxxxxxx} +\caption{Promela Code for Non-Atomic Increment} +\label{lst:formal:Promela Code for Non-Atomic Increment} +\end{listing} + \Clnrefrange{proc:b}{proc:e} define a process that increments the counter non-atomically. The argument \co{me} is the process number, set by the initialization @@ -175,30 +174,34 @@ The assertion then triggered, after which the global state is displayed. \subsubsection{Promela Warm-Up: Atomic Increment} \label{sec:formal:Promela Warm-Up: Atomic Increment} -\NoIndentAfterThis -\begin{listing}[htbp] +It is easy to fix this example by placing the body of the incrementer +processes in an atomic block as shown in +Listing~\ref{lst:formal:Promela Code for Atomic Increment}. +One could also have simply replaced the pair of statements with +\co{counter = counter + 1}, because Promela statements are +atomic. +Either way, running this modified model gives us an error-free traversal +of the state space, as shown in +Listing~\ref{lst:formal:Atomic Increment Spin Output}. 
+
+\begin{listing}[tbp]
\input{CodeSamples/formal/promela/atomicincrement@xxxxxxxxxxxxxxx}
\caption{Promela Code for Atomic Increment}
\label{lst:formal:Promela Code for Atomic Increment}
\end{listing}

-\begin{listing}[htbp]
+\begin{listing}[tbp]
\VerbatimInput[numbers=none,fontsize=\scriptsize]{CodeSamples/formal/promela/atomicincrement.spin.lst}
\vspace*{-9pt}
\caption{Atomic Increment Spin Output}
\label{lst:formal:Atomic Increment Spin Output}
\end{listing}

-It is easy to fix this example by placing the body of the incrementer
-processes in an atomic block as shown in
-Listing~\ref{lst:formal:Promela Code for Atomic Increment}.
-One could also have simply replaced the pair of statements with
-\co{counter = counter + 1}, because Promela statements are
-atomic.
-Either way, running this modified model gives us an error-free traversal
-of the state space, as shown in
-Listing~\ref{lst:formal:Atomic Increment Spin Output}.
+Table~\ref{tab:advsync:Memory Usage of Increment Model}
+shows the number of states and memory consumed
+as a function of the number of incrementers modeled
+(by redefining \co{NUMPROCS}):

\begin{table}
\rowcolors{1}{}{lightgray}
@@ -224,11 +227,6 @@ Listing~\ref{lst:formal:Atomic Increment Spin Output}.
\label{tab:advsync:Memory Usage of Increment Model}
\end{table}

-Table~\ref{tab:advsync:Memory Usage of Increment Model}
-shows the number of states and memory consumed
-as a function of number of incrementers modeled
-(by redefining \co{NUMPROCS}):
-
Running unnecessarily large models is thus subtly
discouraged, although 882\,MB is well within the limits of modern
desktop and laptop machines.
@@ -467,13 +465,6 @@ Now we are ready for more complex examples.
\subsection{Promela Example: Locking} \label{sec:formal:Promela Example: Locking} -\NoIndentAfterThis - -\begin{listing}[tbp] -\input{CodeSamples/formal/promela/lock@xxxxxxxxx} -\caption{Promela Code for Spinlock} -\label{lst:formal:Promela Code for Spinlock} -\end{listing} \begin{lineref}[ln:formal:promela:lock:whole] Since locks are generally useful, \co{spin_lock()} and @@ -498,6 +489,12 @@ atomic block so as to take another pass through the outer loop, repeating until the lock is available. \end{lineref} +\begin{listing}[tbp] +\input{CodeSamples/formal/promela/lock@xxxxxxxxx} +\caption{Promela Code for Spinlock} +\label{lst:formal:Promela Code for Spinlock} +\end{listing} + The \co{spin_unlock()} macro simply marks the lock as no longer held. diff --git a/locking/locking.tex b/locking/locking.tex index 624890a2..25dc6378 100644 --- a/locking/locking.tex +++ b/locking/locking.tex @@ -486,7 +486,22 @@ this is unlikely. \subsubsection{Conditional Locking} \label{sec:locking:Conditional Locking} -\NoIndentAfterThis + +But suppose that there is no reasonable locking hierarchy. +This can happen in real life, for example, in layered network protocol stacks +where packets flow in both directions. +In the networking case, it might be necessary to hold the locks from +both layers when passing a packet from one layer to another. +Given that packets travel both up and down the protocol stack, this +is an excellent recipe for deadlock, as illustrated in +Listing~\ref{lst:locking:Protocol Layering and Deadlock}. +\begin{lineref}[ln:locking:Protocol Layering and Deadlock] +Here, a packet moving down the stack towards the wire must acquire +the next layer's lock out of order. +Given that packets moving up the stack away from the wire are acquiring +the locks in order, the lock acquisition in line~\lnref{acq} of the listing +can result in deadlock. 
+\end{lineref} \begin{listing}[tbp] \begin{linelabel}[ln:locking:Protocol Layering and Deadlock] @@ -504,21 +519,15 @@ spin_unlock(&nextlayer->lock1); \label{lst:locking:Protocol Layering and Deadlock} \end{listing} -But suppose that there is no reasonable locking hierarchy. -This can happen in real life, for example, in layered network protocol stacks -where packets flow in both directions. -In the networking case, it might be necessary to hold the locks from -both layers when passing a packet from one layer to another. -Given that packets travel both up and down the protocol stack, this -is an excellent recipe for deadlock, as illustrated in -Listing~\ref{lst:locking:Protocol Layering and Deadlock}. -\begin{lineref}[ln:locking:Protocol Layering and Deadlock] -Here, a packet moving down the stack towards the wire must acquire -the next layer's lock out of order. -Given that packets moving up the stack away from the wire are acquiring -the locks in order, the lock acquisition in line~\lnref{acq} of the listing -can result in deadlock. -\end{lineref} +One way to avoid deadlocks in this case is to impose a locking hierarchy, +but when it is necessary to acquire a lock out of order, acquire it +conditionally, as shown in +Listing~\ref{lst:locking:Avoiding Deadlock Via Conditional Locking}. +\begin{lineref}[ln:locking:Avoiding Deadlock Via Conditional Locking] +Instead of unconditionally acquiring the layer-1 lock, line~\lnref{trylock} +conditionally acquires the lock using the \co{spin_trylock()} primitive. +This primitive acquires the lock immediately if the lock is available +(returning non-zero), and otherwise returns zero without acquiring the lock. 
\begin{listing}[tbp] \begin{linelabel}[ln:locking:Avoiding Deadlock Via Conditional Locking] @@ -546,16 +555,6 @@ retry: \label{lst:locking:Avoiding Deadlock Via Conditional Locking} \end{listing} -One way to avoid deadlocks in this case is to impose a locking hierarchy, -but when it is necessary to acquire a lock out of order, acquire it -conditionally, as shown in -Listing~\ref{lst:locking:Avoiding Deadlock Via Conditional Locking}. -\begin{lineref}[ln:locking:Avoiding Deadlock Via Conditional Locking] -Instead of unconditionally acquiring the layer-1 lock, line~\lnref{trylock} -conditionally acquires the lock using the \co{spin_trylock()} primitive. -This primitive acquires the lock immediately if the lock is available -(returning non-zero), and otherwise returns zero without acquiring the lock. - If \co{spin_trylock()} was successful, line~\lnref{l1_proc} does the needed layer-1 processing. Otherwise, line~\lnref{rel2} releases the lock, and @@ -818,7 +817,13 @@ quite useful in many settings. \subsection{Livelock and Starvation} \label{sec:locking:Livelock and Starvation} -\NoIndentAfterThis + +Although conditional locking can be an effective deadlock-avoidance +mechanism, it can be abused. +Consider for example the beautifully symmetric example shown in +Listing~\ref{lst:locking:Abusing Conditional Locking}. +This example's beauty hides an ugly livelock. +To see this, consider the following sequence of events: \begin{listing}[tbp] \begin{linelabel}[ln:locking:Abusing Conditional Locking] @@ -856,13 +861,6 @@ retry: \lnlbl[thr2:retry] \label{lst:locking:Abusing Conditional Locking} \end{listing} -Although conditional locking can be an effective deadlock-avoidance -mechanism, it can be abused. -Consider for example the beautifully symmetric example shown in -Listing~\ref{lst:locking:Abusing Conditional Locking}. -This example's beauty hides an ugly livelock. 
-To see this, consider the following sequence of events: - \begin{lineref}[ln:locking:Abusing Conditional Locking] \begin{enumerate} \item Thread~1 acquires \co{lock1} on line~\lnref{thr1:acq1}, then invokes @@ -1680,13 +1678,6 @@ environments. \subsection{Sample Exclusive-Locking Implementation Based on Atomic Exchange} \label{sec:locking:Sample Exclusive-Locking Implementation Based on Atomic Exchange} -\NoIndentAfterThis - -\begin{listing}[tbp] -\input{CodeSamples/locking/xchglock@lock_unlock.fcv} -\caption{Sample Lock Based on Atomic Exchange} -\label{lst:locking:Sample Lock Based on Atomic Exchange} -\end{listing} \begin{lineref}[ln:locking:xchglock:lock_unlock] This section reviews the implementation shown in @@ -1697,6 +1688,12 @@ The initial value of this lock is zero, meaning ``unlocked'', as shown on line~\lnref{initval}. \end{lineref} +\begin{listing}[tbp] +\input{CodeSamples/locking/xchglock@lock_unlock.fcv} +\caption{Sample Lock Based on Atomic Exchange} +\label{lst:locking:Sample Lock Based on Atomic Exchange} +\end{listing} + \QuickQuiz{} \begin{lineref}[ln:locking:xchglock:lock_unlock] Why not rely on the C language's default initialization of diff --git a/memorder/memorder.tex b/memorder/memorder.tex index b2d3e194..034413b5 100644 --- a/memorder/memorder.tex +++ b/memorder/memorder.tex @@ -1904,12 +1904,6 @@ Cumulativity nevertheless has limits, which are examined in the next section. \subsubsection{Propagation} \label{sec:memorder:Propagation} -\begin{listing}[tbp] -\input{CodeSamples/formal/litmus/C-W+RWC+o-r+a-o+o-mb-o@xxxxxxxxx} -\caption{W+RWC Litmus Test With Release (No Ordering)} -\label{lst:memorder:W+RWC Litmus Test With Release (No Ordering)} -\end{listing} - \begin{lineref}[ln:formal:C-W+RWC+o-r+a-o+o-mb-o:whole] \Cref{lst:memorder:W+RWC Litmus Test With Release (No Ordering)} (\path{C-W+RWC+o-r+a-o+o-mb-o.litmus}) @@ -1923,6 +1917,12 @@ load (\clnref{P1:ld}) and \co{P2()}'s store (\clnref{P2:st}). 
This means that the \co{exists} clause on \clnref{exists} really can trigger. \end{lineref} +\begin{listing}[tbp] +\input{CodeSamples/formal/litmus/C-W+RWC+o-r+a-o+o-mb-o@xxxxxxxxx} +\caption{W+RWC Litmus Test With Release (No Ordering)} +\label{lst:memorder:W+RWC Litmus Test With Release (No Ordering)} +\end{listing} + \QuickQuiz{} But it is not necessary to worry about propagation unless there are at least three threads in the litmus test, right? @@ -2170,12 +2170,6 @@ memory accesses is covered in the next section. \subsubsection{Release-Acquire Chains} \label{sec:memorder:Release-Acquire Chains} -\begin{listing}[tbp] -\input{CodeSamples/formal/litmus/C-LB+a-r+a-r+a-r+a-r@xxxxxxxxx} -\caption{Long LB Release-Acquire Chain} -\label{lst:memorder:Long LB Release-Acquire Chain} -\end{listing} - A minimal release-acquire chain was shown in \cref{lst:memorder:Enforcing Ordering of Load-Buffering Litmus Test}, but these chains can be much longer, as shown in @@ -2186,9 +2180,9 @@ from the passage of time, so that no matter how many threads are involved, the corresponding \co{exists} clause cannot trigger. 
\begin{listing}[tbp] -\input{CodeSamples/formal/litmus/C-ISA2+o-r+a-r+a-r+a-o@xxxxxxxxx} -\caption{Long ISA2 Release-Acquire Chain} -\label{lst:memorder:Long ISA2 Release-Acquire Chain} +\input{CodeSamples/formal/litmus/C-LB+a-r+a-r+a-r+a-r@xxxxxxxxx} +\caption{Long LB Release-Acquire Chain} +\label{lst:memorder:Long LB Release-Acquire Chain} \end{listing} Although release-acquire chains are inherently store-to-load creatures, @@ -2216,6 +2210,12 @@ Because \co{P3()}'s \co{READ_ONCE()} cannot be both before and after be true: \end{lineref} +\begin{listing}[tbp] +\input{CodeSamples/formal/litmus/C-ISA2+o-r+a-r+a-r+a-o@xxxxxxxxx} +\caption{Long ISA2 Release-Acquire Chain} +\label{lst:memorder:Long ISA2 Release-Acquire Chain} +\end{listing} + \begin{enumerate} \item \co{P3()}'s \co{READ_ONCE()} came after \co{P0()}'s \co{WRITE_ONCE()}, so that the \co{READ_ONCE()} returned @@ -3306,12 +3306,6 @@ These questions are addressed in the following sections. \subsubsection{RCU Read-Side Ordering} \label{sec:memorder:RCU Read-Side Ordering} -\begin{listing}[tb] -\input{CodeSamples/formal/herd/C-LB+rl-o-o-rul+rl-o-o-rul@xxxxxxxxx} -\caption{RCU Readers Provide No Lock-Like Ordering} -\label{lst:memorder:RCU Readers Provide No Lock-Like Ordering} -\end{listing} - On their own, RCU's read-side primitives \co{rcu_read_lock()} and \co{rcu_read_unlock()} provide no ordering whatsoever. In particular, despite their names, they do not act like locks, as can @@ -3323,9 +3317,9 @@ test's cycle is allowed: Both instances of the \co{r1} register can have final values of 1. 
\begin{listing}[tb] -\input{CodeSamples/formal/herd/C-LB+o-rl-rul-o+o-rl-rul-o@xxxxxxxxx} -\caption{RCU Readers Provide No Barrier-Like Ordering} -\label{lst:memorder:RCU Readers Provide No Barrier-Like Ordering} +\input{CodeSamples/formal/herd/C-LB+rl-o-o-rul+rl-o-o-rul@xxxxxxxxx} +\caption{RCU Readers Provide No Lock-Like Ordering} +\label{lst:memorder:RCU Readers Provide No Lock-Like Ordering} \end{listing} Nor do these primitives have barrier-like ordering properties, @@ -3335,6 +3329,12 @@ at least not unless there is a grace period in the mix, as can be seen in This litmus test's cycle is allowed, as can be verified by running \co{herd} on this litmus test. +\begin{listing}[tb] +\input{CodeSamples/formal/herd/C-LB+o-rl-rul-o+o-rl-rul-o@xxxxxxxxx} +\caption{RCU Readers Provide No Barrier-Like Ordering} +\label{lst:memorder:RCU Readers Provide No Barrier-Like Ordering} +\end{listing} + Of course, lack of ordering in both these litmus tests should be absolutely no surprise, given that both \co{rcu_read_lock()} and \co{rcu_read_unlock()} are no-ops in the QSBR implementation of RCU. @@ -3342,12 +3342,6 @@ are no-ops in the QSBR implementation of RCU. \subsubsection{RCU Update-Side Ordering} \label{sec:memorder:RCU Update-Side Ordering} -\begin{listing}[tb] -\input{CodeSamples/formal/herd/C-SB+o-rcusync-o+o-rcusync-o@xxxxxxxxx} -\caption{RCU Updaters Provide Full Ordering} -\label{lst:memorder:RCU Updaters Provide Full Ordering} -\end{listing} - In contrast with RCU readers, the RCU updater-side functions \co{synchronize_rcu()} and \co{synchronize_rcu_expedited()} provide memory ordering at least as strong as \co{smp_mb()}, @@ -3358,6 +3352,12 @@ This test's cycle is prohibited, just as would be the case with This should be no surprise given the information presented in \cref{tab:memorder:Linux-Kernel Memory-Ordering Cheat Sheet}. 
+\begin{listing}[tb] +\input{CodeSamples/formal/herd/C-SB+o-rcusync-o+o-rcusync-o@xxxxxxxxx} +\caption{RCU Updaters Provide Full Ordering} +\label{lst:memorder:RCU Updaters Provide Full Ordering} +\end{listing} + \subsubsection{RCU Readers: Before and After} \label{sec:memorder:RCU Readers: Before and After} diff --git a/together/applyrcu.tex b/together/applyrcu.tex index 2a2bdf75..24f611c7 100644 --- a/together/applyrcu.tex +++ b/together/applyrcu.tex @@ -114,12 +114,6 @@ held constant, ensuring that \co{read_count()} sees consistent data. \subsubsection{Implementation} -\begin{listing}[bp] -\input{CodeSamples/count/count_end_rcu@xxxxxxxxx} -\caption{RCU and Per-Thread Statistical Counters} -\label{lst:together:RCU and Per-Thread Statistical Counters} -\end{listing} - \begin{lineref}[ln:count:count_end_rcu:whole] \Clnrefrange{struct:b}{struct:e} of \cref{lst:together:RCU and Per-Thread Statistical Counters} @@ -131,6 +125,12 @@ This structure allows a given execution of \co{read_count()} to see a total that is consistent with the indicated set of running threads. +\begin{listing}[bp] +\input{CodeSamples/count/count_end_rcu@xxxxxxxxx} +\caption{RCU and Per-Thread Statistical Counters} +\label{lst:together:RCU and Per-Thread Statistical Counters} +\end{listing} + \Clnrefrange{perthread:b}{perthread:e} contain the definition of the per-thread \co{counter} variable, the global pointer \co{countarrayp} referencing @@ -329,6 +329,12 @@ tradeoff. \subsection{Array and Length} \label{sec:together:Array and Length} +Suppose we have an RCU-protected variable-length array, as shown in +\cref{lst:together:RCU-Protected Variable-Length Array}. +The length of the array \co{->a[]} can change dynamically, and at any +given time, its length is given by the field \co{->length}. 
+Of course, this introduces the following race condition:
+
\begin{listing}[tbp]
\begin{VerbatimL}[tabsize=8]
struct foo {
@@ -340,12 +346,6 @@ struct foo {
\label{lst:together:RCU-Protected Variable-Length Array}
\end{listing}

-Suppose we have an RCU-protected variable-length array, as shown in
-\cref{lst:together:RCU-Protected Variable-Length Array}.
-The length of the array \co{->a[]} can change dynamically, and at any
-given time, its length is given by the field \co{->length}.
-Of course, this introduces the following race condition:
-
\begin{enumerate}
\item The array is initially 16 characters long, and thus \co{->length}
is equal to 16.
@@ -412,6 +412,18 @@ A more general version of this approach is presented in the next section.
\subsection{Correlated Fields}
\label{sec:together:Correlated Fields}

+Suppose that each of Schr\"odinger's animals is represented by the
+data element shown in
+\cref{lst:together:Uncorrelated Measurement Fields}.
+The \co{meas_1}, \co{meas_2}, and \co{meas_3} fields are a set
+of correlated measurements that are updated periodically.
+It is critically important that readers see these three values from
+a single measurement update: If a reader sees an old value of
+\co{meas_1} but new values of \co{meas_2} and \co{meas_3}, that
+reader will become fatally confused.
+How can we guarantee that readers will see coordinated sets of these
+three values?
+
\begin{listing}[tbp]
\begin{VerbatimL}[tabsize=8]
struct animal {
@@ -427,18 +439,6 @@ struct animal {
\label{lst:together:Uncorrelated Measurement Fields}
\end{listing}

-Suppose that each of Sch\"odinger's animals is represented by the
-data element shown in
-\cref{lst:together:Uncorrelated Measurement Fields}.
-The \co{meas_1}, \co{meas_2}, and \co{meas_3} fields are a set
-of correlated measurements that are updated periodically.
-It is critically important that readers see these three values from -a single measurement update: If a reader sees an old value of -\co{meas_1} but new values of \co{meas_2} and \co{meas_3}, that -reader will become fatally confused. -How can we guarantee that readers will see coordinated sets of these -three values? - One approach would be to allocate a new \co{animal} structure, copy the old structure into the new structure, update the new structure's \co{meas_1}, \co{meas_2}, and \co{meas_3} fields, -- 2.17.1
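The float-placement pattern applied throughout this series can be sketched as follows. The section name, labels, and listing body here are hypothetical placeholders; the point is the one made in the commit log: the float source follows the section's first paragraph rather than the heading, so the heading can no longer be orphaned at the bottom of a page or column.

```latex
\subsection{Hypothetical Section}
\label{sec:example:Hypothetical Section}

This first paragraph immediately follows the heading, so a page
or column break cannot fall between the heading and the body text,
and it carries the forward reference to
Listing~\ref{lst:example:Hypothetical Listing}.

% Float source placed after the first paragraph, not directly
% after the heading, per Section 4.7 of [1]; \NoIndentAfterThis
% is then no longer needed.
\begin{listing}[tbp]
\begin{VerbatimL}
/* listing body elided */
\end{VerbatimL}
\caption{Hypothetical Listing}
\label{lst:example:Hypothetical Listing}
\end{listing}
```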