Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>
---
 SMPdesign/SMPdesign.tex              |  36 +++---
 SMPdesign/beyond.tex                 |   3 +-
 SMPdesign/criteria.tex               |   4 +-
 advsync/rt.tex                       |   5 +-
 appendix/whymb/whymemorybarriers.tex |  71 ++++++-----
 count/count.tex                      |  11 +-
 datastruct/datastruct.tex            |  15 ++-
 debugging/debugging.tex              |  12 +-
 defer/defer.tex                      |   4 +-
 defer/hazptr.tex                     |   7 +-
 defer/rcufundamental.tex             |   5 +-
 easy/easy.tex                        |   3 +-
 formal/dyntickrcu.tex                |  15 ++-
 formal/ppcmem.tex                    | 180 ++++++++++++++-------------
 formal/spinhint.tex                  | 148 ++++++++++++----------
 glossary.tex                         |   9 +-
 intro/intro.tex                      |  27 ++--
 legal.tex                            |   9 +-
 locking/locking.tex                  |   3 +-
 memorder/memorder.tex                |  12 +-
 owned/owned.tex                      |   5 +-
 together/refcnt.tex                  |   4 +-
 toolsoftrade/toolsoftrade.tex        |   3 +-
 23 files changed, 332 insertions(+), 259 deletions(-)

diff --git a/SMPdesign/SMPdesign.tex b/SMPdesign/SMPdesign.tex
index 5cc566a9..7d392a84 100644
--- a/SMPdesign/SMPdesign.tex
+++ b/SMPdesign/SMPdesign.tex
@@ -174,9 +174,9 @@ global locks.\footnote{
 	in Section~\ref{sec:SMPdesign:Data Locking}.}
 It is especially easy to retrofit an existing program to use code locking in
-order to run it on a multiprocessor.  If the program has
-only a single shared resource, code locking will even give
-optimal performance.
+order to run it on a multiprocessor.
+If the program has only a single shared resource, code locking
+will even give optimal performance.
 However, many of the larger and more complex programs require much of
 the execution to occur in \IXpl{critical section}, which in turn causes
 code locking
@@ -184,9 +184,9 @@ to sharply limits their scalability.
 Therefore, you should use code locking on programs
 that spend only a small fraction of their execution time in
 critical sections or
-from which only modest scaling is required.  In these cases,
-code locking will provide a relatively simple program that is
-very similar to its sequential counterpart,
+from which only modest scaling is required.
+In these cases, code locking will provide a relatively simple
+program that is very similar to its sequential counterpart,
 as can be seen in
 Listing~\ref{lst:SMPdesign:Code-Locking Hash Table Search}.
 However, note that the simple return of the comparison in
@@ -498,11 +498,13 @@ Data ownership might seem arcane, but it is used very frequently:
 	(such as {\tt auto} variables in C and C++) are owned
 	by that CPU or process.
 \item	An instance of a user interface owns the corresponding
-	user's context.  It is very common for applications
-	interacting with parallel database engines to be
-	written as if they were entirely sequential programs.
+	user's context.
+	It is very common for applications interacting with parallel
+	database engines to be written as if they were entirely
+	sequential programs.
 	Such applications own the user interface and his current
-	action.  Explicit parallelism is thus confined to the
+	action.
+	Explicit parallelism is thus confined to the
 	database engine itself.
 \item	Parametric simulations are often trivially parallelized
 	by granting each thread ownership of a particular region
@@ -777,8 +779,8 @@ parallelize the common-case code path without incurring the complexity
 that would be required to aggressively parallelize the entire algorithm.
 You must understand not only the specific algorithm you wish
 to parallelize, but also the workload that the algorithm will
-be subjected to.  Great creativity and design
-effort is often required to construct a parallel fastpath.
+be subjected to.
+Great creativity and design effort is often required to construct
+a parallel fastpath.
 
 Parallel fastpath combines different patterns (one for the
 fastpath, one elsewhere) and is therefore a template pattern.
@@ -1200,8 +1203,8 @@ this book.
 	\begin{description}
 	\item[$g$]	Number of blocks globally available.
 	\item[$i$]	Number of blocks left in the initializing thread's
-			per-thread pool.  (This is one reason you needed
-			to look at the code!)
+			per-thread pool.
+			(This is one reason you needed to look at the code!)
 	\item[$m$]	Allocation/free run length.
 	\item[$n$]	Number of threads, excluding the initialization thread.
 	\item[$p$]	Per-thread maximum block consumption, including
 			remaining in the per-thread pool.
 	\end{description}
 
-	The values $g$, $m$, and $n$ are given.  The value for $p$ is
-	$m$ rounded up to the next multiple of $s$, as follows:
+	The values $g$, $m$, and $n$ are given.
+	The value for $p$ is $m$ rounded up to the next multiple of $s$,
+	as follows:
 
 	\begin{equation}
 		p = s \left \lceil \frac{m}{s} \right \rceil
 	\end{equation}
diff --git a/SMPdesign/beyond.tex b/SMPdesign/beyond.tex
index e308f1d5..bd0fe6f1 100644
--- a/SMPdesign/beyond.tex
+++ b/SMPdesign/beyond.tex
@@ -159,7 +159,8 @@ line~\lnref{recordnext} records this cell in the next slot of the
 \co{->visited[]} array, line~\lnref{next:visited} indicates that this
 slot is now full, and line~\lnref{mark:visited} marks this cell as
 visited and also records
-the distance from the maze start. Line~\lnref{ret:success} then returns success.
+the distance from the maze start.
+Line~\lnref{ret:success} then returns success.
 \end{fcvref}
 
 \begin{fcvref}[ln:SMPdesign:SEQ Helper Pseudocode:find]
diff --git a/SMPdesign/criteria.tex b/SMPdesign/criteria.tex
index 0e581f15..915454e1 100644
--- a/SMPdesign/criteria.tex
+++ b/SMPdesign/criteria.tex
@@ -141,8 +141,8 @@ parallel program.
 	most-restrictive exclusive-lock critical section.
 \item	Contention effects consume the excess CPU and/or
 	wallclock time when the actual speedup is less than
-	the number of available CPUs.  The
-	larger the gap between the number of CPUs
+	the number of available CPUs.
+	The larger the gap between the number of CPUs
 	and the actual speedup, the less efficiently the
 	CPUs will be used.
 	Similarly, the greater the desired efficiency, the smaller
diff --git a/advsync/rt.tex b/advsync/rt.tex
index 71ab0661..e939a029 100644
--- a/advsync/rt.tex
+++ b/advsync/rt.tex
@@ -1149,8 +1149,9 @@ priority-inversion conundrum:
 \begin{enumerate}
 \item	Only allow one read-acquisition of a given reader-writer lock
-	at a time.  (This is the approach traditionally taken by
-	the Linux kernel's \rt\ patchset.)
+	at a time.
+	(This is the approach traditionally taken by the Linux
+	kernel's \rt\ patchset.)
 \item	Only allow $N$ read-acquisitions of a given reader-writer lock
 	at a time, where $N$ is the number of CPUs.
 \item	Only allow $N$ read-acquisitions of a given reader-writer lock
diff --git a/appendix/whymb/whymemorybarriers.tex b/appendix/whymb/whymemorybarriers.tex
index 7ec718bb..ea9fd14b 100644
--- a/appendix/whymb/whymemorybarriers.tex
+++ b/appendix/whymb/whymemorybarriers.tex
@@ -403,7 +403,8 @@ levels of the system architecture.
 	responses totally saturate the system bus?
 }\QuickQuizAnswerM{
 	It might, if large-scale multiprocessors were in fact implemented
-	that way.  Larger multiprocessors, particularly NUMA machines,
+	that way.
+	Larger multiprocessors, particularly NUMA machines,
 	tend to use so-called ``directory-based'' cache-coherence
 	protocols to avoid this and other problems.
 }\QuickQuizEndM
@@ -413,15 +414,18 @@
 	anyway, why bother with SMP at all?
 }\QuickQuizAnswerE{
 	There has been quite a bit of controversy on this topic over
-	the past few decades.  One answer is that the cache-coherence
+	the past few decades.
+	One answer is that the cache-coherence
 	protocols are quite simple, and therefore can be implemented
 	directly in hardware, gaining bandwidths and latencies
-	unattainable by software message passing.  Another answer is that
+	unattainable by software message passing.
+	Another answer is that
 	the real truth is to be found in economics due to the relative
 	prices of large SMP machines and that of clusters of smaller
-	SMP machines.  A third answer is that the SMP programming
-	model is easier to use than that of distributed systems, but
-	a rebuttal might note the appearance of HPC clusters and MPI\@.
+	SMP machines.
+	A third answer is that the SMP programming model is easier to
+	use than that of distributed systems, but a rebuttal might note
+	the appearance of HPC clusters and MPI\@.
 	And so the argument continues.
 }\QuickQuizEndE
 }
@@ -784,9 +788,10 @@ Suppose further that the cache line containing ``a'' resides only in
 CPU~1's cache, and that the cache line containing ``b'' is owned
 by CPU~0.
 Then the sequence of operations might be as follows:
 \begin{sequence}
-\item	CPU~0 executes \co{a = 1}.  The cache line is not in
-	CPU~0's cache, so CPU~0 places the new value of ``a'' in its
-	store buffer and transmits a ``read invalidate'' message.
+\item	CPU~0 executes \co{a = 1}.
+	The cache line is not in CPU~0's cache, so CPU~0 places the new
+	value of ``a'' in its store buffer and transmits a ``read
+	invalidate'' message.
 \label{seq:app:whymb:Store Buffers and Memory Barriers}
 \item	CPU~1 executes \co{while (b == 0) continue}, but the cache line
 	containing ``b'' is not in its cache.
@@ -853,9 +858,10 @@ applied.
 With this latter approach the sequence of operations might be as follows:
 \begin{sequence}
-\item	CPU~0 executes \co{a = 1}.  The cache line is not in
-	CPU~0's cache, so CPU~0 places the new value of ``a'' in its
-	store buffer and transmits a ``read invalidate'' message.
+\item	CPU~0 executes \co{a = 1}.
+	The cache line is not in CPU~0's cache, so CPU~0 places the new
+	value of ``a'' in its store buffer and transmits a ``read
+	invalidate'' message.
 \item	CPU~1 executes \co{while (b == 0) continue}, but the cache line
 	containing ``b'' is not in its cache.
 	It therefore transmits a ``read'' message.
@@ -1045,11 +1051,11 @@ void bar(void)
 Then the sequence of operations might be as follows:
 \begin{fcvref}[ln:app:whymb:Breaking mb]
 \begin{sequence}
-\item	CPU~0 executes \co{a = 1}.  The corresponding
-	cache line is read-only in
-	CPU~0's cache, so CPU~0 places the new value of ``a'' in its
-	store buffer and transmits an ``invalidate'' message in order
-	to flush the corresponding cache line from CPU~1's cache.
+\item	CPU~0 executes \co{a = 1}.
+	The corresponding cache line is read-only in CPU~0's cache, so
+	CPU~0 places the new value of ``a'' in its store buffer and
+	transmits an ``invalidate'' message in order to flush the
+	corresponding cache line from CPU~1's cache.
 \label{seq:app:whymb:Invalidate Queues and Memory Barriers}
 \item	CPU~1 executes \co{while (b == 0) continue}, but the cache line
 	containing ``b'' is not in its cache.
@@ -1186,11 +1192,11 @@ void bar(void)
 \begin{fcvref}[ln:app:whymb:Add mb]
 With this change, the sequence of operations might be as follows:
 \begin{sequence}
-\item	CPU~0 executes \co{a = 1}.  The corresponding
-	cache line is read-only in
-	CPU~0's cache, so CPU~0 places the new value of ``a'' in its
-	store buffer and transmits an ``invalidate'' message in order
-	to flush the corresponding cache line from CPU~1's cache.
+\item	CPU~0 executes \co{a = 1}.
+	The corresponding cache line is read-only in CPU~0's cache,
+	so CPU~0 places the new value of ``a'' in its store buffer and
+	transmits an ``invalidate'' message in order to flush the
+	corresponding cache line from CPU~1's cache.
 \item	CPU~1 executes \co{while (b == 0) continue}, but the cache line
 	containing ``b'' is not in its cache.
 	It therefore transmits a ``read'' message.
@@ -1335,15 +1341,17 @@ constraints~\cite{PaulMcKenney2005i,PaulMcKenney2005j}:
 	its own memory accesses in order?
 	Why or why not?
 }\QuickQuizAnswer{
-	No.  Consider the case where a thread migrates from one CPU to
+	No.
+	Consider the case where a thread migrates from one CPU to
 	another, and where the destination CPU perceives the source
-	CPU's recent memory operations out of order.  To preserve
-	user-mode sanity, kernel hackers must use memory barriers in
-	the context-switch path.  However, the locking already required
-	to safely do a context switch should automatically provide
-	the memory barriers needed to cause the user-level task to see
-	its own accesses in order.  That said, if you are designing a
-	super-optimized scheduler, either in the kernel or at user level,
+	CPU's recent memory operations out of order.
+	To preserve user-mode sanity, kernel hackers must use memory
+	barriers in the context-switch path.
+	However, the locking already required to safely do a context
+	switch should automatically provide the memory barriers needed
+	to cause the user-level task to see its own accesses in order.
+	That said, if you are designing a super-optimized scheduler,
+	either in the kernel or at user level,
 	please keep this scenario in mind!
 }\QuickQuizEnd
@@ -1422,7 +1430,8 @@ the assertion.
 	between CPU~1's ``while'' and assignment to ``c''?
 	Why or why not?
 }\QuickQuizAnswer{
-	No. Such a memory barrier would only force ordering local to CPU~1.
+	No.
+	Such a memory barrier would only force ordering local to CPU~1.
 	It would have no effect on the relative ordering of CPU~0's and
 	CPU~1's accesses, so the assertion could still fail.
 	However, all mainstream computer systems provide one mechanism
diff --git a/count/count.tex b/count/count.tex
index b89a566c..b69515a1 100644
--- a/count/count.tex
+++ b/count/count.tex
@@ -33,7 +33,7 @@ counting.
 }\EQuickQuizEnd
 \EQuickQuiz{
-	{ \bfseries Network-packet counting problem. }
+	{\bfseries Network-packet counting problem.}
 	Suppose that you need to collect statistics on the number
 	of networking packets transmitted and received.
 	Packets might be transmitted or received by any
 	CPU on the system.
@@ -62,7 +62,7 @@ counting.
 	\QuickQuizLabel{\QcountQstatcnt}
 \EQuickQuiz{
-	{ \bfseries Approximate structure-allocation limit problem. }
+	{\bfseries Approximate structure-allocation limit problem.}
 	Suppose that you need to maintain a count of the number of
 	structures allocated in order to fail any allocations once the
 	number of structures in use exceeds a limit
@@ -84,7 +84,7 @@ counting.
 	\QuickQuizLabel{\QcountQapproxcnt}
 \EQuickQuiz{
-	{ \bfseries Exact structure-allocation limit problem. }
+	{\bfseries Exact structure-allocation limit problem.}
 	Suppose that you need to maintain a count of the number of
 	structures allocated in order to fail any allocations once the
 	number of structures in use exceeds an exact limit
@@ -111,7 +111,7 @@ counting.
 	\QuickQuizLabel{\QcountQexactcnt}
 \EQuickQuiz{
-	{ \bfseries Removable I/O device access-count problem. }
+	{\bfseries Removable I/O device access-count problem.}
 	Suppose that you need to maintain a reference count on a
 	heavily used removable mass-storage device, so that you can
 	tell the user when it is safe to remove the device.
@@ -1829,7 +1829,8 @@ with exact limits.
 \section{Exact Limit Counters}
 \label{sec:count:Exact Limit Counters}
 %
-\epigraph{Exactitude can be expensive. Spend wisely.}{\emph{Unknown}}
+\epigraph{Exactitude can be expensive.
+	Spend wisely.}{\emph{Unknown}}
 
 To solve the exact structure-allocation limit problem noted in
 \QuickQuizRef{\QcountQexactcnt},
diff --git a/datastruct/datastruct.tex b/datastruct/datastruct.tex
index 26d5c556..6e18fda9 100644
--- a/datastruct/datastruct.tex
+++ b/datastruct/datastruct.tex
@@ -4,8 +4,9 @@
 \QuickQuizChapter{chp:Data Structures}{Data Structures}{qqzdatastruct}
 %
-\Epigraph{Bad programmers worry about the code. Good programmers worry
-	about data structures and their relationships.}
+\Epigraph{Bad programmers worry about the code.
+	Good programmers worry about data structures and their
+	relationships.}
 	{\emph{Linus Torvalds}}
 
 Serious discussions of algorithms include time complexity of their
@@ -124,9 +125,10 @@ permitting a hash table to access its elements extremely efficiently.
 In addition, each bucket has its own lock, so that elements in different
 buckets of the hash table may be added, deleted, and looked up completely
-independently. A large hash table with a large number of buckets (and
-thus locks), with each bucket containing a small number of elements
-should therefore provide excellent scalability.
+independently.
+A large hash table with a large number of buckets (and thus locks), with
+each bucket containing a small number of elements should therefore provide
+excellent scalability.
 
 \subsection{Hash-Table Implementation}
 \label{sec:datastruct:Hash-Table Implementation}
@@ -1806,7 +1808,8 @@ library~\cite{MathieuDesnoyers2009URCU}.
 \section{Other Data Structures}
 \label{sec:datastruct:Other Data Structures}
 %
-\epigraph{All life is an experiment. The more experiments you make the better.}
+\epigraph{All life is an experiment.
+	The more experiments you make the better.}
 	{\emph{Ralph Waldo Emerson}}
 
 The preceding sections have focused on data structures that enhance
diff --git a/debugging/debugging.tex b/debugging/debugging.tex
index 9d6e7c36..4c77453d 100644
--- a/debugging/debugging.tex
+++ b/debugging/debugging.tex
@@ -635,7 +635,8 @@ you already have a good test suite.
 \section{Tracing}
 \label{sec:debugging:Tracing}
 %
-\epigraph{The machine knows what is wrong. Make it tell you.}{\emph{Unknown}}
+\epigraph{The machine knows what is wrong.
+	Make it tell you.}{\emph{Unknown}}
 
 When all else fails, add a \co{printk()}!
 Or a \co{printf()}, if you are
 working with user-mode C-language applications.
@@ -2524,9 +2525,9 @@ This script takes three optional arguments as follows:
 	into, for example, a divisor of four means that the first
 	quarter of the data elements will be assumed to be good.
 	This defaults to three.
-\item [\lopt{relerr}\nf{:}] Relative measurement error. The script
-	assumes that values that differ by less than this error are for all
-	intents and purposes equal.
+\item [\lopt{relerr}\nf{:}] Relative measurement error.
+	The script assumes that values that differ by less than this
+	error are for all intents and purposes equal.
 	This defaults to 0.01, which is equivalent to 1\,\%.
 \item [\lopt{trendbreak}\nf{:}] Ratio of inter-element spacing
 	constituting a break in the trend of the data.
@@ -2720,7 +2721,8 @@ In short, validation always will require some measure of the behavior of
 the system.
 To be at all useful, this measure must be a severe summarization of
 the system, which in turn means that it can be misleading.
-So as the saying goes, ``Be careful. It is a real world out there.''
+So as the saying goes, ``Be careful.
+It is a real world out there.''
 
 But what if you are working on the Linux kernel, which as of 2017 was
 estimated to have more than 20 billion instances running throughout
diff --git a/defer/defer.tex b/defer/defer.tex
index 2d049cfe..3fecc2ac 100644
--- a/defer/defer.tex
+++ b/defer/defer.tex
@@ -7,8 +7,8 @@
 \Epigraph{All things come to those who wait.}{\emph{Violet Fane}}
 
 The strategy of deferring work goes back before the dawn of recorded
-history. It has occasionally been derided as procrastination or
-even as sheer laziness.
+history.
+It has occasionally been derided as procrastination or even as sheer laziness.
 However, in the last few decades workers have recognized this
 strategy's value in simplifying and streamlining
 parallel algorithms~\cite{Kung80,HMassalinPhD}.
 Believe it or not, ``laziness'' in parallel programming often outperforms and
diff --git a/defer/hazptr.tex b/defer/hazptr.tex
index e9264fb7..7c7dd831 100644
--- a/defer/hazptr.tex
+++ b/defer/hazptr.tex
@@ -214,9 +214,10 @@ Otherwise, the element's \co{->iface} field is returned to the caller.
 Note that line~\lnref{tryrecord} invokes \co{hp_try_record()} rather
 than the easier-to-use \co{hp_record()}, restarting the full search
 upon \co{hp_try_record()} failure.
-And such restarting is absolutely required for correctness. To see this,
-consider a hazard-pointer-protected linked list containing elements~A,
-B, and~C that is subjected to the following sequence of events:
+And such restarting is absolutely required for correctness.
+To see this, consider a hazard-pointer-protected linked list
+containing elements~A, B, and~C that is subjected to the following
+sequence of events:
 \end{fcvref}
 
 \begin{enumerate}
diff --git a/defer/rcufundamental.tex b/defer/rcufundamental.tex
index 6ff1bd6b..00b80d63 100644
--- a/defer/rcufundamental.tex
+++ b/defer/rcufundamental.tex
@@ -265,8 +265,9 @@ In the figure, \co{P0()}'s access to \co{y} follows \co{P1()}'s access
 to this same variable, and thus follows the
 grace period generated by \co{P1()}'s call to \co{synchronize_rcu()}.
 It is therefore guaranteed that \co{P0()}'s access to \co{x} will follow
-\co{P1()}'s access. In this case, if \co{r2}'s final value is 1, then
-\co{r1}'s final value is guaranteed to also be 1.
+\co{P1()}'s access.
+In this case, if \co{r2}'s final value is 1, then \co{r1}'s final value
+is guaranteed to also be 1.
 
 \QuickQuiz{
 	What would happen if the order of \co{P0()}'s two accesses was
diff --git a/easy/easy.tex b/easy/easy.tex
index 50a616cc..1ac5b419 100644
--- a/easy/easy.tex
+++ b/easy/easy.tex
@@ -60,7 +60,8 @@ things are covered in the next section.
 \label{sec:easy:Rusty Scale for API Design}
 %
 \epigraph{Finding the appropriate measurement is thus not a mathematical
-	exercise.  It is a risk-taking judgment.}
+	exercise.
+	It is a risk-taking judgment.}
 	{\emph{Peter Drucker}}
 % http://billhennessy.com/simple-strategies/2015/09/09/i-wish-drucker-never-said-it
 % Rusty is OK with this: July 19, 2006.
diff --git a/formal/dyntickrcu.tex b/formal/dyntickrcu.tex
index 98bdaf95..ea534b10 100644
--- a/formal/dyntickrcu.tex
+++ b/formal/dyntickrcu.tex
@@ -731,8 +731,9 @@ for the first condition:
 	and didn't take any interrupts, NMIs, SMIs, or whatever, then
 	it cannot be in the middle of an \co{rcu_read_lock()}, so the
 	next \co{rcu_read_lock()} it executes must use the new value
-	of the counter. So we can safely pretend that this CPU
-	already acknowledged the counter.
+	of the counter.
+	So we can safely pretend that this CPU already acknowledged
+	the counter.
 \end{quote}
 
 The first condition does match this, because if \qco{curr == snap}
@@ -1104,7 +1105,8 @@ states, passing without errors.
 \begin{quote}
 	Debugging is twice as hard as writing the code in the first
-	place. Therefore, if you write the code as cleverly as possible,
+	place.
+	Therefore, if you write the code as cleverly as possible,
 	you are, by definition, not smart enough to debug it.
 \end{quote}
 
@@ -1161,9 +1163,10 @@ This effort provided some lessons (re)learned:
 \item	{\bf Always verify your verification code.}
 	The usual way to do this is to insert a deliberate bug
-	and verify that the verification code catches it. Of course,
-	if the verification code fails to catch this bug, you may also
-	need to verify the bug itself, and so on, recursing infinitely.
+	and verify that the verification code catches it.
+	Of course, if the verification code fails to catch this bug,
+	you may also need to verify the bug itself, and so on,
+	recursing infinitely.
 	However, if you find yourself in this position, getting a good
 	night's sleep can be an extremely effective debugging technique.
diff --git a/formal/ppcmem.tex b/formal/ppcmem.tex
index 95be861a..019d0161 100644
--- a/formal/ppcmem.tex
+++ b/formal/ppcmem.tex
@@ -99,30 +99,33 @@ exists @lnlbl[assert:b]
 \begin{fcvref}[ln:formal:PPCMEM Litmus Test]
 In the example, \clnref{type} identifies the type of system (``ARM'' or
-``PPC'') and contains the title for the model. \Clnref{altname}
-provides a place for an
+``PPC'') and contains the title for the model.
+\Clnref{altname} provides a place for an
 alternative name for the test, which you will usually want to leave
-blank as shown in the above example. Comments can be inserted between
+blank as shown in the above example.
+Comments can be inserted between
 \clnref{altname,init:b} using the OCaml (or Pascal) syntax of
 \nbco{(* *)}.
 
 \Clnrefrange{init:b}{init:e} give initial values for all registers;
 each is of the form
 \co{P:R=V}, where \co{P} is the process identifier, \co{R} is the register
-identifier, and \co{V} is the value. For example, process 0's register
-r3 initially contains the value 2. If the value is a variable (\co{x},
-\co{y}, or \co{z} in the example) then the register is initialized to the
-address of the variable. It is also possible to initialize the contents
-of variables, for example, \co{x=1} initializes the value of \co{x} to
-1. Uninitialized variables default to the value zero, so that in the
+identifier, and \co{V} is the value.
+For example, process 0's register r3 initially contains the value~2.
+If the value is a variable (\co{x}, \co{y}, or \co{z} in the example)
+then the register is initialized to the address of the variable.
+It is also possible to initialize the contents of variables, for example,
+\co{x=1} initializes the value of \co{x} to~1.
+Uninitialized variables default to the value zero, so that in the
 example, \co{x}, \co{y}, and \co{z} are all initially zero.
 \Clnref{procid} provides identifiers for the two processes, so that
 the \co{0:r3=2} on \clnref{init:0} could instead have been written
-\co{P0:r3=2}. \Clnref{procid} is
-required, and the identifiers must be of the form \co{Pn}, where \co{n}
-is the column number, starting from zero for the left-most column. This
-may seem unnecessarily strict, but it does prevent considerable confusion
-in actual use.
+\co{P0:r3=2}.
+\Clnref{procid} is required, and the identifiers must be of the form
+\co{Pn}, where \co{n} is the column number, starting from zero for
+the left-most column.
+This may seem unnecessarily strict, but it does prevent considerable
+confusion in actual use.
 \end{fcvref}
 
 \QuickQuiz{
@@ -149,23 +152,23 @@ A given process can have empty lines, as is the case for P0's
 \clnref{P0empty} and P1's \clnrefrange{P1empty:b}{P1empty:e}.
 Labels and branches are permitted, as demonstrated by the branch
 on \clnref{P0bne} to the label on \clnref{P0fail1}.
-That said, too-free use of branches
-will expand the state space. Use of loops is a particularly good way to
-explode your state space.
+That said, too-free use of branches will expand the state space.
+Use of loops is a particularly good way to explode your state space.
 
 \Clnrefrange{assert:b}{assert:e} show the assertion, which in this case
-indicates that we
-are interested in whether P0's and P1's r3 registers can both contain
-zero after both threads complete execution. This assertion is important
-because there are a number of use cases that would fail miserably if
-both P0 and P1 saw zero in their respective r3 registers.
-
-This should give you enough information to construct simple litmus
-tests. Some additional documentation is available, though much of this
+indicates that we are interested in whether P0's and P1's r3 registers
+can both contain zero after both threads complete execution.
+This assertion is important because there are a number of use cases
+that would fail miserably if both P0 and P1 saw zero in their
+respective r3 registers.
+
+This should give you enough information to construct simple litmus tests.
+Some additional documentation is available, though much of this
 additional documentation is intended for a different research tool that
-runs tests on actual hardware. Perhaps more importantly, a large number of
-pre-existing litmus tests are available with the online tool (available
-via the ``Select ARM Test'' and ``Select POWER Test'' buttons at
+runs tests on actual hardware.
+Perhaps more importantly, a large number of pre-existing litmus tests
+are available with the online tool (available via the ``Select ARM Test''
+and ``Select POWER Test'' buttons at
 \url{https://www.cl.cam.ac.uk/~pes20/ppcmem/}).
 It is quite likely that one of these pre-existing litmus tests will
 answer your Power or \ARM\ memory-ordering question.
@@ -175,17 +178,18 @@
 P0's \clnref{reginit,stw} are equivalent to the C statement \co{x=1}
 because \clnref{init:0} defines P0's register \co{r2} to be the address
-of \co{x}. P0's \clnref{P0lwarx,P0stwcx} are the mnemonics for
-load-linked (``load register
-exclusive'' in \ARM\ parlance and ``load reserve'' in Power parlance)
-and store-conditional (``store register exclusive'' in \ARM\ parlance),
-respectively. When these are used together, they form an atomic
-instruction sequence, roughly similar to the \IXacrml{cas} sequences
-exemplified by the x86 \co{lock;cmpxchg} instruction. Moving to a higher
-level of abstraction, the sequence from \clnrefrange{P0lwsync}{P0isync}
+of~\co{x}.
+P0's \clnref{P0lwarx,P0stwcx} are the mnemonics for load-linked
+(``load register exclusive'' in \ARM\ parlance and ``load reserve''
+in Power parlance) and store-conditional (``store register exclusive''
+in \ARM\ parlance), respectively.
+When these are used together, they form an atomic instruction sequence,
+roughly similar to the \IXacrml{cas} sequences exemplified by the
+x86 \co{lock;cmpxchg} instruction.
+Moving to a higher level of abstraction, the sequence from
+\clnrefrange{P0lwsync}{P0isync}
 is equivalent to the Linux kernel's \co{atomic_add_return(&z, 0)}.
-Finally, \clnref{P0lwz} is
-roughly equivalent to the C statement \co{r3=y}.
+Finally, \clnref{P0lwz} is roughly equivalent to the C statement \co{r3=y}.
 
 P1's \clnref{reginit,stw} are equivalent to the C statement \co{y=1},
 \clnref{P1sync}
 and \clnref{P1lwz} is equivalent to the C statement \co{r3=x}.
 	The implementation of powerpc version of \co{atomic_add_return()}
 	loops when the \co{stwcx} instruction fails, which it communicates
 	by setting non-zero status in the condition-code register,
-	which in turn is tested by the \co{bne} instruction. Because actually
-	modeling the loop would result in state-space explosion, we
-	instead branch to the Fail: label, terminating the model with
-	the initial value of 2 in P0's \co{r3} register, which
-	will not trigger the exists assertion.
+	which in turn is tested by the \co{bne} instruction.
+	Because actually modeling the loop would result in state-space
+	explosion, we instead branch to the \co{Fail:} label,
+	terminating the model with the initial value of~2 in P0's \co{r3}
+	register, which will not trigger the exists assertion.
 
 	There is some debate about whether this trick is universally
 	applicable, but I have not seen an example where it fails.
@@ -369,38 +373,42 @@ cannot happen.
 \label{sec:formal:PPCMEM Discussion}
 
 These tools promise to be of great help to people working on low-level
-parallel primitives that run on \ARM\ and on Power. These tools do have
-some intrinsic limitations:
+parallel primitives that run on \ARM\ and on Power.
+These tools do have some intrinsic limitations:
 
 \begin{enumerate}
 \item	These tools are research prototypes, and as such are unsupported.
 \item	These tools do not constitute official statements by IBM or \ARM\
-	on their respective CPU architectures. For example, both
-	corporations reserve the right to report a bug at any time against
-	any version of any of these tools. These tools are therefore not a
-	substitute for careful stress testing on real hardware. Moreover,
-	both the tools and the model that they are based on are under
-	active development and might change at any time. On the other
-	hand, this model was developed in consultation with the relevant
-	hardware experts, so there is good reason to be confident that
-	it is a robust representation of the architectures.
+	on their respective CPU architectures.
+	For example, both corporations reserve the right to report a bug
+	at any time against any version of any of these tools.
+	These tools are therefore not a substitute for careful stress
+	testing on real hardware.
+	Moreover, both the tools and the model that they are based on
+	are under active development and might change at any time.
+	On the other hand, this model was developed in consultation
+	with the relevant hardware experts, so there is good reason to be
+	confident that it is a robust representation of the architectures.
 \item	These tools currently handle a subset of the instruction set.
 	This subset has been sufficient for my purposes, but your mileage
-	may vary. In particular, the tool handles only word-sized accesses
-	(32 bits), and the words accessed must be properly aligned.\footnote{
+	may vary.
+ In particular, the tool handles only word-sized accesses (32 bits), + and the words accessed must be properly aligned.\footnote{ But recent work focuses on mixed-size accesses~\cite{Flur:2017:MCA:3093333.3009839}.} In addition, the tool does not handle some of the weaker variants of the \ARM\ memory-barrier instructions, nor does it handle arithmetic. \item The tools are restricted to small loop-free code fragments - running on small numbers of threads. Larger examples result + running on small numbers of threads. + Larger examples result in state-space explosion, just as with similar tools such as Promela and spin. \item The full state-space search does not give any indication of how - each offending state was reached. That said, once you realize - that the state is in fact reachable, it is usually not too hard - to find that state using the interactive tool. + each offending state was reached. + That said, once you realize that the state is in fact reachable, + it is usually not too hard to find that state using the + interactive tool. \item These tools are not much good for complex data structures, although it is possible to create and traverse extremely simple linked lists using initialization statements of the form @@ -409,42 +417,46 @@ some intrinsic limitations: Of course, handling such things would require that they be formalized, which does not appear to be in the offing. \item The tools will detect only those problems for which you code an - assertion. This weakness is common to all formal methods, and - is yet another reason why testing remains important. In the - immortal words of Donald Knuth quoted at the beginning of this - chapter, ``Beware of bugs in the above - code; I have only proved it correct, not tried it.'' + assertion. + This weakness is common to all formal methods, and is yet another + reason why testing remains important. 
+ In the immortal words of Donald Knuth quoted at the beginning of + this chapter, ``Beware of bugs in the above code; + I have only proved it correct, not tried it.'' \end{enumerate} That said, one strength of these tools is that they are designed to model the full range of behaviors allowed by the architectures, including behaviors that are legal, but which current hardware implementations do -not yet inflict on unwary software developers. Therefore, an algorithm -that is vetted by these tools likely has some additional safety margin -when running on real hardware. Furthermore, testing on real hardware can -only find bugs; such testing is inherently incapable of proving a given -usage correct. To appreciate this, consider that the researchers -routinely ran in excess of 100 billion test runs on real hardware to -validate their model. +not yet inflict on unwary software developers. +Therefore, an algorithm that is vetted by these tools likely has some +additional safety margin when running on real hardware. +Furthermore, testing on real hardware can only find bugs; such testing +is inherently incapable of proving a given usage correct. +To appreciate this, consider that the researchers routinely ran in excess +of 100 billion test runs on real hardware to validate their model. In one case, behavior that is allowed by the architecture did not occur, despite 176 billion runs~\cite{JadeAlglave2011ppcmem}. In contrast, the full-state-space search allows the tool to prove code fragments correct. It is worth repeating that formal methods and tools are no substitute for -testing. The fact is that producing large reliable concurrent software -artifacts, the Linux kernel for example, is quite difficult. Developers -must therefore be prepared to apply every tool at their disposal towards -this goal. The tools presented in this chapter are able to locate bugs that -are quite difficult to produce (let alone track down) via testing. 
On the -other hand, testing can be applied to far larger bodies of software than -the tools presented in this chapter are ever likely to handle. As always, -use the right tools for the job! +testing. +The fact is that producing large reliable concurrent software artifacts, +the Linux kernel for example, is quite difficult. +Developers must therefore be prepared to apply every tool at their +disposal towards this goal. +The tools presented in this chapter are able to locate bugs that are +quite difficult to produce (let alone track down) via testing. +On the other hand, testing can be applied to far larger bodies of software +than the tools presented in this chapter are ever likely to handle. +As always, use the right tools for the job! Of course, it is always best to avoid the need to work at this level by designing your parallel code to be easily partitioned and then using higher-level primitives (such as locks, sequence counters, atomic -operations, and RCU) to get your job done more straightforwardly. And even -if you absolutely must use low-level memory barriers and read-modify-write -instructions to get your job done, the more conservative your use of -these sharp instruments, the easier your life is likely to be. +operations, and RCU) to get your job done more straightforwardly. +And even if you absolutely must use low-level memory barriers and +read-modify-write instructions to get your job done, the more +conservative your use of these sharp instruments, the easier your life +is likely to be. diff --git a/formal/spinhint.tex b/formal/spinhint.tex index 305a7014..d05bab16 100644 --- a/formal/spinhint.tex +++ b/formal/spinhint.tex @@ -244,12 +244,13 @@ Given a source file \path{qrcu.spin}, one can use the following commands: \item [\tco{spin -a qrcu.spin}] Create a file \path{pan.c} that fully searches the state machine. \item [\tco{cc -DSAFETY [-DCOLLAPSE] [-DMA=N] -o pan pan.c}] - Compile the generated state-machine search. 
The \co{-DSAFETY} - generates optimizations that are appropriate if you have only - assertions (and perhaps \co{never} statements). If you have - liveness, fairness, or forward-progress checks, you may need - to compile without \co{-DSAFETY}. If you leave off \co{-DSAFETY} - when you could have used it, the program will let you know. + Compile the generated state-machine search. + The \co{-DSAFETY} generates optimizations that are appropriate + if you have only assertions (and perhaps \co{never} statements). + If you have liveness, fairness, or forward-progress checks, + you may need to compile without \co{-DSAFETY}. + If you leave off \co{-DSAFETY} when you could have used it, + the program will let you know. The optimizations produced by \co{-DSAFETY} greatly speed things up, so you should use it when you can. @@ -263,9 +264,10 @@ Given a source file \path{qrcu.spin}, one can use the following commands: Another optional flag \co{-DMA=N} generates code for a slow but aggressive state-space memory compression mode. \item [\tco{./pan [-mN] [-wN]}] - This actually searches the state space. The number of states - can reach into the tens of millions with very small state - machines, so you will need a machine with large memory. + This actually searches the state space. + The number of states can reach into the tens of millions with + very small state machines, so you will need a machine with + large memory. For example, \path{qrcu.spin} with 3~updaters and 2~readers required 10.5\,GB of memory even with the \co{-DCOLLAPSE} flag. @@ -276,23 +278,23 @@ Given a source file \path{qrcu.spin}, one can use the following commands: The \co{-wN} option specifies the hashtable size. The default for full state-space search is \co{-w24}.\footnote{ - As of Spin Version 6.4.6 and 6.4.8. In the online manual of - Spin dated 10 July 2011, the default for exhaustive search - mode is said to be \co{-w19}, which does not meet - the actual behavior.} + As of Spin Version 6.4.6 and 6.4.8. 
+ In the online manual of Spin dated 10 July 2011, the + default for exhaustive search mode is said to be \co{-w19}, + which does not meet the actual behavior.} If you aren't sure whether your machine has enough memory, - run \co{top} in one window and \co{./pan} in another. Keep the - focus on the \co{./pan} window so that you can quickly kill - execution if need be. As soon as CPU time drops much below - 100\,\%, kill \co{./pan}. If you have removed focus from the - window running \co{./pan}, you may wait a long time for the - windowing system to grab enough memory to do anything for - you. + run \co{top} in one window and \co{./pan} in another. + Keep the focus on the \co{./pan} window so that you can quickly + kill execution if need be. + As soon as CPU time drops much below 100\,\%, kill \co{./pan}. + If you have removed focus from the window running \co{./pan}, + you may wait a long time for the windowing system to grab + enough memory to do anything for you. Another option to avoid memory exhaustion is the - \co{-DMEMLIM=N} compiler flag. \co{-DMEMLIM=2000} - would set the maximum of 2\,GB. + \co{-DMEMLIM=N} compiler flag. + \co{-DMEMLIM=2000} would set the maximum of 2\,GB. Don't forget to capture the output, especially if you are working on a remote machine. @@ -320,7 +322,8 @@ Promela will provide some surprises to people used to coding in C, C++, or Java. \begin{enumerate} -\item In C, ``\co{;}'' terminates statements. In Promela it separates them. +\item In C, ``\co{;}'' terminates statements. + In Promela it separates them. Fortunately, more recent versions of Spin have become much more forgiving of ``extra'' semicolons. \item Promela's looping construct, the \co{do} statement, takes @@ -328,44 +331,52 @@ C++, or Java. This \co{do} statement closely resembles a looping if-then-else statement. \item In C's \co{switch} statement, if there is no matching case, the whole - statement is skipped. 
In Promela's equivalent, confusingly called - \co{if}, if there is no matching guard expression, you get an error - without a recognizable corresponding error message. + statement is skipped. + In Promela's equivalent, confusingly called \co{if}, if there is + no matching guard expression, you get an error without a + recognizable corresponding error message. So, if the error output indicates an innocent line of code, check to see if you left out a condition from an \co{if} or \co{do} statement. \item When creating stress tests in C, one usually races suspect operations - against each other repeatedly. In Promela, one instead sets up - a single race, because Promela will search out all the possible - outcomes from that single race. Sometimes you do need to loop - in Promela, for example, if multiple operations overlap, but + against each other repeatedly. + In Promela, one instead sets up a single race, because Promela + will search out all the possible outcomes from that single race. + Sometimes you do need to loop in Promela, for example, + if multiple operations overlap, but doing so greatly increases the size of your state space. \item In C, the easiest thing to do is to maintain a loop counter to track - progress and terminate the loop. In Promela, loop counters - must be avoided like the plague because they cause the state - space to explode. On the other hand, there is no penalty for - infinite loops in Promela as long as none of the variables - monotonically increase or decrease---Promela will figure out - how many passes through the loop really matter, and automatically - prune execution beyond that point. + progress and terminate the loop. + In Promela, loop counters must be avoided like the plague + because they cause the state space to explode. 
+ On the other hand, there is no penalty for infinite loops in + Promela as long as none of the variables monotonically increase + or decrease---Promela will figure out how many passes through + the loop really matter, and automatically prune execution beyond + that point. \item In C torture-test code, it is often wise to keep per-task control - variables. They are cheap to read, and greatly aid in debugging the - test code. In Promela, per-task control variables should be used - only when there is no other alternative. To see this, consider - a 5-task verification with one bit each to indicate completion. - This gives 32 states. In contrast, a simple counter would have - only six states, more than a five-fold reduction. That factor - of five might not seem like a problem, at least not until you - are struggling with a verification program possessing more than - 150 million states consuming more than 10\,GB of memory! + variables. + They are cheap to read, and greatly aid in debugging the test code. + In Promela, per-task control variables should be used only when + there is no other alternative. + To see this, consider a 5-task verification with one bit each + to indicate completion. + This gives 32 states. + In contrast, a simple counter would have only six states, + more than a five-fold reduction. + That factor of five might not seem like a problem, at least + not until you are struggling with a verification program + possessing more than 150 million states consuming more + than 10\,GB of memory! \item One of the most challenging things both in C torture-test code and - in Promela is formulating good assertions. Promela also allows - \co{never} claims that act like an assertion replicated - between every line of code. + in Promela is formulating good assertions. + Promela also allows \co{never} claims that act like an assertion + replicated between every line of code. 
\item Dividing and conquering is extremely helpful in Promela in keeping - the state space under control. Splitting a large model into two - roughly equal halves will result in the state space of each - half being roughly the square root of the whole. + the state space under control. + Splitting a large model into two roughly equal halves will result + in the state space of each half being roughly the square root of + the whole. For example, a million-state combined model might reduce to a pair of thousand-state models. Not only will Promela handle the two smaller models much more @@ -382,10 +393,11 @@ is a bit abusive. The following tricks can help you to abuse Promela safely: \begin{enumerate} -\item Memory reordering. Suppose you have a pair of statements - copying globals x and y to locals r1 and r2, where ordering - matters (e.g., unprotected by locks), but where you have - no memory barriers. This can be modeled in Promela as follows: +\item Memory reordering. + Suppose you have a pair of statements copying globals x and y + to locals r1 and r2, where ordering matters + (e.g., unprotected by locks), but where you have no memory barriers. + This can be modeled in Promela as follows: \begin{VerbatimN}[samepage=true] if @@ -405,10 +417,11 @@ fi if used too heavily. In addition, it requires you to anticipate possible reorderings. -\item State reduction. If you have complex assertions, evaluate - them under \co{atomic}. After all, they are not part of the - algorithm. One example of a complex assertion (to be discussed - in more detail later) is as shown in +\item State reduction. + If you have complex assertions, evaluate them under \co{atomic}. + After all, they are not part of the algorithm. + One example of a complex assertion (to be discussed in more + detail later) is as shown in Listing~\ref{lst:formal:Complex Promela Assertion}. 
There is no reason to evaluate this assertion @@ -588,9 +601,9 @@ As expected, this run has no assertion failures (``errors: 0''). \item The declaration of \co{sum} should be moved to within the init block, since it is not used anywhere else. \item The assertion code should be moved outside of the - initialization loop. The initialization loop can - then be placed in an atomic block, greatly reducing - the state space (by how much?). + initialization loop. + The initialization loop can then be placed in an atomic + block, greatly reducing the state space (by how much?). \item The atomic block covering the assertion code should be extended to include the initialization of \co{sum} and \co{j}, and also to cover the assertion. @@ -787,7 +800,8 @@ this update still be in progress. \end{fcvref} }\QuickQuizAnswerB{ Because those operations are for the benefit of the - assertion only. They are not part of the algorithm itself. + assertion only. + They are not part of the algorithm itself. There is therefore no harm in marking them atomic, and so marking them greatly reduces the state space that must be searched by the Promela model. @@ -800,7 +814,8 @@ this update still be in progress. \emph{really} necessary? \end{fcvref} }\QuickQuizAnswerE{ - Yes. To see this, delete these lines and run the model. + Yes. + To see this, delete these lines and run the model. Alternatively, consider the following sequence of steps: @@ -810,7 +825,8 @@ this update still be in progress. the value of \co{ctr[1]} is two. \item An updater starts executing, and sees that the sum of the counters is two so that the fastpath cannot be - executed. It therefore acquires the lock. + executed. + It therefore acquires the lock. \item A second updater starts executing, and fetches the value of \co{ctr[0]}, which is zero. 
\item The first updater adds one to \co{ctr[0]}, flips diff --git a/glossary.tex b/glossary.tex index 98a27438..e8d93c90 100644 --- a/glossary.tex +++ b/glossary.tex @@ -284,10 +284,11 @@ set of critical sections guarded by that lock, while a ``reader-writer lock'' permits any number of reading threads, or but one writing thread, into the set of critical - sections guarded by that lock. (Just to be clear, the presence - of a writer thread in any of a given reader-writer lock's - critical sections will prevent any reader from entering - any of that lock's critical sections and vice versa.) + sections guarded by that lock. + (Just to be clear, the presence of a writer thread in any of + a given reader-writer lock's critical sections will prevent + any reader from entering any of that lock's critical sections + and vice versa.) \item[\IX{Lock Contention}:] A lock is said to be suffering contention when it is being used so heavily that there is often a CPU waiting on it. diff --git a/intro/intro.tex b/intro/intro.tex index 4f38f376..812b34fd 100644 --- a/intro/intro.tex +++ b/intro/intro.tex @@ -572,7 +572,8 @@ programming environments: Its productivity is believed by many to be even lower than that of C/C++ ``locking plus threads'' environments. \item[OpenMP:] This set of compiler directives can be used - to parallelize loops. It is thus quite specific to this + to parallelize loops. + It is thus quite specific to this task, and this specificity often limits its performance. It is, however, much easier to use than MPI or C/C++ ``locking plus threads.'' @@ -834,17 +835,21 @@ reduce the amount of data that must be read. }\QuickQuizAnswer{ There are any number of potential bottlenecks: \begin{enumerate} - \item Main memory. If a single thread consumes all available + \item Main memory. + If a single thread consumes all available memory, additional threads will simply page themselves silly. - \item Cache. If a single thread's cache footprint completely + \item Cache. 
+ If a single thread's cache footprint completely fills any shared CPU cache(s), then adding more threads will simply thrash those affected caches, as will be seen in \cref{chp:Data Structures}. - \item Memory bandwidth. If a single thread consumes all available + \item Memory bandwidth. + If a single thread consumes all available memory bandwidth, additional threads will simply result in additional queuing on the system interconnect. - \item I/O bandwidth. If a single thread is I/O bound, + \item I/O bandwidth. + If a single thread is I/O bound, adding more threads will simply result in them all waiting in line for the affected I/O resource. \end{enumerate} @@ -960,11 +965,13 @@ overlap computation and I/O so as to fully utilize I/O devices. There are any number of potential limits on the number of threads: \begin{enumerate} - \item Main memory. Each thread consumes some memory + \item Main memory. + Each thread consumes some memory (for its stack if nothing else), so that excessive numbers of threads can exhaust memory, resulting in excessive paging or memory-allocation failures. - \item I/O bandwidth. If each thread initiates a given + \item I/O bandwidth. + If each thread initiates a given amount of mass-storage I/O or networking traffic, excessive numbers of threads can result in excessive I/O queuing delays, again degrading performance. @@ -1239,8 +1246,10 @@ monograph~\cite{AndrewDBirrell1989Threads} is especially telling: \begin{quote} Writing concurrent programs has a reputation for being exotic - and difficult. I~believe it is neither. You need a system - that provides you with good primitives and suitable libraries, + and difficult. + I~believe it is neither. + You need a system that provides you with good primitives + and suitable libraries, you need a basic caution and carefulness, you need an armory of useful techniques, and you need to know of the common pitfalls. I~hope that this paper has helped you towards sharing my belief. 
diff --git a/legal.tex b/legal.tex index 1dd6c6bb..f443df37 100644 --- a/legal.tex +++ b/legal.tex @@ -31,10 +31,11 @@ States license.\footnote{ \url{https://creativecommons.org/licenses/by-sa/3.0/us/}} In brief, you may use the contents of this document for any purpose, personal, commercial, or otherwise, so long as attribution to the -authors is maintained. Likewise, the document may be modified, -and derivative works and translations made available, so long as -such modifications and derivations are offered to the public on equal -terms as the non-source-code text and images in the original document. +authors is maintained. +Likewise, the document may be modified, and derivative works and +translations made available, so long as such modifications and +derivations are offered to the public on equal terms as the +non-source-code text and images in the original document. Source code is covered by various versions of the GPL\@.\footnote{ \url{https://www.gnu.org/licenses/gpl-2.0.html}} diff --git a/locking/locking.tex b/locking/locking.tex index 6bc93e49..406c5034 100644 --- a/locking/locking.tex +++ b/locking/locking.tex @@ -1710,7 +1710,8 @@ be required for the foreseeable future. \label{sec:locking:Locking Implementation Issues} % \epigraph{When you translate a dream into reality, it's never a full - implementation. It is easier to dream than to do.} + implementation. + It is easier to dream than to do.} {\emph{Shai Agassi}} Developers are almost always best-served by using whatever locking diff --git a/memorder/memorder.tex b/memorder/memorder.tex index 397fb101..8c9be547 100644 --- a/memorder/memorder.tex +++ b/memorder/memorder.tex @@ -2136,7 +2136,8 @@ page~\pageref{fig:memorder:A Variable With More Simultaneous Values}. (\path{C-2+2W+o-o+o-o.litmus}). }\QuickQuizEnd -But sometimes time really is on our side. Read on! +But sometimes time really is on our side. +Read on! 
\subsubsection{Happens-Before} \label{sec:memorder:Happens-Before} @@ -3158,8 +3159,9 @@ The following list of rules summarizes the lessons of this section: \end{enumerate} Again, many popular languages were designed with single-threaded use -in mind. Successful multithreaded use of these languages requires you -to pay special attention to your memory references and dependencies. +in mind. +Successful multithreaded use of these languages requires you to pay +special attention to your memory references and dependencies. \section{Higher-Level Primitives} \label{sec:memorder:Higher-Level Primitives} @@ -3868,8 +3870,8 @@ can make portability a challenge, as indicated by In fact, some software environments simply prohibit direct use of memory-ordering operations, restricting the programmer to mutual-exclusion primitives that incorporate them to the extent that -they are required. Please note that this section is not intended to be -a reference manual +they are required. +Please note that this section is not intended to be a reference manual covering all (or even most) aspects of each CPU family, but rather a high-level overview providing a rough comparison. For full details, see the reference manual for the CPU of interest. diff --git a/owned/owned.tex b/owned/owned.tex index aa4c0901..df2a1cce 100644 --- a/owned/owned.tex +++ b/owned/owned.tex @@ -4,7 +4,10 @@ \QuickQuizChapter{chp:Data Ownership}{Data Ownership}{qqzowned} % -\Epigraph{It is mine, I tell you. My own. My precious. Yes, my precious.} +\Epigraph{It is mine, I tell you. + My own. + My precious. 
+ Yes, my precious.} {\emph{Gollum in ``The Fellowship of the Ring'', J.R.R.~Tolkien}} One of the simplest ways to avoid the synchronization overhead that diff --git a/together/refcnt.tex b/together/refcnt.tex index 57a1754e..ae8644e4 100644 --- a/together/refcnt.tex +++ b/together/refcnt.tex @@ -5,8 +5,8 @@ \section{Refurbish Reference Counting} \label{sec:together:Refurbish Reference Counting} % -\epigraph{Counting is the religion of this generation. It is its - hope and its salvation.} +\epigraph{Counting is the religion of this generation. + It is its hope and its salvation.} {\emph{Gertrude Stein}} Although reference counting is a conceptually simple technique, diff --git a/toolsoftrade/toolsoftrade.tex b/toolsoftrade/toolsoftrade.tex index 77a3790d..b358c22d 100644 --- a/toolsoftrade/toolsoftrade.tex +++ b/toolsoftrade/toolsoftrade.tex @@ -289,7 +289,8 @@ in which the child sets a global variable \co{x} to 1 on line~\lnref{setx}, prints a message on line~\lnref{print:c}, and exits on line~\lnref{exit:s}. The parent continues at line~\lnref{waitall}, where it waits on the child, and on line~\lnref{print:p} finds that its copy of the variable \co{x} is -still zero. The output is thus as follows: +still zero. +The output is thus as follows: \end{fcvref} \begin{VerbatimU} -- 2.17.1