On Sat, Oct 14, 2017 at 07:57:24AM +0900, Akira Yokosawa wrote:
> >From c8849b06b1563d9d82dfaf0f120932c4998c8b03 Mon Sep 17 00:00:00 2001
> From: Akira Yokosawa <akiyks@xxxxxxxxx>
> Date: Sat, 14 Oct 2017 00:12:52 +0900
> Subject: [PATCH 1/2] Convert code snippets to 'listing' env (together, advsync, rt, future)
>
> Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx>

These also look good, applied and pushed, thank you!

							Thanx, Paul

> ---
>  advsync/advsync.tex   | 10 ++++-----
>  future/htm.tex        | 16 +++++++--------
>  rt/rt.tex             | 34 +++++++++++++++----------------
>  together/applyrcu.tex | 56 +++++++++++++++++++++++++--------------------------
>  together/refcnt.tex   | 38 +++++++++++++++++-----------------
>  5 files changed, 77 insertions(+), 77 deletions(-)
>
> diff --git a/advsync/advsync.tex b/advsync/advsync.tex
> index adf1dc9..3b04aff 100644
> --- a/advsync/advsync.tex
> +++ b/advsync/advsync.tex
> @@ -188,7 +188,7 @@ algorithm as wait-free.
>  This algorithm is probably the most heavily used NBS algorithm in
>  the Linux kernel.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 static inline bool
> @@ -217,14 +217,14 @@ the Linux kernel.
>  \centering
>  \theverbbox
>  \caption{NBS Enqueue Algorithm}
> -\label{fig:count:NBS Enqueue Algorithm}
> -\end{figure}
> +\label{lst:count:NBS Enqueue Algorithm}
> +\end{listing}
>
>  Another common NBS algorithm is the atomic queue where elements are
>  enqueued using an atomic exchange instruction~\cite{MagedMichael1993JPDC},
>  followed by a store into the \co{->next} pointer of the new element's
>  predecessor, as shown in
> -Figure~\ref{fig:count:NBS Enqueue Algorithm},
> +Listing~\ref{lst:count:NBS Enqueue Algorithm},
>  which shows the userspace-RCU library
>  implementation~\cite{MathieuDesnoyers2009URCU}.
>  Line~9 updates the tail pointer to reference the new element while
> @@ -240,7 +240,7 @@ Although mutual exclusion is required to dequeue a single element
>  removal of the entire contents of the queue.
>  What is not possible is to dequeue any given element in a non-blocking
>  manner: The enqueuer might have failed between lines~9 and~10 of the
> -figure, so that the element in question is only partially enqueued.
> +listing, so that the element in question is only partially enqueued.
>  This results in a half-NBS algorithm where enqueues are NBS but
>  dequeues are blocking.
>  This algorithm is nevertheless used in practice, in part because
> diff --git a/future/htm.tex b/future/htm.tex
> index 80d3e1e..7c9f610 100644
> --- a/future/htm.tex
> +++ b/future/htm.tex
> @@ -733,7 +733,7 @@ Therefore, the medium-priority threads are in effect blocking the
>  high-priority process, which is the rationale for the name ``priority
>  inversion.''
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 void boostee(void)
> @@ -765,8 +765,8 @@ inversion.''
>  \centering
>  \theverbbox
>  \caption{Exploiting Priority Boosting}
> -\label{fig:future:Exploiting Priority Boosting}
> -\end{figure}
> +\label{lst:future:Exploiting Priority Boosting}
> +\end{listing}
>
>  One way to avoid priority inversion is \emph{priority inheritance},
>  in which a high-priority thread blocked on a lock temporarily donates
> @@ -774,10 +774,10 @@ its priority to the lock's holder, which is also called \emph{priority
>  boosting}.
>  However, priority boosting can be used for things other than avoiding
>  priority inversion, as shown in
> -Figure~\ref{fig:future:Exploiting Priority Boosting}.
> -Lines~1-12 of this figure show a low-priority process that must
> +Listing~\ref{lst:future:Exploiting Priority Boosting}.
> +Lines~1-12 of this listing show a low-priority process that must
>  nevertheless run every millisecond or so, while lines~14-24 of
> -this same figure show a high-priority process that uses priority
> +this same listing show a high-priority process that uses priority
>  boosting to ensure that \co{boostee()} runs periodically as needed.
>
>  The \co{boostee()} function arranges this by always holding one of
> @@ -786,7 +786,7 @@ the two \co{boost_lock[]} locks, so that lines~20-21 of
>
>  \QuickQuiz{}
>  	But the \co{boostee()} function in
> -	Figure~\ref{fig:future:Exploiting Priority Boosting}
> +	Listing~\ref{lst:future:Exploiting Priority Boosting}
>  	alternatively acquires its locks in reverse order!
>  	Won't this result in deadlock?
>  \QuickQuizAnswer{
> @@ -816,7 +816,7 @@ this thread might never again get a chance to run.
>
>  And if the \co{boostee()} thread is not holding the lock, then
>  the \co{booster()} thread's empty critical section on lines~20 and~21 of
> -Figure~\ref{fig:future:Exploiting Priority Boosting}
> +Listing~\ref{lst:future:Exploiting Priority Boosting}
>  will become an empty transaction that has no effect, so that
>  \co{boostee()} never runs.
>  This example illustrates some of the subtle consequences of
> diff --git a/rt/rt.tex b/rt/rt.tex
> index f1c0ae1..d9c32aa 100644
> --- a/rt/rt.tex
> +++ b/rt/rt.tex
> @@ -1150,7 +1150,7 @@ sections~\cite{DinakarGuniguntala2008IBMSysJ}.
>  Otherwise, long RCU read-side critical sections would result in
>  excessive real-time latencies.
>
> -\begin{figure}[tb]
> +\begin{listing}[tb]
>  { \scriptsize
>  \begin{verbbox}
>    1 void __rcu_read_lock(void)
> @@ -1180,8 +1180,8 @@ excessive real-time latencies.
>  \centering
>  \theverbbox
>  \caption{Preemptible Linux-Kernel RCU}
> -\label{fig:rt:Preemptible Linux-Kernel RCU}
> -\end{figure}
> +\label{lst:rt:Preemptible Linux-Kernel RCU}
> +\end{listing}
>
>  A preemptible RCU implementation was therefore added to the Linux kernel.
>  This implementation avoids the need to individually track the state of
> @@ -1194,7 +1194,7 @@ of the current grace period and
>  while in one of those pre-existing critical sections have removed
>  themselves from their lists.
>  A simplified version of this implementation is shown in
> -Figure~\ref{fig:rt:Preemptible Linux-Kernel RCU}.
> +Listing~\ref{lst:rt:Preemptible Linux-Kernel RCU}.
>  The \co{__rcu_read_lock()} function spans lines~1-5 and
>  the \co{__rcu_read_unlock()} function spans lines~7-22.
>
> @@ -1241,7 +1241,7 @@ count on line~20.
>  \QuickQuiz{}
>  	Suppose that preemption occurs just after the load from
>  	\co{t->rcu_read_unlock_special.s} on line~17 of
> -	Figure~\ref{fig:rt:Preemptible Linux-Kernel RCU}.
> +	Listing~\ref{lst:rt:Preemptible Linux-Kernel RCU}.
>  	Mightn't that result in the task failing to invoke
>  	\co{rcu_read_unlock_special()}, thus failing to remove itself
>  	from the list of tasks blocking the current grace period,
> @@ -1460,7 +1460,7 @@ real-time use.
>  OSADL runs long-term tests of systems, so referring to their
>  website (\url{http://osadl.org/}) can be helpful.
>
> -\begin{figure}[tb]
> +\begin{listing}[tb]
>  { \scriptsize
>  \begin{verbbox}
>    1 cd /sys/kernel/debug/tracing
> @@ -1473,8 +1473,8 @@ website (\url{http://osadl.org/}) can be helpful.
>  \centering
>  \theverbbox
>  \caption{Locating Sources of OS Jitter}
> -\label{fig:rt:Locating Sources of OS Jitter}
> -\end{figure}
> +\label{lst:rt:Locating Sources of OS Jitter}
> +\end{listing}
>
>  Unfortunately, this list of OS-jitter sources can never be complete,
>  as it will change with each new version of the kernel.
> @@ -1482,7 +1482,7 @@ This makes it necessary to be able to track down additional sources
>  of OS jitter.
>  Given a CPU $N$ running a CPU-bound usermode thread, the
>  commands shown in
> -Figure~\ref{fig:rt:Locating Sources of OS Jitter}
> +Listing~\ref{lst:rt:Locating Sources of OS Jitter}
>  will produce a list of all the times that this CPU entered the kernel.
>  Of course, the \co{N} on line~5 must be replaced with the
>  number of the CPU in question, and the \co{1} on line~2 may be increased
> @@ -1704,7 +1704,7 @@ or about 9,000 degrees of rotation per second, which translates to
>  We therefore need to schedule the fuel injection to within a time
>  interval of about 100 microseconds.
>
> -\begin{figure}[tb]
> +\begin{listing}[tb]
>  { \scriptsize
>  \begin{verbbox}
>    1 if (clock_gettime(CLOCK_REALTIME, &timestart) != 0) {
> @@ -1724,15 +1724,15 @@ interval of about 100 microseconds.
>  \centering
>  \theverbbox
>  \caption{Timed-Wait Test Program}
> -\label{fig:rt:Timed-Wait Test Program}
> -\end{figure}
> +\label{lst:rt:Timed-Wait Test Program}
> +\end{listing}
>
>  Suppose that a timed wait was to be used to initiate fuel injection,
>  although if you are building an engine, I hope you supply a rotation
>  sensor.
>  We need to test the timed-wait functionality, perhaps using the test program
>  shown in
> -Figure~\ref{fig:rt:Timed-Wait Test Program}.
> +Listing~\ref{lst:rt:Timed-Wait Test Program}.
>  Unfortunately, if we run this program, we can get unacceptable timer
>  jitter, even in a -rt kernel.
>
> @@ -1789,7 +1789,7 @@ in fields imaginatively named \co{a}, \co{b}, and \co{c}.
>  Otherwise, \co{cur_cal} points to a dynamically allocated
>  structure providing the current calibration values.
>
> -\begin{figure}[tb]
> +\begin{listing}[tb]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct calibration {
> @@ -1832,10 +1832,10 @@ structure providing the current calibration values.
>  \centering
>  \theverbbox
>  \caption{Real-Time Calibration Using RCU}
> -\label{fig:rt:Real-Time Calibration Using RCU}
> -\end{figure}
> +\label{lst:rt:Real-Time Calibration Using RCU}
> +\end{listing}
>
> -Figure~\ref{fig:rt:Real-Time Calibration Using RCU}
> +Listing~\ref{lst:rt:Real-Time Calibration Using RCU}
>  shows how RCU can be used to solve this problem.
>  Lookups are deterministic, as shown in \co{calc_control()}
>  on lines~9-15, consistent with real-time requirements.
> diff --git a/together/applyrcu.tex b/together/applyrcu.tex
> index d477ab5..5e7efff 100644
> --- a/together/applyrcu.tex
> +++ b/together/applyrcu.tex
> @@ -108,7 +108,7 @@ held constant, ensuring that \co{read_count()} sees consistent data.
>
>  \subsubsection{Implementation}
>
> -\begin{figure}[bp]
> +\begin{listing}[bp]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct countarray {
> @@ -186,11 +186,11 @@ held constant, ensuring that \co{read_count()} sees consistent data.
>  \centering
>  \theverbbox
>  \caption{RCU and Per-Thread Statistical Counters}
> -\label{fig:together:RCU and Per-Thread Statistical Counters}
> -\end{figure}
> +\label{lst:together:RCU and Per-Thread Statistical Counters}
> +\end{listing}
>
>  Lines~1-4 of
> -Figure~\ref{fig:together:RCU and Per-Thread Statistical Counters}
> +Listing~\ref{lst:together:RCU and Per-Thread Statistical Counters}
>  show the \co{countarray} structure, which contains a
>  \co{->total} field for the count from previously exited threads,
>  and a \co{counterp[]} array of pointers to the per-thread
> @@ -235,7 +235,7 @@ Line~43 picks up the current thread's index, line~45 acquires
>  \QuickQuiz{}
>  	Hey!!!
>  	Line~46 of
> -	Figure~\ref{fig:together:RCU and Per-Thread Statistical Counters}
> +	Listing~\ref{lst:together:RCU and Per-Thread Statistical Counters}
>  	modifies a value in a pre-existing \co{countarray} structure!
>  	Didn't you say that this structure, once made available to
>  	\co{read_count()}, remained constant???
> @@ -278,7 +278,7 @@ Line~69 can then safely free the old \co{countarray} structure.
>
>  \QuickQuiz{}
>  	Wow!
> -	Figure~\ref{fig:together:RCU and Per-Thread Statistical Counters}
> +	Listing~\ref{lst:together:RCU and Per-Thread Statistical Counters}
>  	contains 69 lines of code, compared to only 42 in
>  	Listing~\ref{lst:count:Per-Thread Statistical Counters}.
>  	Is this extra complexity really worth it?
> @@ -303,7 +303,7 @@ Line~69 can then safely free the old \co{countarray} structure.
>  	Listing~\ref{lst:count:Per-Thread Statistical Counters}, but
>  	with all the scalability and performance benefits of the
>  	implementation shown in
> -	Figure~\ref{fig:together:RCU and Per-Thread Statistical Counters}!
> +	Listing~\ref{lst:together:RCU and Per-Thread Statistical Counters}!
>  } \QuickQuizEnd
>
>  Use of RCU enables exiting threads to wait until other threads are
> @@ -389,7 +389,7 @@ tradeoff.
>  \subsection{Array and Length}
>  \label{sec:together:Array and Length}
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct foo {
> @@ -401,11 +401,11 @@ tradeoff.
>  \centering
>  \theverbbox
>  \caption{RCU-Protected Variable-Length Array}
> -\label{fig:together:RCU-Protected Variable-Length Array}
> -\end{figure}
> +\label{lst:together:RCU-Protected Variable-Length Array}
> +\end{listing}
>
>  Suppose we have an RCU-protected variable-length array, as shown in
> -Figure~\ref{fig:together:RCU-Protected Variable-Length Array}.
> +Listing~\ref{lst:together:RCU-Protected Variable-Length Array}.
>  The length of the array \co{->a[]} can change dynamically, and at any
>  given time, its length is given by the field \co{->length}.
>  Of course, this introduces the following race condition:
> @@ -429,7 +429,7 @@ covered in Chapter~\ref{chp:memorder:Memory Ordering}.
>  This works, but incurs read-side overhead and, perhaps worse, requires
>  use of explicit memory barriers.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct foo_a {
> @@ -445,12 +445,12 @@ use of explicit memory barriers.
>  \centering
>  \theverbbox
>  \caption{Improved RCU-Protected Variable-Length Array}
> -\label{fig:together:Improved RCU-Protected Variable-Length Array}
> -\end{figure}
> +\label{lst:together:Improved RCU-Protected Variable-Length Array}
> +\end{listing}
>
>  A better approach is to put the value and the array into the same structure,
>  as shown in
> -Figure~\ref{fig:together:Improved RCU-Protected Variable-Length Array}.
> +Listing~\ref{lst:together:Improved RCU-Protected Variable-Length Array}.
>  Allocating a new array (\co{foo_a} structure) then automatically provides
>  a new place for the array length.
>  This means that if any CPU picks up a reference to \co{->fa}, it is
> @@ -480,7 +480,7 @@ A more general version of this approach is presented in the next section.
>  \subsection{Correlated Fields}
>  \label{sec:together:Correlated Fields}
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct animal {
> @@ -496,12 +496,12 @@ A more general version of this approach is presented in the next section.
>  \centering
>  \theverbbox
>  \caption{Uncorrelated Measurement Fields}
> -\label{fig:together:Uncorrelated Measurement Fields}
> -\end{figure}
> +\label{lst:together:Uncorrelated Measurement Fields}
> +\end{listing}
>
>  Suppose that each of Sch\"odinger's animals is represented by the
>  data element shown in
> -Figure~\ref{fig:together:Uncorrelated Measurement Fields}.
> +Listing~\ref{lst:together:Uncorrelated Measurement Fields}.
>  The \co{meas_1}, \co{meas_2}, and \co{meas_3} fields are a set
>  of correlated measurements that are updated periodically.
>  It is critically important that readers see these three values from
> @@ -521,7 +521,7 @@ measurement values, but it requires copying a large structure due
>  to the \co{->photo[]} field.
>  This copying might incur unacceptably large overhead.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct measurement {
> @@ -541,11 +541,11 @@ This copying might incur unacceptably large overhead.
>  \centering
>  \theverbbox
>  \caption{Correlated Measurement Fields}
> -\label{fig:together:Correlated Measurement Fields}
> -\end{figure}
> +\label{lst:together:Correlated Measurement Fields}
> +\end{listing}
>
>  Another approach is to insert a level of indirection, as shown in
> -Figure~\ref{fig:together:Correlated Measurement Fields}.
> +Listing~\ref{lst:together:Correlated Measurement Fields}.
>  When a new measurement is taken, a new \co{measurement} structure
>  is allocated, filled in with the measurements, and the \co{animal}
>  structure's \co{->mp} field is updated to point to this new
> @@ -555,13 +555,13 @@ can be freed.
>
>  \QuickQuiz{}
>  	But cant't the approach shown in
> -	Figure~\ref{fig:together:Correlated Measurement Fields}
> +	Listing~\ref{lst:together:Correlated Measurement Fields}
>  	result in extra cache misses, in turn resulting in additional
>  	read-side overhead?
>  \QuickQuizAnswer{
>  	Indeed it can.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct measurement {
> @@ -582,11 +582,11 @@ can be freed.
>  \centering
>  \theverbbox
>  \caption{Localized Correlated Measurement Fields}
> -\label{fig:together:Localized Correlated Measurement Fields}
> -\end{figure}
> +\label{lst:together:Localized Correlated Measurement Fields}
> +\end{listing}
>
>  	One way to avoid this cache-miss overhead is shown in
> -	Figure~\ref{fig:together:Localized Correlated Measurement Fields}:
> +	Listing~\ref{lst:together:Localized Correlated Measurement Fields}:
>  	Simply embed an instance of a \co{measurement} structure
>  	named \co{meas}
>  	into the \co{animal} structure, and point the \co{->mp}
> diff --git a/together/refcnt.tex b/together/refcnt.tex
> index 626c375..8a7fa1d 100644
> --- a/together/refcnt.tex
> +++ b/together/refcnt.tex
> @@ -149,12 +149,12 @@ of compiler optimizations.
>  This is the method of choice when the lock is required to protect
>  other operations in addition to the reference count, but where
>  a reference to the object must be held after the lock is released.
> -Figure~\ref{fig:together:Simple Reference-Count API} shows a simple
> +Listing~\ref{lst:together:Simple Reference-Count API} shows a simple
>  API that might be used to implement simple non-atomic reference
>  counting---although simple reference counting is almost always
>  open-coded instead.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct sref {
> @@ -188,8 +188,8 @@ open-coded instead.
>  \centering
>  \theverbbox
>  \caption{Simple Reference-Count API}
> -\label{fig:together:Simple Reference-Count API}
> -\end{figure}
> +\label{lst:together:Simple Reference-Count API}
> +\end{listing}
>
>  \subsubsection{Atomic Counting}
>  \label{sec:together:Atomic Counting}
> @@ -203,7 +203,7 @@ Any CPU that hands the object off must first acquire a new reference
>  on behalf of the recipient object.
>  In the Linux kernel, the \co{kref} primitives are used to implement
>  this style of reference counting, as shown in
> -Figure~\ref{fig:together:Linux Kernel kref API}.
> +Listing~\ref{lst:together:Linux Kernel kref API}.
>
>  Atomic counting is required
>  because locking is not used to protect all reference-count operations,
> @@ -243,10 +243,10 @@ RCU read-side critical sections.
>  	there cannot possibly be any CPU that is permitted
>  	to acquire a new reference.
>  	This same fact allows the non-atomic check in line~22
> -	of Figure~\ref{fig:together:Linux Kernel kref API}.
> +	of Listing~\ref{lst:together:Linux Kernel kref API}.
>  } \QuickQuizEnd
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 struct kref {
> @@ -282,12 +282,12 @@ RCU read-side critical sections.
>  \centering
>  \theverbbox
>  \caption{Linux Kernel \tco{kref} API}
> -\label{fig:together:Linux Kernel kref API}
> -\end{figure}
> +\label{lst:together:Linux Kernel kref API}
> +\end{listing}
>
>  The \co{kref} structure itself, consisting of a single atomic
>  data item, is shown in lines~1-3 of
> -Figure~\ref{fig:together:Linux Kernel kref API}.
> +Listing~\ref{lst:together:Linux Kernel kref API}.
>  The \co{kref_init()} function on lines~5-8 initializes the counter
>  to the value ``1''.
>  Note that the \co{atomic_set()} primitive is a simple
> @@ -314,7 +314,7 @@ Otherwise, \co{kref_sub()} returns zero, informing the caller that
>  \QuickQuiz{}
>  	Suppose that just after the \co{atomic_sub_and_test()}
>  	on line~22 of
> -	Figure~\ref{fig:together:Linux Kernel kref API} is invoked,
> +	Listing~\ref{lst:together:Linux Kernel kref API} is invoked,
>  	that some other CPU invokes \co{kref_get()}.
>  	Doesn't this result in that other CPU now having an illegal
>  	reference to a released object?
> @@ -357,9 +357,9 @@ layer to track the destination caches that are used in packet routing.
>  The actual implementation is quite a bit more involved; this section
>  focuses on the aspects of \co{struct dst_entry} reference-count
>  handling that matches this use case,
> -shown in Figure~\ref{fig:together:Linux Kernel dst-clone API}.
> +shown in Listing~\ref{lst:together:Linux Kernel dst-clone API}.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \scriptsize
>  \begin{verbbox}
>    1 static inline
> @@ -384,8 +384,8 @@ shown in Figure~\ref{fig:together:Linux Kernel dst-clone API}.
>  \centering
>  \theverbbox
>  \caption{Linux Kernel \tco{dst_clone} API}
> -\label{fig:together:Linux Kernel dst-clone API}
> -\end{figure}
> +\label{lst:together:Linux Kernel dst-clone API}
> +\end{listing}
>
>  The \co{dst_clone()} primitive may be used if the caller
>  already has a reference to the specified \co{dst_entry},
> @@ -443,9 +443,9 @@ as shown below.
>  The Linux kernel's \co{fget()} and \co{fput()} primitives
>  use this style of reference counting.
>  Simplified versions of these functions are shown in
> -Figure~\ref{fig:together:Linux Kernel fget/fput API}.
> +Listing~\ref{lst:together:Linux Kernel fget/fput API}.
>
> -\begin{figure}[tbp]
> +\begin{listing}[tbp]
>  { \fontsize{6.5pt}{7.5pt}\selectfont
>  \begin{verbbox}
>    1 struct file *fget(unsigned int fd)
> @@ -494,8 +494,8 @@ Figure~\ref{fig:together:Linux Kernel fget/fput API}.
>  \centering
>  \theverbbox
>  \caption{Linux Kernel \tco{fget}/\tco{fput} API}
> -\label{fig:together:Linux Kernel fget/fput API}
> -\end{figure}
> +\label{lst:together:Linux Kernel fget/fput API}
> +\end{listing}
>
>  Line~4 of \co{fget()} fetches the pointer to the current
>  process's file-descriptor table, which might well be shared
> -- 
> 2.7.4
> 
--
To unsubscribe from this list: send the line "unsubscribe perfbook" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
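For reference, the 'listing' environment that this patch switches to is not
built into LaTeX; it has to be declared as a new float type in the book's
preamble. A minimal sketch of such a declaration, here using the standard
float package (the package choice, placement options, and auxiliary-file
extension are illustrative assumptions, not necessarily how perfbook.tex
actually defines it):

  % Illustrative preamble sketch only; perfbook's preamble may differ.
  \usepackage{float}
  \newfloat{listing}{tbp}{lol}  % new float type, default placement, aux extension
  \floatname{listing}{Listing}  % captions then read "Listing N: ..."

With a declaration like this in place, \caption, \label, and \ref behave for
listing floats exactly as they do for figures, which is why the patch only
needs to rename the environments and switch the label prefixes from fig: to
lst:; cross-references such as Listing~\ref{lst:count:NBS Enqueue Algorithm}
resolve automatically, and \listof{listing}{List of Listings} can generate a
front-matter list if desired.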