Also fix white-space indentation in Quick Quizzes. Signed-off-by: Akira Yokosawa <akiyks@xxxxxxxxx> --- defer/defer.tex | 10 ++--- defer/hazptr.tex | 108 +++++++++++++++++++++++----------------------- defer/refcnt.tex | 68 ++++++++++++++--------------- defer/seqlock.tex | 80 +++++++++++++++++----------------- 4 files changed, 132 insertions(+), 134 deletions(-) diff --git a/defer/defer.tex b/defer/defer.tex index 3fecc2ac..291e091a 100644 --- a/defer/defer.tex +++ b/defer/defer.tex @@ -52,11 +52,11 @@ a quadruple consisting of source and destination IP addresses and ports all the way down to a simple integer. The value looked up and returned will also be a simple integer, so that the data structure is as shown in -Figure~\ref{fig:defer:Pre-BSD Packet Routing List}, which +\cref{fig:defer:Pre-BSD Packet Routing List}, which directs packets with address~42 to interface~1, address~56 to interface~3, and address~17 to interface~7. This list will normally be searched frequently and updated rarely. -In Chapter~\ref{chp:Hardware and its Habits} +In \cref{chp:Hardware and its Habits} we learned that the best ways to evade inconvenient laws of physics, such as the finite speed of light and the atomic nature of matter, are to either partition the data or to rely on read-mostly sharing. @@ -76,12 +76,12 @@ routing. \label{lst:defer:Sequential Pre-BSD Routing Table} \end{listing} -Listing~\ref{lst:defer:Sequential Pre-BSD Routing Table} (\path{route_seq.c}) +\Cref{lst:defer:Sequential Pre-BSD Routing Table} (\path{route_seq.c}) shows a simple single-threaded implementation corresponding to -Figure~\ref{fig:defer:Pre-BSD Packet Routing List}. +\cref{fig:defer:Pre-BSD Packet Routing List}. \begin{fcvref}[ln:defer:route_seq:lookup_add_del:entry] \Clnrefrange{b}{e} define a \co{route_entry} structure and -line~\lnref{header} defines +\clnref{header} defines the \co{route_list} header. \end{fcvref} \begin{fcvref}[ln:defer:route_seq:lookup_add_del:lookup] diff --git a/defer/hazptr.tex b/defer/hazptr.tex index 7c7dd831..47acb0ac 100644 --- a/defer/hazptr.tex +++ b/defer/hazptr.tex @@ -33,13 +33,13 @@ out quite carefully in order to avoid destructive races with concurrent deletion. \begin{fcvref}[ln:defer:hazptr:record_clear] One implementation is shown in -Listing~\ref{lst:defer:Hazard-Pointer Recording and Clearing}, +\cref{lst:defer:Hazard-Pointer Recording and Clearing}, which shows \co{hp_try_record()} on \clnrefrange{htr:b}{htr:e}, \co{hp_record()} on \clnrefrange{hr:b}{hr:e}, and \co{hp_clear()} on \clnrefrange{hc:b}{hc:e} (\path{hazptr.h}). -The \co{hp_try_record()} macro on line~\lnref{htr:e} is simply a casting +The \co{hp_try_record()} macro on \clnref{htr:e} is simply a casting wrapper for the \co{_h_t_r_impl()} function, which attempts to store the pointer referenced by \co{p} into the hazard pointer referenced by \co{hp}. @@ -68,25 +68,25 @@ Finally, if it fails due to racing with an update, it returns a special This approach allows read-side code to be simpler and faster. }\QuickQuizEnd -Line~\lnref{htr:ro1} reads the pointer to the object to be protected. -If line~\lnref{htr:race1} finds that this pointer was either \co{NULL} or +\Clnref{htr:ro1} reads the pointer to the object to be protected. +If \clnref{htr:race1} finds that this pointer was either \co{NULL} or the special \co{HAZPTR_POISON} deleted-object token, it returns the pointer's value to inform the caller of the failure.
-Otherwise, line~\lnref{htr:store} stores the pointer into the specified -hazard pointer, and line~\lnref{htr:mb} forces full ordering of that -store with the reload of the original pointer on line~\lnref{htr:ro2}. -(See Chapter~\ref{chp:Advanced Synchronization: Memory Ordering} +Otherwise, \clnref{htr:store} stores the pointer into the specified +hazard pointer, and \clnref{htr:mb} forces full ordering of that +store with the reload of the original pointer on \clnref{htr:ro2}. +(See \cref{chp:Advanced Synchronization: Memory Ordering} for more information on memory ordering.) If the value of the original pointer has not changed, then the hazard pointer protects the pointed-to object, and in that case, -line~\lnref{htr:success} returns a pointer to that object, which also +\clnref{htr:success} returns a pointer to that object, which also indicates success to the caller. Otherwise, if the pointer changed between the two \co{READ_ONCE()} -invocations, line~\lnref{htr:race2} indicates failure. +invocations, \clnref{htr:race2} indicates failure. \QuickQuiz{ Why does \co{hp_try_record()} in - Listing~\ref{lst:defer:Hazard-Pointer Recording and Clearing} + \cref{lst:defer:Hazard-Pointer Recording and Clearing} take a double indirection to the data element? Why not \co{void *} instead of \co{void **}? }\QuickQuizAnswer{ @@ -127,13 +127,13 @@ Once a hazard-pointer-protected object has been removed from its linked data structure, so that it is now inaccessible to future hazard-pointer readers, it is passed to \co{hazptr_free_later()}, which is shown on \clnrefrange{b}{e} of -Listing~\ref{lst:defer:Hazard-Pointer Scanning and Freeing} +\cref{lst:defer:Hazard-Pointer Scanning and Freeing} (\path{hazptr.c}). -Lines~\lnref{enq:b} and~\lnref{enq:e} +\Clnref{enq:b,enq:e} enqueue the object on a per-thread list \co{rlist} -and line~\lnref{count} counts the object in \co{rcount}. -If line~\lnref{check} sees that a sufficiently large number of objects are now -queued, line~\lnref{scan} invokes \co{hazptr_scan()} to attempt to +and \clnref{count} counts the object in \co{rcount}. +If \clnref{check} sees that a sufficiently large number of objects are now +queued, \clnref{scan} invokes \co{hazptr_scan()} to attempt to free some of them. \end{fcvref} @@ -146,32 +146,32 @@ which allows a fixed-size array of hazard pointers to be used. Because any thread might need to scan the hazard pointers, each thread maintains its own array, which is referenced by the per-thread variable \co{gplist}. -If line~\lnref{check} determines that this thread has not yet allocated its +If \clnref{check} determines that this thread has not yet allocated its \co{gplist}, \clnrefrange{alloc:b}{alloc:e} carry out the allocation. -The memory barrier on line~\lnref{mb1} ensures that all threads see the +The memory barrier on \clnref{mb1} ensures that all threads see the removal of all objects by this thread before \clnrefrange{loop:b}{loop:e} scan all of the hazard pointers, accumulating non-NULL pointers into the \co{plist} array and counting them in \co{psize}. -The memory barrier on line~\lnref{mb2} ensures that the reads of +The memory barrier on \clnref{mb2} ensures that the reads of the hazard pointers happen before any objects are freed. -Line~\lnref{sort} then sorts this array to enable use of binary search below. +\Clnref{sort} then sorts this array to enable use of binary search below. 
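(For reference, the enqueue-then-maybe-scan behavior of \co{hazptr_free_later()} described above amounts to the following sketch. It assumes a \co{hazptr_head_t} with a \co{->next} field and the per-thread \co{rlist}/\co{rcount} just described; the threshold name \co{HAZPTR_BATCH} is illustrative rather than taken verbatim from \path{hazptr.c}.)

\begin{VerbatimU}
/* Sketch of hazptr_free_later(); not verbatim from hazptr.c. */
void hazptr_free_later(hazptr_head_t *n)
{
	n->next = rlist;            /* Enqueue object on this */
	rlist = n;                  /* thread's to-be-freed list. */
	rcount++;                   /* Count it. */
	if (rcount >= HAZPTR_BATCH) /* Enough queued? (Name assumed.) */
		hazptr_scan();      /* Try to free some of them. */
}
\end{VerbatimU}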
-Lines~\lnref{rem:b} and~\lnref{rem:e} +\Clnref{rem:b,rem:e} remove all elements from this thread's list of to-be-freed objects, placing them on the local \co{tmplist} -and line~\lnref{zero} zeroes the count. +and \clnref{zero} zeroes the count. Each pass through the loop spanning \clnrefrange{loop2:b}{loop2:e} processes each of the to-be-freed objects. -Lines~\lnref{rem1st:b} and~\lnref{rem1st:e} +\Clnref{rem1st:b,rem1st:e} remove the first object from \co{tmplist}, -and if lines~\lnref{chkhazp:b} and~\lnref{chkhazp:e} +and if \clnref{chkhazp:b,chkhazp:e} determine that there is a hazard pointer protecting this object, \clnrefrange{back:b}{back:e} place it back onto \co{rlist}. -Otherwise, line~\lnref{free} frees the object. +Otherwise, \clnref{free} frees the object. \end{fcvref} \begin{listing} @@ -181,37 +181,37 @@ Otherwise, line~\lnref{free} frees the object. \end{listing} The Pre-BSD routing example can use hazard pointers as shown in -Listing~\ref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup} +\cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup} for data structures and \co{route_lookup()}, and in -Listing~\ref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete} +\cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete} for \co{route_add()} and \co{route_del()} (\path{route_hazptr.c}). As with reference counting, the hazard-pointers implementation is quite similar to the sequential algorithm shown in -Listing~\ref{lst:defer:Sequential Pre-BSD Routing Table} +\cref{lst:defer:Sequential Pre-BSD Routing Table} on -page~\pageref{lst:defer:Sequential Pre-BSD Routing Table}, +\cpageref{lst:defer:Sequential Pre-BSD Routing Table}, so only differences will be discussed. \begin{fcvref}[ln:defer:route_hazptr:lookup] Starting with -Listing~\ref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup}, -line~\lnref{hh} shows the \co{->hh} field used to queue objects pending +\cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup}, +\clnref{hh} shows the \co{->hh} field used to queue objects pending hazard-pointer free, -line~\lnref{re_freed} shows the \co{->re_freed} field used to detect -use-after-free bugs, and line~\lnref{tryrecord} invokes +\clnref{re_freed} shows the \co{->re_freed} field used to detect +use-after-free bugs, and \clnref{tryrecord} invokes \co{hp_try_record()} to attempt to acquire a hazard pointer. -If the return value is \co{NULL}, line~\lnref{NULL} returns a not-found +If the return value is \co{NULL}, \clnref{NULL} returns a not-found indication to the caller. -If the call to \co{hp_try_record()} raced with deletion, line~\lnref{deleted} -branches back to line~\lnref{retry}'s \co{retry} to re-traverse the list +If the call to \co{hp_try_record()} raced with deletion, \clnref{deleted} +branches back to \clnref{retry}'s \co{retry} to re-traverse the list from the beginning. The \co{do}--\co{while} loop falls through when the desired element is -located, but if this element has already been freed, line~\lnref{abort} +located, but if this element has already been freed, \clnref{abort} terminates the program. Otherwise, the element's \co{->iface} field is returned to the caller. -Note that line~\lnref{tryrecord} invokes \co{hp_try_record()} rather +Note that \clnref{tryrecord} invokes \co{hp_try_record()} rather than the easier-to-use \co{hp_record()}, restarting the full search upon \co{hp_try_record()} failure. And such restarting is absolutely required for correctness. 
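To gather the restart-on-failure pattern into one place, here is a sketch of the hazard-pointer-protected lookup just described. It assumes the \co{hp_try_record()} and \co{HAZPTR_POISON} API from \path{hazptr.h} and the \co{route_list} header from the sequential listing; the per-thread hazard-pointer array \co{my_hazptr[]} is an illustrative name rather than a quotation from \path{route_hazptr.c}.

\begin{VerbatimU}
/* Sketch only: slot names are illustrative. */
unsigned long route_lookup(unsigned long addr)
{
	int offset = 0;
	struct route_entry *rep;
	struct route_entry **repp;

retry:
	repp = &route_list.re_next;
	do {
		/* Try to protect *repp with a hazard pointer. */
		rep = hp_try_record(repp, &my_hazptr[offset]);
		if (!rep)
			return ULONG_MAX; /* Not found. */
		if (rep == (struct route_entry *)HAZPTR_POISON)
			goto retry; /* Raced with deletion: restart. */
		offset = !offset; /* Keep prior element protected. */
		repp = &rep->re_next;
	} while (rep->addr != addr);
	if (READ_ONCE(rep->re_freed))
		abort(); /* Use-after-free check. */
	return rep->iface; /* Still protected by hazard pointer. */
}
\end{VerbatimU}

Note that any \co{hp_try_record()} failure sends control back to the head of the list, which, as noted above, is required for correctness.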
@@ -283,9 +283,9 @@ courtesy of the fact that the hazard pointers are stored local to each CPU or thread, which in turn allows traversals to be carried out without any writes to the data structures being traversed. Referring back to -Figure~\ref{fig:count:Optimization and the Four Parallel-Programming Tasks} +\cref{fig:count:Optimization and the Four Parallel-Programming Tasks} on -page~\pageref{fig:count:Optimization and the Four Parallel-Programming Tasks}, +\cpageref{fig:count:Optimization and the Four Parallel-Programming Tasks}, hazard pointers enable the CPU caches to do resource replication, which in turn allows weakening of the parallel-access-control mechanism, thus boosting performance and scalability. @@ -295,7 +295,7 @@ minimal memory footprint: Any object not currently referenced by some hazard pointer may be immediately freed. In contrast, -Section~\ref{sec:defer:Read-Copy Update (RCU)} +\cref{sec:defer:Read-Copy Update (RCU)} will discuss a mechanism that avoids read-side retries (and minimizes read-side overhead), but which can result in a much larger memory footprint. @@ -308,15 +308,15 @@ footprint. \begin{fcvref}[ln:defer:route_hazptr:add_del] The \co{route_add()} and \co{route_del()} functions are shown in -Listing~\ref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete}. -Line~\lnref{init_freed} initializes \co{->re_freed}, -line~\lnref{poison} poisons the \co{->re_next} field of the newly removed +\cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete}. +\Clnref{init_freed} initializes \co{->re_freed}, +\clnref{poison} poisons the \co{->re_next} field of the newly removed object, and -line~\lnref{free_later} passes that object to the +\clnref{free_later} passes that object to the \co{hazptr_free_later()} function, which will free that object once it is safe to do so. The spinlocks work the same as in -Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete}. +\cref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete}. \end{fcvref} \begin{figure} @@ -326,10 +326,10 @@ Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete}. \label{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers} \end{figure} -Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers} +\Cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers} shows the hazard-pointers-protected Pre-BSD routing algorithm's performance on the same read-only workload as for -Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting}. +\cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting}. Although hazard pointers scale far better than does reference counting, hazard pointers still require readers to do writes to shared memory (albeit with much improved locality of reference), @@ -341,12 +341,12 @@ reference-counting, hazard pointers not only operate correctly for workloads involving concurrent updates, but also exhibit excellent scalability. Additional performance comparisons with other mechanisms may be found in -Chapter~\ref{chp:Data Structures} +\cref{chp:Data Structures} and in other publications~\cite{ThomasEHart2007a,McKenney:2013:SDS:2483852.2483867,MagedMichael04a}. \QuickQuizSeries{% \QuickQuizB{ - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers} + \Cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers} shows no sign of hyperthread-induced flattening at 224 threads. Why is that? 
}\QuickQuizAnswerB{ @@ -365,21 +365,21 @@ and in other publications~\cite{ThomasEHart2007a,McKenney:2013:SDS:2483852.24838 Procrastination''~\cite{McKenney:2013:SDS:2483852.2483867} shows that hazard pointers have near-ideal performance. Whatever happened in - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}??? + \cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}??? }\QuickQuizAnswerE{ First, - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers} + \cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers} has a linear y-axis, while most of the graphs in the ``Structured Deferral'' paper have logscale y-axes. Next, that paper uses lightly-loaded hash tables, while - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}'s + \cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}'s uses a 10-element simple linked list, which means that hazard pointers face a larger memory-barrier penalty in this workload than in that of the ``Structured Deferral'' paper. Finally, that paper used an older modest-sized x86 system, while a much newer and larger system was used to generate the data shown in - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}. + \cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}. In addition, use of pairwise asymmetric barriers~\cite{Windows2008FlushProcessWriteBuffers,JonathanCorbet2010sys-membarrier,Linuxmanpage2018sys-membarrier} diff --git a/defer/refcnt.tex b/defer/refcnt.tex index 4f815266..76f49212 100644 --- a/defer/refcnt.tex +++ b/defer/refcnt.tex @@ -36,23 +36,23 @@ Reference counting is thus an excellent time-honored candidate for a concurrent implementation of Pre-BSD routing. To that end, -Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Lookup} +\cref{lst:defer:Reference-Counted Pre-BSD Routing Table Lookup} shows data structures and the \co{route_lookup()} function and -Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete} +\cref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete} shows the \co{route_add()} and \co{route_del()} functions (all at \path{route_refcnt.c}). Since these algorithms are quite similar to the sequential algorithm shown in -Listing~\ref{lst:defer:Sequential Pre-BSD Routing Table}, +\cref{lst:defer:Sequential Pre-BSD Routing Table}, only the differences will be discussed. \begin{fcvref}[ln:defer:route_refcnt:lookup:entry] Starting with -Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Lookup}, -line~\lnref{refcnt} adds the actual reference counter, -line~\lnref{freed} adds a \co{->re_freed} +\cref{lst:defer:Reference-Counted Pre-BSD Routing Table Lookup}, +\clnref{refcnt} adds the actual reference counter, +\clnref{freed} adds a \co{->re_freed} use-after-free check field, -line~\lnref{routelock} adds the \co{routelock} that will +\clnref{routelock} adds the \co{routelock} that will be used to synchronize concurrent updates, \end{fcvref} \begin{fcvref}[ln:defer:route_refcnt:lookup:re_free] @@ -65,8 +65,7 @@ In \co{route_lookup()} itself, \clnrefrange{relprev:b}{relprev:e} release the reference count of the prior element and free it if the count becomes zero, and \clnrefrange{acq:b}{acq:e} acquire a reference on the new element, -with lines~\lnref{check_uaf} -and~\lnref{abort} performing the use-after-free check. +with \clnref{check_uaf,abort} performing the use-after-free check. 
\end{fcvref} \QuickQuiz{ @@ -77,22 +76,21 @@ and~\lnref{abort} performing the use-after-free check. (\path{routetorture.h}) that allocates and frees only one type of structure can tolerate a surprisingly large amount of use-after-free misbehavior. - See Figure~\ref{fig:debugging:Number of Tests Required for 99 Percent Confidence Given Failure Rate} - on page~\pageref{fig:debugging:Number of Tests Required for 99 Percent Confidence Given Failure Rate} + See \cref{fig:debugging:Number of Tests Required for 99 Percent Confidence Given Failure Rate} + on \cpageref{fig:debugging:Number of Tests Required for 99 Percent Confidence Given Failure Rate} and the related discussion in - Section~\ref{sec:debugging:Hunting Heisenbugs} + \cref{sec:debugging:Hunting Heisenbugs} starting on - page~\pageref{sec:debugging:Hunting Heisenbugs} + \cpageref{sec:debugging:Hunting Heisenbugs} for more on the importance of increasing the probability of finding bugs. }\QuickQuizEnd \begin{fcvref}[ln:defer:route_refcnt:add_del] -In Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete}, -lines~\lnref{acq1}, \lnref{rel1}, \lnref{acq2}, \lnref{rel2}, -and~\lnref{rel3} introduce locking to synchronize +In \cref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete}, +\clnref{acq1,rel1,acq2,rel2,rel3} introduce locking to synchronize concurrent updates. -Line~\lnref{init:freed} initializes the \co{->re_freed} use-after-free-check field, +\Clnref{init:freed} initializes the \co{->re_freed} use-after-free-check field, and finally \clnrefrange{re_free:b}{re_free:e} invoke \co{re_free()} if the new value of the reference count is zero. @@ -100,7 +98,7 @@ the reference count is zero. \QuickQuiz{ Why doesn't \co{route_del()} in - Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete} + \cref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete} use reference counts to protect the traversal to the element to be freed? }\QuickQuizAnswer{ @@ -115,18 +113,18 @@ the reference count is zero. \label{fig:defer:Pre-BSD Routing Table Protected by Reference Counting} \end{figure} -Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting} +\Cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting} shows the performance and scalability of reference counting on a read-only workload with a ten-element list running on an eight-socket 28-core-per-socket hyperthreaded 2.1\,GHz x86 system with a total of 448 hardware threads (\path{hps.2019.12.02a/lscpu.hps}). The ``ideal'' trace was generated by running the sequential code shown in -Listing~\ref{lst:defer:Sequential Pre-BSD Routing Table}, +\cref{lst:defer:Sequential Pre-BSD Routing Table}, which works only because this is a read-only workload. The reference-counting performance is abysmal and its scalability even more so, with the ``refcnt'' trace indistinguishable from the x-axis. This should be no surprise in view of -Chapter~\ref{chp:Hardware and its Habits}: +\cref{chp:Hardware and its Habits}: The reference-count acquisitions and releases have added frequent shared-memory writes to an otherwise read-only workload, thus incurring severe retribution from the laws of physics. @@ -137,7 +135,7 @@ the atoms used in modern digital electronics. \QuickQuizSeries{% \QuickQuizB{ Why the break in the ``ideal'' line at 224 CPUs in - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting}? + \cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting}? Shouldn't it be a straight line? 
}\QuickQuizAnswerB{ The break is due to hyperthreading. @@ -173,7 +171,7 @@ the atoms used in modern digital electronics. % \QuickQuizE{ Shouldn't the refcnt trace in - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting} + \cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting} be at least a little bit off of the x-axis??? }\QuickQuizAnswerE{ Define ``a little bit.'' @@ -185,16 +183,16 @@ the atoms used in modern digital electronics. \label{fig:defer:Pre-BSD Routing Table Protected by Reference Counting; Log Scale} \end{figure} - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting; Log Scale} + \Cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting; Log Scale} shows the same data, but on a log-log plot. As you can see, the refcnt line drops below 5,000 at two CPUs. This means that the refcnt performance at two CPUs is more than one thousand times smaller than the first y-axis tick of $5 \times 10^6$ in - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting}. + \cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting}. Therefore, the depiction of the performance of reference counting shown in - Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting} + \cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting} is all too accurate. }\QuickQuizEndE } @@ -204,8 +202,8 @@ But it gets worse. Running multiple updater threads repeatedly invoking \co{route_add()} and \co{route_del()} will quickly encounter the \co{abort()} statement on -line~\ref{ln:defer:route_refcnt:lookup:lookup:abort} of -Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Lookup}, +\clnrefr{ln:defer:route_refcnt:lookup:lookup:abort} of +\cref{lst:defer:Reference-Counted Pre-BSD Routing Table Lookup}, which indicates a use-after-free bug. This in turn means that the reference counts are not only profoundly degrading scalability and performance, but also failing to provide @@ -213,27 +211,27 @@ the needed protection. One sequence of events leading to the use-after-free bug is as follows, given the list shown in -Figure~\ref{fig:defer:Pre-BSD Packet Routing List}: +\cref{fig:defer:Pre-BSD Packet Routing List}: \begin{fcvref}[ln:defer:route_refcnt:lookup] \begin{enumerate} \item Thread~A looks up address~42, reaching - line~\lnref{lookup:check_NULL} of + \clnref{lookup:check_NULL} of \co{route_lookup()} in - Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Lookup}. + \cref{lst:defer:Reference-Counted Pre-BSD Routing Table Lookup}. In other words, Thread~A has a pointer to the first element, but has not yet acquired a reference to it. \item Thread~B invokes \co{route_del()} in - Listing~\ref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete} + \cref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete} to delete the route entry for address~42. It completes successfully, and because this entry's \co{->re_refcnt} field was equal to the value one, it invokes \co{re_free()} to set the \co{->re_freed} field and to free the entry. \item Thread~A continues execution of \co{route_lookup()}. Its \co{rep} pointer is non-\co{NULL}, but - line~\lnref{lookup:check_uaf} sees that + \clnref{lookup:check_uaf} sees that its \co{->re_freed} field is non-zero, - so line~\lnref{lookup:abort} invokes + so \clnref{lookup:abort} invokes \co{abort()}. \end{enumerate} \end{fcvref} @@ -263,7 +261,7 @@ of reference counting! ``eliminated the usefulness'', now didn't it? 
Please see - Section~\ref{sec:together:Refurbish Reference Counting}, + \cref{sec:together:Refurbish Reference Counting}, which discusses some of the techniques that the Linux kernel uses to take advantage of reference counting in a highly concurrent environment. diff --git a/defer/seqlock.tex b/defer/seqlock.tex index 71f506a3..3eec7b2f 100644 --- a/defer/seqlock.tex +++ b/defer/seqlock.tex @@ -17,7 +17,7 @@ However, unlike reader-writer locking, readers do not exclude writers. Instead, like hazard pointers, sequence locks force readers to \emph{retry} an operation if they detect activity from a concurrent writer. As can be seen from -Figure~\ref{fig:defer:Reader And Uncooperative Sequence Lock}, +\cref{fig:defer:Reader And Uncooperative Sequence Lock}, it is important to design code using sequence locks so that readers very rarely need to retry. @@ -29,7 +29,7 @@ very rarely need to retry. \end{figure} \QuickQuiz{ - Why isn't this sequence-lock discussion in Chapter~\ref{chp:Locking}, + Why isn't this sequence-lock discussion in \cref{chp:Locking}, you know, the one on \emph{locking}? }\QuickQuizAnswer{ The sequence-lock mechanism is really a combination of two @@ -84,12 +84,12 @@ If either snapshot has an odd value, or if the two snapshots differ, there has been a concurrent update, and the reader must discard the results of the access and then retry it. Readers therefore use the \co{read_seqbegin()} and \co{read_seqretry()} -functions shown in Listing~\ref{lst:defer:Sequence-Locking Reader} +functions shown in \cref{lst:defer:Sequence-Locking Reader} when accessing data protected by a sequence lock. Writers must increment the value before and after each update, and only one writer is permitted at a given time. Writers therefore use the \co{write_seqlock()} and \co{write_sequnlock()} -functions shown in Listing~\ref{lst:defer:Sequence-Locking Writer} +functions shown in \cref{lst:defer:Sequence-Locking Writer} when updating data protected by a sequence lock. As a result, sequence-lock-protected data can have an arbitrarily @@ -105,7 +105,7 @@ It is also used in pathname traversal to detect concurrent rename operations. \end{listing} A simple implementation of sequence locks is shown in -Listing~\ref{lst:defer:Sequence-Locking Implementation} +\cref{lst:defer:Sequence-Locking Implementation} (\path{seqlock.h}). \begin{fcvref}[ln:defer:seqlock:impl:typedef] The \co{seqlock_t} data structure is shown on @@ -120,17 +120,17 @@ initializes a \co{seqlock_t}. \begin{fcvref}[ln:defer:seqlock:impl:read_seqbegin] \Clnrefrange{b}{e} show \co{read_seqbegin()}, which begins a sequence-lock \IXh{read-side}{critical section}. -Line~\lnref{fetch} takes a snapshot of the sequence counter, and -line~\lnref{mb} orders +\Clnref{fetch} takes a snapshot of the sequence counter, and +\clnref{mb} orders this snapshot operation before the caller's critical section. -Finally, line~\lnref{ret} returns the value of the snapshot (with the least-significant +Finally, \clnref{ret} returns the value of the snapshot (with the least-significant bit cleared), which the caller will pass to a later call to \co{read_seqretry()}. \end{fcvref} \QuickQuiz{ Why not have \co{read_seqbegin()} in - Listing~\ref{lst:defer:Sequence-Locking Implementation} + \cref{lst:defer:Sequence-Locking Implementation} check for the low-order bit being set, and retry internally, rather than allowing a doomed read to start? }\QuickQuizAnswer{ @@ -147,9 +147,9 @@ will pass to a later call to \co{read_seqretry()}. 
\Clnrefrange{b}{e} show \co{read_seqretry()}, which returns true if there was at least one writer since the time of the corresponding call to \co{read_seqbegin()}. -Line~\lnref{mb} orders the caller's prior critical section before line~\lnref{fetch}'s +\Clnref{mb} orders the caller's prior critical section before \clnref{fetch}'s fetch of the new snapshot of the sequence counter. -Line~\lnref{ret} checks whether the sequence counter has changed, +\Clnref{ret} checks whether the sequence counter has changed, in other words, whether there has been at least one writer, and returns true if so. \end{fcvref} @@ -157,8 +157,8 @@ true if so. \QuickQuizSeries{% \QuickQuizB{ Why is the \co{smp_mb()} on - line~\ref{ln:defer:seqlock:impl:read_seqretry:mb} of - Listing~\ref{lst:defer:Sequence-Locking Implementation} + \clnrefr{ln:defer:seqlock:impl:read_seqretry:mb} of + \cref{lst:defer:Sequence-Locking Implementation} needed? }\QuickQuizAnswerB{ If it was omitted, both the compiler and the CPU would be @@ -171,18 +171,18 @@ true if so. % \QuickQuizM{ Can't weaker memory barriers be used in the code in - Listing~\ref{lst:defer:Sequence-Locking Implementation}? + \cref{lst:defer:Sequence-Locking Implementation}? }\QuickQuizAnswerM{ In older versions of the Linux kernel, no. \begin{fcvref}[ln:defer:seqlock:impl] In very new versions of the Linux kernel, - line~\lnref{read_seqbegin:fetch} could use + \clnref{read_seqbegin:fetch} could use \co{smp_load_acquire()} instead of \co{READ_ONCE()}, which in turn would allow the \co{smp_mb()} on - line~\lnref{read_seqbegin:mb} to be dropped. - Similarly, line~\lnref{write_sequnlock:inc} could use an - \co{smp_store_release()}, for + \clnref{read_seqbegin:mb} to be dropped. + Similarly, \clnref{write_sequnlock:inc} could use an + \co{smp_store_release()}, for example, as follows: \begin{VerbatimU} @@ -190,7 +190,7 @@ smp_store_release(&slp->seq, READ_ONCE(slp->seq) + 1); \end{VerbatimU} This would allow the \co{smp_mb()} on - line~\lnref{write_sequnlock:mb} to be dropped. + \clnref{write_sequnlock:mb} to be dropped. \end{fcvref} }\QuickQuizEndM % @@ -214,7 +214,7 @@ that this increment is ordered before the caller's critical section. \begin{fcvref}[ln:defer:seqlock:impl:write_sequnlock] \Clnrefrange{b}{e} show \co{write_sequnlock()}, which executes a memory barrier to ensure that the caller's critical section is ordered before the -increment of the sequence number on line~\lnref{inc}, then releases the lock. +increment of the sequence number on \clnref{inc}, then releases the lock. \end{fcvref} \QuickQuizSeries{% @@ -228,8 +228,8 @@ increment of the sequence number on line~\lnref{inc}, then releases the lock. % \QuickQuizE{ Why isn't \co{seq} on - line~\ref{ln:defer:seqlock:impl:typedef:seq} of - Listing~\ref{lst:defer:Sequence-Locking Implementation} + \clnrefr{ln:defer:seqlock:impl:typedef:seq} of + \cref{lst:defer:Sequence-Locking Implementation} \co{unsigned} rather than \co{unsigned long}? After all, if \co{unsigned} is good enough for the Linux kernel, shouldn't it be good enough for everyone? @@ -240,7 +240,7 @@ increment of the sequence number on line~\lnref{inc}, then releases the lock. \begin{enumerate} \item Thread~0 executes \co{read_seqbegin()}, picking up \co{->seq} in - line~\ref{ln:defer:seqlock:impl:read_seqbegin:fetch}, + \clnrefr{ln:defer:seqlock:impl:read_seqbegin:fetch}, noting that the value is even, and thus returning to the caller. 
\item Thread~0 starts executing its read-side critical section, @@ -286,34 +286,34 @@ increment of the sequence number on line~\lnref{inc}, then releases the lock. So what happens when sequence locking is applied to the Pre-BSD routing table? -Listing~\ref{lst:defer:Sequence-Locked Pre-BSD Routing Table Lookup} +\Cref{lst:defer:Sequence-Locked Pre-BSD Routing Table Lookup} shows the data structures and \co{route_lookup()}, and -Listing~\ref{lst:defer:Sequence-Locked Pre-BSD Routing Table Add/Delete} +\cref{lst:defer:Sequence-Locked Pre-BSD Routing Table Add/Delete} shows \co{route_add()} and \co{route_del()} (\path{route_seqlock.c}). This implementation is once again similar to its counterparts in earlier sections, so only the differences will be highlighted. \begin{fcvref}[ln:defer:route_seqlock:lookup] In -Listing~\ref{lst:defer:Sequence-Locked Pre-BSD Routing Table Lookup}, -line~\lnref{struct:re_freed} adds \co{->re_freed}, which is checked on -lines~\lnref{lookup:chk_freed} and~\lnref{lookup:abort}. -Line~\lnref{struct:sl} adds a sequence lock, which is used by \co{route_lookup()} +\cref{lst:defer:Sequence-Locked Pre-BSD Routing Table Lookup}, +\clnref{struct:re_freed} adds \co{->re_freed}, which is checked on +\clnref{lookup:chk_freed,lookup:abort}. +\Clnref{struct:sl} adds a sequence lock, which is used by \co{route_lookup()} \end{fcvref} \begin{fcvref}[ln:defer:route_seqlock:lookup:lookup] -on lines~\lnref{r_sqbegin}, \lnref{r_sqretry1}, and~\lnref{r_sqretry2}, -with lines~\lnref{goto_retry1} and~\lnref{goto_retry2} branching back to -the \co{retry} label on line~\lnref{retry}. +on \clnref{r_sqbegin,r_sqretry1,r_sqretry2}, +with \clnref{goto_retry1,goto_retry2} branching back to +the \co{retry} label on \clnref{retry}. The effect is to retry any lookup that runs concurrently with an update. \end{fcvref} \begin{fcvref}[ln:defer:route_seqlock:add_del] In -Listing~\ref{lst:defer:Sequence-Locked Pre-BSD Routing Table Add/Delete}, -lines~\lnref{add:w_sqlock}, \lnref{add:w_squnlock}, \lnref{del:w_sqlock}, -\lnref{del:w_squnlock1}, and~\lnref{del:w_squnlock2} +\cref{lst:defer:Sequence-Locked Pre-BSD Routing Table Add/Delete}, +\clnref{add:w_sqlock,add:w_squnlock,del:w_sqlock,% +del:w_squnlock1,del:w_squnlock2} acquire and release the sequence lock, -while lines~\lnref{add:clr_freed} and~\lnref{del:set_freed} handle \co{->re_freed}. +while \clnref{add:clr_freed,del:set_freed} handle \co{->re_freed}. This implementation is therefore quite straightforward. \end{fcvref} @@ -325,7 +325,7 @@ This implementation is therefore quite straightforward. \end{figure} It also performs better on the read-only workload, as can be seen in -Figure~\ref{fig:defer:Pre-BSD Routing Table Protected by Sequence Locking}, +\cref{fig:defer:Pre-BSD Routing Table Protected by Sequence Locking}, though its performance is still far from ideal. Worse yet, it suffers use-after-free failures. The problem is that the reader might encounter a segmentation violation @@ -366,10 +366,10 @@ has a chance to warn of the concurrent update. One way to protect against this sort of problem requires use of ``type-safe memory'', which will be discussed in - Section~\ref{sec:defer:RCU Provides Type-Safe Memory}. + \cref{sec:defer:RCU Provides Type-Safe Memory}. Roughly similar solutions are possible using the hazard pointers discussed in - Section~\ref{sec:defer:Hazard Pointers}. + \cref{sec:defer:Hazard Pointers}. But in either case, you would be using some other synchronization mechanism in addition to sequence locks! 
}\QuickQuizEnd @@ -377,7 +377,7 @@ has a chance to warn of the concurrent update. Both the read-side and write-side critical sections of a sequence lock can be thought of as transactions, and sequence locking therefore can be thought of as a limited form of transactional memory, which -will be discussed in Section~\ref{sec:future:Transactional Memory}. +will be discussed in \cref{sec:future:Transactional Memory}. The limitations of sequence locking are: (1)~Sequence locking restricts updates and (2)~sequence locking does not permit traversal of pointers to objects that might be freed by updaters. -- 2.17.1
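As a closing illustration of the seqlock hunks above, the read-side pattern supported by \co{read_seqbegin()} and \co{read_seqretry()} is the usual retry loop, sketched here against the \path{seqlock.h} API discussed in the patch; the \co{seqlock_t} variable and the protected datum are placeholders, not names from the book's code.

\begin{VerbatimU}
/* Sketch of the seqlock read-side retry pattern. */
seqlock_t sl;                 /* Protects protected_data. */
unsigned long protected_data; /* Placeholder datum. */

unsigned long read_protected_data(void)
{
	unsigned long seq;
	unsigned long snap;

	do {
		seq = read_seqbegin(&sl);          /* Snapshot counter. */
		snap = READ_ONCE(protected_data);  /* Racy read, validated below. */
	} while (read_seqretry(&sl, seq));         /* Writer ran? Retry. */
	return snap;
}
\end{VerbatimU}

The loop exits only when the critical section runs untouched by writers, which is why designs based on sequence locking must make such retries rare.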