Hi,

On Date: Tue, 27 Dec 2022 10:29:20 -0800, Paul E. McKenney wrote:
> On Tue, Dec 27, 2022 at 08:06:19AM -0800, SeongJae Park wrote:
>> Add missing unbreakable spaces for 'CPUs' and 'elements'.
>>
>> Signed-off-by: SeongJae Park <sj38.park@xxxxxxxxx>
>
> Works for me, thank you!
>
> I have queued this, and if Akira (who tests with a much wider variety
> of environments than I do) does not object, then I will push it out.
>
> Thanx, Paul
>
>> ---
>> Changes from v1
>> - Fix build error by removing unbreakable space from \cref{}

Reviewed-by: Akira Yokosawa <akiyks@xxxxxxxxx>

Thanks, Akira

>>
>> datastruct/datastruct.tex | 23 +++++++++++------------
>> 1 file changed, 11 insertions(+), 12 deletions(-)
>>
>> diff --git a/datastruct/datastruct.tex b/datastruct/datastruct.tex
>> index 99c92d9a..c095b846 100644
>> --- a/datastruct/datastruct.tex
>> +++ b/datastruct/datastruct.tex
>> @@ -664,7 +664,7 @@ shows the same data on a linear scale.
>> This drops the global-locking trace into the x-axis, but allows the
>> non-ideal performance of RCU and hazard pointers to be more readily
>> discerned.
>> -Both show a change in slope at 224 CPUs, and this is due to hardware
>> +Both show a change in slope at 224~CPUs, and this is due to hardware
>> multithreading.
>> At 32 and fewer CPUs, each thread has a core to itself.
>> In this regime, RCU does better than does hazard pointers because the
>> @@ -672,11 +672,11 @@ latter's read-side \IXpl{memory barrier} result in dead time within the core.
>> In short, RCU is better able to utilize a core from a single hardware
>> thread than is hazard pointers.
>>
>> -This situation changes above 224 CPUs.
>> +This situation changes above 224~CPUs.
>> Because RCU is using more than half of each core's resources from a
>> single hardware thread, RCU gains relatively little benefit from the
>> second hardware thread in each core.
>> -The slope of the hazard-pointers trace also decreases at 224 CPUs, but
>> +The slope of the hazard-pointers trace also decreases at 224~CPUs, but
>> less dramatically,
>> because the second hardware thread is able to fill in the time
>> that the first hardware thread is stalled due to \IXh{memory-barrier}{latency}.
>> @@ -776,7 +776,7 @@ to about half again faster than that of either QSBR or RCU\@.
>> Still unconvinced?
>> Then look at the log-log plot in
>> \cref{fig:datastruct:Read-Only RCU-Protected Hash-Table Performance For Schr\"odinger's Zoo at 448 CPUs; Varying Table Size},
>> - which shows performance for 448 CPUs as a function of the
>> + which shows performance for 448~CPUs as a function of the
>> hash-table size, that is, number of buckets and maximum number
>> of elements.
>> A hash-table of size 1,024 has 1,024~buckets and contains
>> @@ -785,14 +785,13 @@ to about half again faster than that of either QSBR or RCU\@.
>> Because this is a read-only benchmark, the actual occupancy is
>> always equal to the average occupancy.
>>
>> - This figure shows near-ideal performance below about 8,000
>> - elements, that is, when the hash table comprises less than
>> - 1\,MB of data.
>> + This figure shows near-ideal performance below about 8,000~elements,
>> + that is, when the hash table comprises less than 1\,MB of data.
>> This near-ideal performance is consistent with that for the
>> pre-BSD routing table shown in
>> \cref{fig:defer:Pre-BSD Routing Table Protected by RCU}
>> on \cpageref{fig:defer:Pre-BSD Routing Table Protected by RCU},
>> - even at 448 CPUs.
>> + even at 448~CPUs.
>> However, the performance drops significantly (this is a log-log
>> plot) at about 8,000~elements, which is where the 1,048,576-byte
>> L2 cache overflows.
>> @@ -835,7 +834,7 @@ data structure represented by the pre-BSD routing table.
>>
>> \QuickQuiz{
>> The memory system is a serious bottleneck on this big system.
>> - Why bother putting 448 CPUs on a system without giving them
>> + Why bother putting 448~CPUs on a system without giving them
>> enough memory bandwidth to do something useful???
>> }\QuickQuizAnswer{
>> It would indeed be a bad idea to use this large and expensive
>> @@ -905,10 +904,10 @@ concurrency control to begin with.
>> \Cref{fig:datastruct:Read-Side RCU-Protected Hash-Table Performance For Schroedinger's Zoo in the Presence of Updates}
>> therefore shows the effect of updates on readers.
>> At the extreme left-hand side of this graph, all but one of the CPUs
>> -are doing lookups, while to the right all 448 CPUs are doing updates.
>> +are doing lookups, while to the right all 448~CPUs are doing updates.
>> For all four implementations, the number of lookups per millisecond
>> decreases as the number of updating CPUs increases, of course reaching
>> -zero lookups per millisecond when all 448 CPUs are updating.
>> +zero lookups per millisecond when all 448~CPUs are updating.
>> Both hazard pointers and RCU do well compared to per-bucket locking
>> because their readers do not increase update-side lock contention.
>> RCU does well relative to hazard pointers as the number of updaters
>> @@ -931,7 +930,7 @@ showed the effect of increasing update rates on lookups,
>> \cref{fig:datastruct:Update-Side RCU-Protected Hash-Table Performance For Schroedinger's Zoo}
>> shows the effect of increasing update rates on the updates themselves.
>> Again, at the left-hand side of the figure all but one of the CPUs are
>> -doing lookups and at the right-hand side of the figure all 448 CPUs are
>> +doing lookups and at the right-hand side of the figure all 448~CPUs are
>> doing updates.
>> Hazard pointers and RCU start off with a significant advantage because,
>> unlike bucket locking, readers do not exclude updaters.
>> --
>> 2.17.1
>>