Hi Paul,

I have a few comments on chapter 5. If you agree with any of them, I can try to provide a patch.

5.2.2: Not really related to counters. Is it really possible for the compiler to use any location for temporary storage, even global variables? That seems a bit excessive on the compiler's part. I have definitely seen GCC reuse stack storage, but even then only when it thought that the previous value there was out of scope (erroneously in my case, as the function behaved like setjmp()).

5.2.3: Perhaps the code should be updated to use ISO C instead of GCC? _Thread_local and inline are part of the language.

Listing 5.5: There is a mix of thread_id_t from CodeSamples and pthread_create() from POSIX. One of those should be changed.

5.2.4: The wording suggests that the counting threads are not impacted by the reader. But doesn't a cache line changing from Modified to Shared incur a cost on the counter when it next comes to update the value?

5.3.2: "references only per-thread variables, and should not incur any cache misses" - except that the thread can migrate to other cores and can thus incur cache misses.

5.3.3: I think that some clarification, or a simple example, is due for explaining how a failure can occur when the count is nowhere near the global max.

5.4.2: "it is worthwhile looking for algorithms with better read-side performance" - should it not be "write-side performance"?

--Elad
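
P.S. To illustrate the 5.2.3 point, here is a minimal sketch of a per-thread counter written with the ISO C keywords rather than the GCC extensions. It assumes a C11 compiler. The names counter, inc_count() and read_count_self() are illustrative only and are not meant to match CodeSamples, and the sketch leaves out the cross-thread summation that a real read side would need.

```c
/* Assumption: C11 or later. _Thread_local is the ISO C spelling of what
 * GCC also accepts as __thread; inline has been standard since C99. */

/* One instance of this variable exists per thread, zero-initialized. */
static _Thread_local unsigned long counter;

/* Update side: touches only the calling thread's own instance.
 * "static inline" keeps the definition local to this translation unit. */
static inline void inc_count(void)
{
	counter++;
}

/* Illustrative helper: returns only the calling thread's contribution,
 * not the sum over all threads. */
static inline unsigned long read_count_self(void)
{
	return counter;
}
```

Calling inc_count() twice from a fresh thread and then read_count_self() from that same thread should return 2, since each thread starts with its own zeroed copy.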