Hi John,

On 2024-10-24, "John B. Wyatt IV" <jwyatt@xxxxxxxxxx> wrote:
> +	/**
> +	 * The pthread_barrier_wait should guarantee that only one
> +	 * thread at a time interacts with the variables below that
> +	 * if block.

The pthread_barrier_wait() does exactly the opposite. It guarantees that
multiple threads continue at the same time. Taking a look at some users
of @finish_barrier, we see this means that low_priority() will read
p->total while high_priority() is performing the non-atomic operation
p->total++.

> +	 *
> +	 * GCC -O2 rearranges the two increments above the wait
> +	 * function calls causing a race issue if you run this
> +	 * near full cores with one core (2 threads) free for
> +	 * housekeeping. This causes a crash at around 2 hour of
> +	 * running. You can prove this by commenting out the barrier
> +	 * and compiling with `-O0`. The crash does not show with
> +	 * -O0.

Turning off optimization is not a proof. Show us the assembly code.

> +	 *
> +	 * Add a memory barrier to force GCC to increment the variables
> +	 * below the pthread calls. This funcion depends on C11.
> +	 **/

What you are talking about are compiler barriers, which force the
compiler to obey instruction ordering. Memory barriers guarantee memory
ordering between CPUs at _runtime_.

> +	atomic_thread_fence(memory_order_seq_cst);

A single memory barrier makes no sense. Memory barriers must be paired
because you are ordering memory accesses between multiple CPUs.

> +
>  	/* update the group stats */
>  	p->total++;

If you want multiple tasks to be able to modify and read these
variables, then either use atomic operations or use locking. That is
what such mechanisms are for.

John Ogness