Re: [PATCH] pi_stress: Add memory barrier to resolve crash

John Ogness <john.ogness@xxxxxxxxxxxxx> · Thu, 24 Oct 2024 23:26:17 +0206

Hi John,

On 2024-10-24, "John B. Wyatt IV" <jwyatt@xxxxxxxxxx> wrote:
> +		/**
> +		 * The pthread_barrier_wait should guarantee that only one
> +		 * thread at a time interacts with the variables below that
> +		 * if block.

pthread_barrier_wait() does exactly the opposite. It guarantees that
multiple threads continue at the same time. Taking a look at some users
of @finish_barrier, we see this means that low_priority() will read
p->total while high_priority() is performing the non-atomic operation
p->total++.
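
To illustrate the release behavior, here is a minimal standalone sketch. It
is not taken from pi_stress; demo_barrier, barrier_worker() and
run_barrier_demo() are hypothetical names. All N_THREADS threads get past the
barrier together, and exactly one of them receives
PTHREAD_BARRIER_SERIAL_THREAD, so nothing after the wait is serialized.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define N_THREADS 4

static pthread_barrier_t demo_barrier;	/* hypothetical, not @finish_barrier */
static atomic_int released;		/* threads that got past the barrier */
static atomic_int serial;		/* threads that saw the SERIAL return value */

static void *barrier_worker(void *arg)
{
	(void)arg;

	/* Blocks until N_THREADS threads have arrived, then all continue. */
	int ret = pthread_barrier_wait(&demo_barrier);

	if (ret == PTHREAD_BARRIER_SERIAL_THREAD)
		atomic_fetch_add(&serial, 1);
	atomic_fetch_add(&released, 1);
	return NULL;
}

/* Returns how many threads were released by the barrier. */
int run_barrier_demo(void)
{
	pthread_t tid[N_THREADS];
	int ret;

	atomic_store(&released, 0);
	atomic_store(&serial, 0);

	ret = pthread_barrier_init(&demo_barrier, NULL, N_THREADS);
	assert(ret == 0);

	for (int i = 0; i < N_THREADS; i++) {
		ret = pthread_create(&tid[i], NULL, barrier_worker, NULL);
		assert(ret == 0);
	}
	for (int i = 0; i < N_THREADS; i++)
		pthread_join(tid[i], NULL);

	pthread_barrier_destroy(&demo_barrier);

	/* Exactly one waiter is designated the serial thread. */
	assert(atomic_load(&serial) == 1);
	return atomic_load(&released);
}
```

Any shared variable touched after the wait therefore still needs its own
protection.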

> +		 *
> +		 * GCC -O2 rearranges the two increments above the wait
> +		 * function calls causing a race issue if you run this
> +		 * near full cores with one core (2 threads) free for
> +		 * housekeeping. This causes a crash at around 2 hour of
> +		 * running. You can prove this by commenting out the barrier
> +		 * and compiling with `-O0`. The crash does not show with
> +		 * -O0.

Turning off optimization is not a proof. Show us the assembly code.

> +		 *
> +		 * Add a memory barrier to force GCC to increment the variables
> +		 * below the pthread calls. This funcion depends on C11.
> +		 **/

What you are describing is a compiler barrier, which forces the compiler
to preserve instruction ordering. Memory barriers are for guaranteeing
memory ordering between CPUs at _runtime_.
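
As a rough sketch of the distinction in C11 terms (an illustration, not code
from the patch; run_fence_kinds_demo() and the variables are hypothetical):
atomic_signal_fence() typically only constrains the compiler, while
atomic_thread_fence() may additionally emit a CPU fence instruction.

```c
#include <assert.h>
#include <stdatomic.h>

static int a, b;	/* hypothetical variables for illustration */

int run_fence_kinds_demo(void)
{
	a = 1;

	/*
	 * Compiler-level fence: the compiler may not move memory accesses
	 * across this point, but typically no CPU fence instruction is
	 * emitted.
	 */
	atomic_signal_fence(memory_order_seq_cst);

	b = 2;

	/*
	 * Memory fence: also a compiler fence, and on most architectures it
	 * emits a hardware fence instruction that orders this CPU's memory
	 * accesses at runtime.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	return a + b;
}
```

Neither call by itself makes a concurrent p->total++ safe. They only order
accesses around the point where they appear.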

> +		atomic_thread_fence(memory_order_seq_cst);

A single memory barrier makes no sense. Memory barriers must be paired
because you are ordering memory access between multiple CPUs.
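
A minimal sketch of what pairing means, assuming C11 atomics and pthreads
(payload, ready, producer() and consumer() are hypothetical names, not from
pi_stress): the release fence in the writer only has an effect because the
reader has a matching acquire fence.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;		/* plain, non-atomic data */
static atomic_int ready;	/* flag used to publish payload */

static void *producer(void *arg)
{
	(void)arg;
	payload = 42;
	/* Order the payload store before the flag store ... */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&ready, 1, memory_order_relaxed);
	return NULL;
}

static void *consumer(void *arg)
{
	int *out = arg;

	while (!atomic_load_explicit(&ready, memory_order_relaxed))
		;	/* spin until the flag is published */
	/* ... and order the flag load before the payload load. */
	atomic_thread_fence(memory_order_acquire);
	*out = payload;
	return NULL;
}

/* Returns the payload value observed by the consumer. */
int run_fence_pair_demo(void)
{
	pthread_t prod, cons;
	int seen = 0;
	int ret;

	payload = 0;
	atomic_store(&ready, 0);

	ret = pthread_create(&cons, NULL, consumer, &seen);
	assert(ret == 0);
	ret = pthread_create(&prod, NULL, producer, NULL);
	assert(ret == 0);

	pthread_join(prod, NULL);
	pthread_join(cons, NULL);
	return seen;
}
```

Drop either fence and the consumer is no longer guaranteed to see the
payload, which is why a lone fence in one thread orders nothing.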

> +
>  		/* update the group stats */
>  		p->total++;

If you want multiple tasks to be able to modify and read these
variables, then either use atomic operations or use locking. That is
what such mechanisms are for.
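
For example, the atomic variant could look roughly like the sketch below. It
assumes C11 <stdatomic.h>; struct group_stats, stats_worker() and
run_atomic_demo() are hypothetical stand-ins and do not reflect the actual
pi_stress structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define N_THREADS	4
#define N_INCREMENTS	100000

/* Hypothetical stand-in for the group stats. */
struct group_stats {
	atomic_ulong total;
};

static void *stats_worker(void *arg)
{
	struct group_stats *p = arg;

	for (int i = 0; i < N_INCREMENTS; i++)
		atomic_fetch_add(&p->total, 1);	/* atomic read-modify-write */
	return NULL;
}

/* Returns the final count after all workers have finished. */
unsigned long run_atomic_demo(void)
{
	struct group_stats stats;
	pthread_t tid[N_THREADS];

	atomic_init(&stats.total, 0);

	for (int i = 0; i < N_THREADS; i++) {
		int ret = pthread_create(&tid[i], NULL, stats_worker, &stats);
		assert(ret == 0);
	}
	for (int i = 0; i < N_THREADS; i++)
		pthread_join(tid[i], NULL);

	/* Readers use an atomic load instead of a plain read. */
	return atomic_load(&stats.total);
}
```

With a plain p->total++ the same test could lose increments. A mutex around
both the update and the read would be the locking equivalent.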

John Ogness