On 9/29/2024 at 12:26 AM, Alan Huang wrote:
On September 28, 2024 at 23:55, Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
On 2024-09-28 17:49, Alan Stern wrote:
On Sat, Sep 28, 2024 at 11:32:18AM -0400, Mathieu Desnoyers wrote:
On 2024-09-28 16:49, Alan Stern wrote:
On Sat, Sep 28, 2024 at 09:51:27AM -0400, Mathieu Desnoyers wrote:
equality, which does not preserve address dependencies and allows the
following misordering speculations:
- If @b is a constant, the compiler can issue the loads which depend
on @a before loading @a.
- If @b is a register populated by a prior load, weakly-ordered
CPUs can speculate loads which depend on @a before loading @a.
It shouldn't matter whether @a and @b are constants, registers, or
anything else. All that matters is that the compiler uses the wrong
one, which allows weakly ordered CPUs to speculate loads you wouldn't
expect them to, based on the source code alone.
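To make the "wrong pointer" point concrete, here is a minimal user-space sketch. The READ_ONCE() below is a simplified stand-in for the kernel macro, and shared_a, shared_b, as_written() and as_transformed() are hypothetical names invented for illustration. The second function is written by hand to show what a compiler is permitted to emit for the first one; it is not actual compiler output.

```c
/* Simplified stand-in for the kernel's READ_ONCE(); GCC/Clang only. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

static int value = 42;
static int *shared_a = &value;	/* hypothetical pointer published by a writer */
static int *shared_b = &value;	/* hypothetical pointer loaded earlier */

/* What the source says: the dereference depends on the load into a. */
static int as_written(void)
{
	int *a = READ_ONCE(shared_a);
	int *b = READ_ONCE(shared_b);

	if (a == b)
		return *a;	/* address dependency on the load of shared_a */
	return -1;
}

/*
 * What the compiler may emit instead: inside the branch a == b is known,
 * so it is free to dereference b. The result is identical in a single
 * thread, but the dependency now hangs off the load of shared_b.
 */
static int as_transformed(void)
{
	int *a = READ_ONCE(shared_a);
	int *b = READ_ONCE(shared_b);

	if (a == b)
		return *b;	/* no dependency on the load of shared_a */
	return -1;
}
```

Both functions return the same value here, which is exactly why the substitution is legal for the compiler and invisible in the source.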
I only partially agree here.
On weakly-ordered architectures, indeed we don't care whether the
issue is caused by the compiler reordering the code (constant)
or the CPU speculating the load (registers).
However, on strongly-ordered architectures, AFAIU, only the constant
case is problematic (compiler reordering the dependent load), because
I thought you were trying to prevent the compiler from using one pointer
instead of the other, not trying to prevent it from reordering anything.
Isn't this the point the documentation wants to get across when it says
that comparing pointers can be dangerous?
The motivation for introducing ptr_eq() is indeed because the
compiler barrier is not sufficient to prevent the compiler from
using one pointer instead of the other.
barrier_data(&b) prevents that.
I don't think a single barrier_data() can guarantee that, because
right after the comparison the compiler could still do b=a.
In that case you are guaranteed to use the value in b, but that
value is no longer the one originally loaded into b; it is the value
loaded into a, so your address dependency still goes to the wrong
load.
However, doing
	barrier_data(&b);
	if (a == b) {
		barrier();
		foo(*b);
	}
might prevent it, because once the address of b has escaped, the
compiler may no longer be allowed to just do b=a. But I'm not sure
that is completely correct, since the compiler knows that b==a and that
no other thread can be concurrently modifying a or b. Therefore, given
that the compiler knows the hardware, it might conclude that assigning
b=a would not cause any race-related issues even if another thread was
reading b concurrently.
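For reference, the snippet above can be written out as a compilable user-space sketch. The barrier() and barrier_data() definitions follow the usual GCC-style empty-asm form and are assumptions here, not copied from a particular kernel tree; compare_then_use() and the body of foo() are hypothetical. The sketch only shows the shape of the code. It does not demonstrate that the substitution is prevented, which is the open question.

```c
/* Assumed user-space equivalents of the kernel barriers; GCC/Clang only. */
#define barrier()		__asm__ __volatile__("" : : : "memory")
#define barrier_data(ptr)	__asm__ __volatile__("" : : "r"(ptr) : "memory")

/* Hypothetical consumer of the dereferenced value. */
static int foo(int v)
{
	return v + 1;
}

static int compare_then_use(int *a, int *b)
{
	barrier_data(&b);	/* the address of b escapes into the asm */
	if (a == b) {
		barrier();	/* compiler barrier only, no CPU ordering */
		return foo(*b);
	}
	return -1;
}
```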
Finally, it may be that only a combination of barrier_data() and making
b volatile is guaranteed to solve the issue, but that code would be
very obscure compared to using ptr_eq().
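For comparison, a rough sketch of the ptr_eq() approach follows. It is modeled on the idea discussed in this thread, and the exact kernel definition may differ. HIDE_VAR() is an assumed approximation of the kernel's OPTIMIZER_HIDE_VAR(), and deref_if_same() is a hypothetical caller.

```c
/*
 * Assumed approximation of OPTIMIZER_HIDE_VAR(): an empty asm that takes
 * var as input and output, so the compiler loses track of its value.
 * GCC/Clang only.
 */
#define HIDE_VAR(var)	__asm__ ("" : "=r" (var) : "0" (var))

static inline int ptr_eq(const volatile void *a, const volatile void *b)
{
	/* Compare hidden copies, so no a == b fact leaks to the caller. */
	HIDE_VAR(a);
	HIDE_VAR(b);
	return a == b;
}

/* Hypothetical caller: dereferences b only if both pointers match. */
static int deref_if_same(int *a, int *b)
{
	if (ptr_eq(a, b))
		return *b;	/* compiler has no proven equality to swap in a */
	return -1;
}
```

The caller reads almost like a plain comparison, which is the readability argument for ptr_eq() over the barrier_data() plus volatile combination.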
jonas