On Fri, Sep 27, 2024 at 09:37:50AM +0800, Boqun Feng wrote:
>
>
> On Fri, Sep 27, 2024, at 9:30 AM, Mathieu Desnoyers wrote:
> > On 2024-09-27 02:01, Boqun Feng wrote:
> >> #define ADDRESS_EQ(var, expr)                                           \
> >> ({                                                                      \
> >>         bool _____cmp_res = (unsigned long)(var) == (unsigned long)(expr); \
> >>                                                                         \
> >>         OPTIMIZER_HIDE_VAR(var);                                        \
> >>         _____cmp_res;                                                   \
> >> })
> >
> > If the goal is to ensure gcc uses the register populated by the
> > second, I'm afraid it does not work. AFAIU, "hiding" the dependency
> > chain does not prevent the SSA GVN optimization from combining the

Note it's not hiding the dependency, but rather the equality:

> > registers as being one and choosing one arbitrary source. "hiding"

after OPTIMIZER_HIDE_VAR(var), the compiler no longer knows whether 'var'
is equal to 'expr', because OPTIMIZER_HIDE_VAR(var) uses "=r"(var) to
indicate the output is overwritten. So when 'var' is referred to later,
the compiler cannot use the register holding the 'expr' value, or any
other register that has the same value, because 'var' may have a
different value from the compiler's POV.

> > the dependency chain before or after the comparison won't help here.
> >
> > int fct_hide_var_compare(void)
> > {
> >     int *a, *b;
> >
> >     do {
> >         a = READ_ONCE(p);
> >         asm volatile ("" : : : "memory");
> >         b = READ_ONCE(p);
> >     } while (!ADDRESS_EQ(a, b));
>
> Note that ADDRESS_EQ() only hide first parameter, so this should be ADDRESS_EQ(b, a).
>

I replaced ADDRESS_EQ(a, b) with ADDRESS_EQ(b, a), and the compile
result shows it can prevent the issue:

gcc 14.2 x86-64:

fct_hide_var_compare:
.L2:
        mov     rcx, QWORD PTR p[rip]
        mov     rdx, QWORD PTR p[rip]
        mov     rax, rdx
        cmp     rcx, rdx
        jne     .L2
        mov     eax, DWORD PTR [rax]
        ret

gcc 14.2.0 ARM64:

fct_hide_var_compare:
        adrp    x2, p
        add     x2, x2, :lo12:p
.L2:
        ldr     x3, [x2]
        ldr     x1, [x2]
        mov     x0, x1
        cmp     x3, x1
        bne     .L2
        ldr     w0, [x0]
        ret

Link to godbolt:

        https://godbolt.org/z/a7jsfzjxY

Regards,
Boqun

> Regards,
> Boqun
>
> >     return *b;
> > }
> >
> > gcc 14.2 x86-64:
> >
> > fct_hide_var_compare:
> >     mov    rax,QWORD PTR [rip+0x0]        # 67 <fct_hide_var_compare+0x7>
> >     mov    rdx,QWORD PTR [rip+0x0]        # 6e <fct_hide_var_compare+0xe>
> >     cmp    rax,rdx
> >     jne    60 <fct_hide_var_compare>
> >     mov    eax,DWORD PTR [rax]
> >     ret
> > main:
> >     xor    eax,eax
> >     ret
> >
> > gcc 14.2.0 ARM64:
> >
> > fct_hide_var_compare:
> >         adrp    x0, .LANCHOR0
> >         add     x0, x0, :lo12:.LANCHOR0
> > .L12:
> >         ldr     x1, [x0]
> >         ldr     x2, [x0]
> >         cmp     x1, x2
> >         bne     .L12
> >         ldr     w0, [x1]
> >         ret
> > p:
> >         .zero   8
> >
> >
> > --
> > Mathieu Desnoyers
> > EfficiOS Inc.
> > https://www.efficios.com
>
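
For anyone who wants to reproduce the ADDRESS_EQ(b, a) variant outside
the kernel tree, here is a minimal standalone sketch. READ_ONCE() and
OPTIMIZER_HIDE_VAR() below are simplified stand-ins written for this
example, not the kernel header definitions, and 'value' and the
initializer of 'p' are hypothetical, added only so the function has
something valid to dereference. It assumes gcc or clang (statement
expressions, __typeof__, inline asm) and an LP64 target.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel macros (assumptions for this sketch). */
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))
#define OPTIMIZER_HIDE_VAR(var)	__asm__ ("" : "=r" (var) : "0" (var))

/* Compare first, then make the compiler forget what it knows about 'var'. */
#define ADDRESS_EQ(var, expr)						\
({									\
	bool _____cmp_res = (unsigned long)(var) == (unsigned long)(expr); \
									\
	OPTIMIZER_HIDE_VAR(var);					\
	_____cmp_res;							\
})

/* Hypothetical data so that 'p' points at something valid. */
static int value = 42;
int *p = &value;

int fct_hide_var_compare(void)
{
	int *a, *b;

	do {
		a = READ_ONCE(p);
		__asm__ volatile ("" : : : "memory");
		b = READ_ONCE(p);
	} while (!ADDRESS_EQ(b, a));	/* hide 'b', the pointer dereferenced below */

	return *b;
}
```

Building this with -O2 and inspecting the output of -S should let you
check which register feeds the final load; the exact register
allocation may differ between compiler versions.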