From: 'Alan Stern'
> Sent: 02 October 2024 15:15
>
> On Wed, Oct 02, 2024 at 08:13:15AM +0000, David Laight wrote:
> > From: 'Alan Stern'
> > > Sent: 01 October 2024 23:57
> > >
> > > On Tue, Oct 01, 2024 at 05:11:05PM +0000, David Laight wrote:
> > > > From: Alan Stern
> > > > > Sent: 30 September 2024 19:53
> > > > >
> > > > > On Mon, Sep 30, 2024 at 07:05:06PM +0200, Jonas Oberhauser wrote:
> > > > > >
> > > > > >
> > > > > > On 9/30/2024 at 6:43 PM, Alan Stern wrote:
> > > > > > > On Mon, Sep 30, 2024 at 01:26:53PM +0200, Jonas Oberhauser wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > On 9/28/2024 at 4:49 PM, Alan Stern wrote:
> > > > > > > >
> > > > > > > > I should also point out that it is not enough to prevent the compiler from
> > > > > > > > using @a instead of @b.
> > > > > > > >
> > > > > > > > It must also be prevented from assigning @b=@a, which it is often allowed to
> > > > > > > > do after finding @a==@b.
> > > > > > >
> > > > > > > Wouldn't that be a bug?
> > > > > >
> > > > > > That's why I said that it is often allowed to do it. In your case it
> > > > > > wouldn't, but it is often possible when a and b are non-atomic &
> > > > > > non-volatile (and haven't escaped, and I believe sometimes even then).
> > > > > >
> > > > > > It happens for example here with GCC 14.1.0 -O3:
> > > > > >
> > > > > > int fct_hide(void)
> > > > > > {
> > > > > >     int *a, *b;
> > > > > >
> > > > > >     do {
> > > > > >         a = READ_ONCE(p);
> > > > > >         asm volatile ("" : : : "memory");
> > > > > >         b = READ_ONCE(p);
> > > > > >     } while (a != b);
> > > > > >     OPTIMIZER_HIDE_VAR(b);
> > > > > >     return *b;
> > > > > > }
> > > > > >
> > > > > >
> > > > > >
> > > > > >         ldr     r1, [r2]
> > > > > >         ldr     r3, [r2]
> > > > > >         cmp     r1, r3
> > > > > >         bne     .L6
> > > > > >         mov     r3, r1   // nay...
> > > > >
> > > > > A totally unnecessary instruction, which accomplishes nothing other than
> > > > > to waste time, space, and energy. But nonetheless, allowed -- I agree.
> > > > >
> > > > > The people in charge of GCC's optimizer might like to hear about this,
> > > > > if they're not already aware of it...
> > > > >
> > > > > >         ldr     r0, [r3] // yay!
> > > > > >         bx      lr
> > > > >
> > > > > One could argue that in this example the compiler _has_ used *a instead
> > > > > of *b. However, such an argument would have more force if we had
> > > > > described what we are talking about more precisely.
> > > >
> > > > The 'mov r3, r1' has nothing to do with 'a'.
> > >
> > > What do you mean by that? At this point in the program, a is the
> > > variable whose value is stored in r1 and b is the variable whose value
> > > is stored in r3. "mov r3, r1" copies the value from r1 into r3 and is
> > > therefore equivalent to executing "b = a". (That is why I said one
> > > could argue that the "return *b" statement uses the value of *a.) Thus
> > > it very much does have something to do with "a".
> >
> > After the cmp and bne, r1 and r3 have the same value.
> > The compiler tracks that and will use either register later.
> > That can never matter.
>
> The whole point of this thread is that sometimes it _does_ matter. Not
> on x86, but on weakly ordered architectures where using the wrong
> register will bypass a dependency and allow the CPU to speculatively
> load values earlier than the programmer wants it to.
>
> > Remember the compiler tracks values (in pseudo/internal registers)
> > not variables.
> >
> > > > It is a more general problem that OPTIMIZER_HIDE_VAR() pretty much
> > > > always ends up allocating a different internal 'register' for the
> > > > output and then allocating a separate physical register.
> > >
> > > What output are you referring to? Does OPTIMIZER_HIDE_VAR() have an
> > > output? If it does, the source program above ignores it, discarding any
> > > returned value.
> >
> > Look up OPTIMIZER_HIDE_VAR(x); it is basically x = f(x) where f() is
> > the identity operation:
> >     asm ("" : "+r"(x))
> > I'll bet that gcc allocates a separate internal/pseudo register
> > for the result so wants to do y = f(x).
> > Probably generating y = x; y = f(y);
> > (The 'mov' might be after the asm, but I think that would get
> > optimised away - the listing file might help.)
> >
> > So here the compiler has just decided to reuse the register that
> > held the other of a/b for the extra temporary.
>
> I think you've got this backward. As mentioned above, a is originally
> in r1 and b is in r3. The source says OPTIMIZER_HIDE_VAR(b), so you're
> saying that gcc should be copying r3 into a separate internal/pseudo
> register. But instead it's copying r1.

I think I know what you are trying to do, and you just fail.
Whether something can work is another matter, but that code
can't ever work.

Inside if (a == b) the compiler will always use the same register
for references to a and b - because it knows they have the same value.

Possibly something like:
    c = b;
    OPTIMIZER_HIDE_VAR(c);
    if (a == c) {
        *b
will ensure that there isn't a speculative load from *a.
You'll get at least one register-register move - but they are safe.

Otherwise you'll need to put the condition inside an asm block.

    David

>
> Alan
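
For reference, the "c = b" suggestion above could be written out as a
complete function roughly as follows. This is only a sketch under stated
assumptions, not code posted in the thread: READ_ONCE() and
OPTIMIZER_HIDE_VAR() are simplified userspace stand-ins for the kernel
macros, the name fct_hide_copy() and the retry loop are illustrative
additions, and p is assumed to be a global int pointer as in the quoted
example. Whether a given compiler then keeps the load of *b dependent on
the second read of p would still have to be checked in the generated
assembly.

```c
/* Simplified stand-ins for the kernel macros (assumptions for this sketch). */
#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "+r" (var))

int *p;	/* hypothetical shared pointer, as in the quoted fct_hide() */

/*
 * Compare 'a' against a hidden copy of 'b' rather than against 'b'
 * itself, so the comparison gives the compiler no known equality
 * between 'a' and 'b' to exploit when it emits the final load.
 */
int fct_hide_copy(void)
{
	int *a, *b, *c;

	for (;;) {
		a = READ_ONCE(p);
		__asm__ volatile ("" : : : "memory");
		b = READ_ONCE(p);
		c = b;
		OPTIMIZER_HIDE_VAR(c);	/* compiler no longer knows c == b */
		if (a == c)
			return *b;	/* dereference the untouched 'b' */
	}
}
```

Compiling this with -O2 -S for a weakly ordered target and reading the
listing is the only way to confirm which register the final load uses.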