On Thu, Nov 7, 2019 at 10:04 AM Arnd Bergmann <arnd@xxxxxxxx> wrote: > > On Thu, Nov 7, 2019 at 7:08 AM Sergey Senozhatsky > <sergey.senozhatsky.work@xxxxxxxxx> wrote: > > > > On (19/11/06 12:43), Alexander Potapenko wrote: > > > > On (19/10/30 15:22), glider@xxxxxxxxxx wrote: > > > > > @@ -163,6 +163,11 @@ int stackdepot_memcmp(const unsigned long *u1, const unsigned long *u2, > > > > > unsigned int n) > > > > > { > > > > > for ( ; n-- ; u1++, u2++) { > > > > > + /* > > > > > + * Prevent Clang from replacing this function with a bcmp() > > > > > + * call. > > > > > + */ > > > > > + barrier(); > > > > > if (*u1 != *u2) > > > > > return 1; > > > > > } > > > > > > > > Would 'volatile' do the trick? > > > It does. I can replace the barrier with a volatile if you think that's better. > > > However this'll add a checkpatch warning, as volatiles are discouraged > > > for synchronization (although in this case there's no synchronization) > > > > Yeah, 'volatile' in this case will do what it sort of meant to do - prevent > > compiler optimizations. So, like you said, it's not a synchronization issue > > and we don't 'volatile' data structures. > > The normal way to do a volatile access would be > READ_ONCE()/WRITE_ONCE(), but that seems stronger than > the barrier() here. I'd just stick to adding a barrier. I actually like the READ_ONCE idea more, as READ_ONCE is really a documented way to prevent the compiler from merging reads, which is what we want here. I also thought that the original barrier() statement was just a compiler barrier, which didn't introduce any additional CPU instructions. Turns out I was wrong, and barrier() also serves as a memory barrier. This doesn't really matter in this case, because this memcmp function isn't performance-critical (we are preventing Clang from optimizing it, after all). Still, READ_ONCE and WRITE_ONCE might be even cheaper, as they are just relaxed memory accesses implemented using volatile. > > Do you need to do barrier() on every iteration? Does clang behave if > > you do one barrier() instead of 'n' barrier()-s? > > If it does things right, it would make that a single-byte copy plus a call > to bcmp(). I certainly wouldn't want to have an implementation that relies > on the compiler making sub-optimal decisions ;-) That's a really good point :) > Arnd -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Halimah DeLaine Prado Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg