From: Linus Torvalds
> Sent: 14 May 2020 03:20
> On Wed, May 13, 2020 at 5:51 PM Nick Desaulniers
> <ndesaulniers@xxxxxxxxxx> wrote:
> >
> > Are you sure LTO treats empty asm statements differently than full
> > memory barriers in regards to preventing tail calls?
>
> It had better.
>
> At link-time, there is nothing left of an empty asm statement. So by
> the time the linker runs, it only sees
>
>     call xyz
>     ret
>
> in the object code. At that point, it's somewhat reasonable for any
> link-time optimizer (or an optimizing assembler, for that matter) to
> say "I'll just turn that sequence into a simple 'jmp xyz' instead".

Except it sees:
    call xyz
    canary_check_code
    ret
There's also almost certainly some stack frame tidyup,
which it would have to detect and convert.
And, in principle, the function is allowed to access the stack
space that contains the canary.

> Now, don't get me wrong - I'm not convinced any existing LTO does
> that. But I'd also not be shocked by something like that.
>
> In contrast, if it's a real mb(), the linker won't see just a
> 'call+ret' sequence. It will see something like
>
>     call xyz
>     mfence
>     ret
>
> (ok, the mfence may actually be something else, and we'll have a label
> on it and an alternatives table pointing to it, but the point is,
> unlike an empty asm, there's something _there_).

Not if you have an architecture that doesn't have any memory barriers.
In that case mb() may not even be asm("").
(Although it might have to be asm("" ::: "memory").)

    David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
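
For anyone who wants to look at the generated code, here is a minimal sketch of the four cases discussed above. It is not kernel code: the names xyz(), plain_caller(), empty_asm_caller(), barrier_caller(), fence_caller() and run_all_variants() are made up for illustration, and __sync_synchronize() (a GCC/Clang builtin) stands in for a real mb(). Whether any given compiler or LTO pass actually turns "call; ret" into "jmp" for these is not guaranteed; the sketch only makes the difference easy to inspect.

```c
#include <assert.h>

static int calls;

/* Hypothetical callee; noinline so a real call instruction is emitted. */
__attribute__((noinline)) static void xyz(void)
{
	calls++;
}

/* Nothing follows the call, so it is a tail-call candidate. */
__attribute__((noinline)) static void plain_caller(void)
{
	xyz();
}

/* Empty asm: the compiler sees a statement after the call, but no
 * instruction is emitted for it in the object code. */
__attribute__((noinline)) static void empty_asm_caller(void)
{
	xyz();
	__asm__ volatile("");
}

/* Compiler-only barrier: still emits no instruction, but tells the
 * compiler that memory may have been modified. */
__attribute__((noinline)) static void barrier_caller(void)
{
	xyz();
	__asm__ volatile("" ::: "memory");
}

/* Stand-in for a real mb(): on targets that have a hardware barrier,
 * an actual instruction sits between the call and the return. */
__attribute__((noinline)) static void fence_caller(void)
{
	xyz();
	__sync_synchronize();
}

/* All variants behave identically at the C level; the differences are
 * only visible in the generated assembly. Returns the call count. */
int run_all_variants(void)
{
	calls = 0;
	plain_caller();
	empty_asm_caller();
	barrier_caller();
	fence_caller();
	assert(calls == 4);
	return calls;
}
```

Compiling this with something like "gcc -O2 -S" and comparing the four callers shows what is left after each call; adding -fstack-protector-all should also show the canary check sitting between the call and the ret.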