On Tue, Jun 30, 2020 at 09:47:30PM +0200, Marco Elver wrote:
> On Tue, 30 Jun 2020 at 19:39, Will Deacon <will@xxxxxxxxxx> wrote:
> >
> > When building with LTO, there is an increased risk of the compiler
> > converting an address dependency headed by a READ_ONCE() invocation
> > into a control dependency and consequently allowing for harmful
> > reordering by the CPU.
> >
> > Ensure that such transformations are harmless by overriding the generic
> > READ_ONCE() definition with one that provides acquire semantics when
> > building with LTO.
> >
> > Signed-off-by: Will Deacon <will@xxxxxxxxxx>
> > ---
> >  arch/arm64/include/asm/rwonce.h   | 63 +++++++++++++++++++++++++++++++
> >  arch/arm64/kernel/vdso/Makefile   |  2 +-
> >  arch/arm64/kernel/vdso32/Makefile |  2 +-
> >  3 files changed, 65 insertions(+), 2 deletions(-)
> >  create mode 100644 arch/arm64/include/asm/rwonce.h
>
> This seems reasonable, given we can't realistically tell the compiler
> about dependent loads. What (if any), is the performance impact? I
> guess this also heavily depends on the actual silicon.

Right, it depends both on the CPU micro-architecture and also the workload.
When we ran some basic tests, the overhead wasn't greater than the benefit
seen by enabling LTO, so it seems like a reasonable trade-off (given that
LTO is a dependency for CFI, so it's not just about performance).

Will
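
For anyone following along without the patch in front of them, here is a
rough userspace sketch of the transformation being discussed. It is not the
code from rwonce.h. It assumes C11 <stdatomic.h>, with a relaxed atomic load
standing in for the generic READ_ONCE() and an acquire load standing in for
the LTO override. All of the names (struct item, slot, published, the
read_*() helpers, run_demo()) are made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload and publication pointer, for illustration only. */
struct item { int value; };

static struct item slot = { 42 };
static _Atomic(struct item *) published = NULL;

/* Writer: release store, so the payload is visible before the pointer. */
static void publish(void)
{
	atomic_store_explicit(&published, &slot, memory_order_release);
}

/*
 * Reader as written: the address used for the second load is computed
 * from the value returned by the first load (an address dependency).
 * The relaxed load here stands in for READ_ONCE().
 */
static int read_dependent(void)
{
	struct item *p = atomic_load_explicit(&published, memory_order_relaxed);

	return p ? p->value : -1;
}

/*
 * What a compiler with whole-program knowledge might turn it into: if it
 * can see that the pointer only ever holds &slot, it may compare against
 * that and load from the constant address. The second load now depends
 * on the first only through a branch (a control dependency), which does
 * not order load-to-load on the CPU.
 */
static int read_control_dep(void)
{
	struct item *p = atomic_load_explicit(&published, memory_order_relaxed);

	if (p == &slot)
		return slot.value;
	return p ? p->value : -1;
}

/*
 * Reader with acquire semantics on the first load: later loads are
 * ordered after it whether or not the dependency survives compilation.
 */
static int read_acquire(void)
{
	struct item *p = atomic_load_explicit(&published, memory_order_acquire);

	return p ? p->value : -1;
}

/* Single-threaded usage check; returns 0 on success. */
int run_demo(void)
{
	assert(read_dependent() == -1);	/* nothing published yet */
	publish();
	assert(read_dependent() == 42);
	assert(read_control_dep() == 42);
	assert(read_acquire() == 42);
	return 0;
}
```

Run single-threaded like this, all three readers return the same value. The
difference only matters with a concurrent writer on a weakly ordered CPU,
which is where the acquire variant pays its cost.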