On Mon, 25 Nov 2019 at 19:39, Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> On Mon, Nov 25, 2019 at 07:22:33PM +0100, Marco Elver wrote:
> > On Mon, 25 Nov 2019 at 18:38, Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > >
> > > On Fri, Nov 22, 2019 at 04:42:20PM +0100, Marco Elver wrote:
> > > > Prefer __always_inline for atomic wrappers. When building for size
> > > > (CC_OPTIMIZE_FOR_SIZE), some compilers appear to be less inclined to
> > > > inline even relatively small static inline functions that are assumed to
> > > > be inlinable such as atomic ops. This can cause problems, for example in
> > > > UACCESS regions.
> > >
> > > From looking at the link below, the problem is that objtool isn't happy
> > > about non-whitelisted calls within UACCESS regions.
> > >
> > > Is that a problem here? Are the kasan/kcsan calls whitelisted?
> >
> > We whitelisted all the relevant functions.
> >
> > The problem is that small static inline functions private to the
> > compilation unit do not get inlined when CC_OPTIMIZE_FOR_SIZE=y (they
> > do get inlined when CC_OPTIMIZE_FOR_PERFORMANCE=y).
> >
> > For the runtime this is easy to fix, by just making these small
> > functions __always_inline (also avoiding these function call overheads
> > in the runtime when CC_OPTIMIZE_FOR_SIZE).
> >
> > I stumbled upon the issue for the atomic ops, because the runtime uses
> > atomic_long_try_cmpxchg outside a user_access_save() region (and it
> > should not be moved inside). Essentially I fixed up the runtime, but
> > then objtool still complained about the access to
> > atomic64_try_cmpxchg. Hence this patch.
> >
> > I believe it is the right thing to do, because the final inlining
> > decision should *not* be made by wrappers. I would think this patch is
> > the right thing to do irrespective of KCSAN or not.
>
> Given the wrappers are trivial, and for !KASAN && !KCSAN, this would
> make them equivalent to the things they wrap, that sounds fine to me.
>
> > > > By using __always_inline, we let the real implementation and not the
> > > > wrapper determine the final inlining preference.
> > >
> > > That sounds reasonable to me, assuming that doesn't end up significantly
> > > bloating the kernel text. What impact does this have on code size?
> >
> > It actually seems to make it smaller.
> >
> > x86 tinyconfig:
> > - vmlinux baseline:      1316204
> > - vmlinux with patches:  1315988 (-216 bytes)
>
> Great! Fancy putting that in the commit message?

Done.

> > > > This came up when addressing UACCESS warnings with CC_OPTIMIZE_FOR_SIZE
> > > > in the KCSAN runtime:
> > > > http://lkml.kernel.org/r/58708908-84a0-0a81-a836-ad97e33dbb62@xxxxxxxxxxxxx
> > > >
> > > > Reported-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> > > > Signed-off-by: Marco Elver <elver@xxxxxxxxxx>
> > > > ---
> > > >  include/asm-generic/atomic-instrumented.h | 334 +++++++++++-----------
> > > >  include/asm-generic/atomic-long.h          | 330 ++++++++++-----------
> > > >  scripts/atomic/gen-atomic-instrumented.sh  |   6 +-
> > > >  scripts/atomic/gen-atomic-long.sh          |   2 +-
> > > >  4 files changed, 336 insertions(+), 336 deletions(-)
> > >
> > > Do we need to do similar for gen-atomic-fallback.sh and the fallbacks
> > > defined in scripts/atomic/fallbacks/ ?
> >
> > I think they should be, but I think that's debatable. Some of them do
> > a little more than just wrap things. If we want to make this
> > __always_inline, I would do it in a separate patch independent from
> > this series to not stall the fixes here.
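[ For context, the wrappers generated by gen-atomic-instrumented.sh are
  roughly of the following shape -- an illustrative sketch, not the exact
  generated code; the operation and instrumentation hook names are only
  representative:

static __always_inline int
atomic_add_return(int i, atomic_t *v)
{
	/* Report the access to the sanitizer runtime. */
	kasan_check_write(v, sizeof(*v));
	/* Defer the real work (and the inlining decision) to the arch op. */
	return arch_atomic_add_return(i, v);
}

  With plain "static inline", the wrapper itself can end up out of line
  under CC_OPTIMIZE_FOR_SIZE=y, which is what objtool then flags as a
  non-whitelisted call inside a UACCESS region. ]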
>
> I would expect that they would suffer the same problem if used in a
> UACCESS region, so if that's what we're trying to fix here, I think that
> we need to do likewise there.
>
> The majority are trivial wrappers (shuffling arguments or adding trivial
> barriers), so those seem fine. The rest call things that we're inlining
> here.
>
> Would you be able to give that a go?

Done in v2.

> > > > diff --git a/scripts/atomic/gen-atomic-instrumented.sh b/scripts/atomic/gen-atomic-instrumented.sh
> > > > index 8b8b2a6f8d68..68532d4f36ca 100755
> > > > --- a/scripts/atomic/gen-atomic-instrumented.sh
> > > > +++ b/scripts/atomic/gen-atomic-instrumented.sh
> > > > @@ -84,7 +84,7 @@ gen_proto_order_variant()
> > > >  	[ ! -z "${guard}" ] && printf "#if ${guard}\n"
> > > >
> > > >  	cat <<EOF
> > > > -static inline ${ret}
> > > > +static __always_inline ${ret}
> > >
> > > We should add an include of <linux/compiler.h> to the preamble if we're
> > > explicitly using __always_inline.
> >
> > Will add in v2.
> >
> > > > diff --git a/scripts/atomic/gen-atomic-long.sh b/scripts/atomic/gen-atomic-long.sh
> > > > index c240a7231b2e..4036d2dd22e9 100755
> > > > --- a/scripts/atomic/gen-atomic-long.sh
> > > > +++ b/scripts/atomic/gen-atomic-long.sh
> > > > @@ -46,7 +46,7 @@ gen_proto_order_variant()
> > > >  	local retstmt="$(gen_ret_stmt "${meta}")"
> > > >
> > > >  	cat <<EOF
> > > > -static inline ${ret}
> > > > +static __always_inline ${ret}
> > >
> > > Likewise here
> >
> > Will add in v2.
>
> Great; thanks!

Sent v2: http://lkml.kernel.org/r/20191126114121.85552-1-elver@xxxxxxxxxx

Thanks,
-- Marco
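[ For reference, the "trivial wrappers" discussed above -- the fallbacks
  generated from scripts/atomic/fallbacks/ -- expand to code roughly like
  the sketch below (illustrative only; the operation and fence names are
  approximate), so giving them the same __always_inline treatment likewise
  leaves the inlining decision with the underlying op:

static __always_inline int
atomic_fetch_add_acquire(int i, atomic_t *v)
{
	/* Delegate to the relaxed variant of the operation... */
	int ret = atomic_fetch_add_relaxed(i, v);
	/* ...and add the trivial acquire barrier on top. */
	__atomic_acquire_fence();
	return ret;
}
]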