From: Christophe Leroy
> Sent: 14 February 2022 13:21
>
> On 14/02/2022 at 12:31, David Laight wrote:
> > From: Anshuman Khandual
> >> Sent: 14 February 2022 09:54
> > ...
> >>> With -Winline, GCC tells:
> >>>
> >>> /include/linux/thread_info.h:212:20: warning: inlining failed in call to 'copy_overflow': call is unlikely and code size would grow [-Winline]
> >>>
> >>> copy_overflow() is a non conditional warning called by
> >>> check_copy_size() on an error path.
> >>>
> >>> check_copy_size() have to remain inlined in order to benefit
> >>> from constant folding, but copy_overflow() is not worth inlining.
> >>>
> >>> Uninline the warning when CONFIG_BUG is selected.
> >>>
> >>> When CONFIG_BUG is not selected, WARN() does nothing so skip it.
> >>>
> >>> This reduces the size of vmlinux by almost 4kbytes.
> >>
> >
> >>> +void __copy_overflow(int size, unsigned long count);
> >>> +
> >>>  static inline void copy_overflow(int size, unsigned long count)
> >>>  {
> >>> -	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
> >>> +	if (IS_ENABLED(CONFIG_BUG))
> >>> +		__copy_overflow(size, count);
> >>>  }
> >
> >> Just wondering, is this the only such scenario which results in
> >> an avoidable bloated vmlinux image ?
> >
> > The more interesting question is whether the call to __copy_overflow()
> > is actually significantly smaller than the one to WARN()?
> > And if so why.
>
>
> unsigned long tst_copy_to_user(void __user *to, unsigned long n)
> {
> 	return copy_to_user(to, &jiffies_64, n);
> }
>
> With the patch:
>
> 00003c78 <tst_copy_to_user>:
>      3c78:	28 04 00 08 	cmplwi  r4,8
>      3c7c:	7c 85 23 78 	mr      r5,r4
>      3c80:	41 81 00 10 	bgt     3c90 <tst_copy_to_user+0x18>
>      3c84:	3c 80 00 00 	lis     r4,0
> 			3c86: R_PPC_ADDR16_HA	jiffies_64
>      3c88:	38 84 00 00 	addi    r4,r4,0
> 			3c8a: R_PPC_ADDR16_LO	jiffies_64
>      3c8c:	48 00 00 00 	b       3c8c <tst_copy_to_user+0x14>
> 			3c8c: R_PPC_REL24	_copy_to_user
>
>      3c90:	94 21 ff f0 	stwu    r1,-16(r1)
>      3c94:	7c 08 02 a6 	mflr    r0
>      3c98:	38 60 00 08 	li      r3,8
>      3c9c:	90 01 00 14 	stw     r0,20(r1)
>      3ca0:	90 81 00 08 	stw     r4,8(r1)
>      3ca4:	48 00 00 01 	bl      3ca4 <tst_copy_to_user+0x2c>
> 			3ca4: R_PPC_REL24	__copy_overflow
>      3ca8:	80 a1 00 08 	lwz     r5,8(r1)
>      3cac:	80 01 00 14 	lwz     r0,20(r1)
>      3cb0:	7c a3 2b 78 	mr      r3,r5
>      3cb4:	7c 08 03 a6 	mtlr    r0
>      3cb8:	38 21 00 10 	addi    r1,r1,16
>      3cbc:	4e 80 00 20 	blr
>
>
> Without the patch:
>
> 00003c88 <tst_copy_to_user>:
>      3c88:	28 04 00 08 	cmplwi  r4,8
>      3c8c:	7c 85 23 78 	mr      r5,r4
>      3c90:	41 81 00 10 	bgt     3ca0 <tst_copy_to_user+0x18>
>      3c94:	3c 80 00 00 	lis     r4,0
> 			3c96: R_PPC_ADDR16_HA	jiffies_64
>      3c98:	38 84 00 00 	addi    r4,r4,0
> 			3c9a: R_PPC_ADDR16_LO	jiffies_64
>      3c9c:	48 00 00 00 	b       3c9c <tst_copy_to_user+0x14>
> 			3c9c: R_PPC_REL24	_copy_to_user
>
>      3ca0:	94 21 ff f0 	stwu    r1,-16(r1)
>      3ca4:	3c 60 00 00 	lis     r3,0
> 			3ca6: R_PPC_ADDR16_HA	.rodata.str1.4+0x30
>      3ca8:	90 81 00 08 	stw     r4,8(r1)
>      3cac:	7c 08 02 a6 	mflr    r0
>      3cb0:	38 63 00 00 	addi    r3,r3,0
> 			3cb2: R_PPC_ADDR16_LO	.rodata.str1.4+0x30
>      3cb4:	38 80 00 08 	li      r4,8
>      3cb8:	90 01 00 14 	stw     r0,20(r1)
>      3cbc:	48 00 00 01 	bl      3cbc <tst_copy_to_user+0x34>
> 			3cbc: R_PPC_REL24	__warn_printk
>      3cc0:	80 a1 00 08 	lwz     r5,8(r1)
>      3cc4:	0f e0 00 00 	twui    r0,0
>      3cc8:	80 01 00 14 	lwz     r0,20(r1)
>      3ccc:	7c a3 2b 78 	mr      r3,r5
>      3cd0:	7c 08 03 a6 	mtlr    r0
>      3cd4:	38 21 00 10 	addi    r1,r1,16
>      3cd8:	4e 80 00 20 	blr

I make that 3 extra instructions.
Two are needed to load the format string.
Not sure why the third gets added.
Not really significant in the 12-15 instructions the error call actually takes.
Although a lot of those are just generating the stack frame in order to
call the error function - and wouldn't be there in a less trivial example.

More interesting would be changing copy_overflow() to return the size.
So copy_to_user() becomes:
	if (size_valid())
		return _copy_to_user();
	return copy_overflow();
In your example that would generate a tail call in the error path.
It also avoids having to save the transfer length.

Plausibly you'll get smaller code by making the prototypes of
_copy_to_user() and copy_overflow() match.
But compilers don't like generating the:
	(cond ? a : b)(args)
assembler that would really be needed.

	David
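
For illustration only, the "copy_overflow() returns the size" idea can be sketched in plain userspace C. This is not the kernel code: every sketch_* name is hypothetical, memcpy() stands in for _copy_to_user(), and "the size" is assumed to mean the requested transfer length (so the caller sees "nothing was copied"). Because both branches end in a return of a call, a compiler is free to emit a tail call on each path, and the wrapper no longer needs to keep the length live across the warning call.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for _copy_to_user(): returns the number of
 * bytes NOT copied, which is always 0 for a plain memcpy(). */
static unsigned long sketch_copy_to_user(void *to, const void *from,
					 unsigned long n)
{
	memcpy(to, from, n);
	return 0;
}

/* Hypothetical error path. Assumption: it returns the requested
 * length, i.e. it reports that none of the bytes were copied. */
static unsigned long sketch_copy_overflow(int size, unsigned long count)
{
	fprintf(stderr, "Buffer overflow detected (%d < %lu)!\n", size, count);
	return count;
}

/* Inlined wrapper: with a constant from_size the check can fold away,
 * and each branch is just "return f(...)", so either call may be
 * compiled as a tail call. */
static inline unsigned long sketch_checked_copy(void *to, const void *from,
						size_t from_size,
						unsigned long n)
{
	if (n <= from_size)
		return sketch_copy_to_user(to, from, n);
	return sketch_copy_overflow((int)from_size, n);
}
```

Calling sketch_checked_copy(dst, src, sizeof(src), n) with n larger than sizeof(src) takes the warning path and returns n; otherwise it copies and returns 0. Whether a given compiler really emits a tail call here depends on the target and optimisation level, so check the generated code.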