On 14/02/2022 at 15:00, David Laight wrote:
> From: Christophe Leroy
>> Sent: 14 February 2022 13:21
>>
>> On 14/02/2022 at 12:31, David Laight wrote:
>>> From: Anshuman Khandual
>>>> Sent: 14 February 2022 09:54
>>> ...
>>>>> With -Winline, GCC tells:
>>>>>
>>>>> /include/linux/thread_info.h:212:20: warning: inlining failed in call to 'copy_overflow': call is unlikely and code size would grow [-Winline]
>>>>>
>>>>> copy_overflow() is a non conditional warning called by
>>>>> check_copy_size() on an error path.
>>>>>
>>>>> check_copy_size() have to remain inlined in order to benefit
>>>>> from constant folding, but copy_overflow() is not worth inlining.
>>>>>
>>>>> Uninline the warning when CONFIG_BUG is selected.
>>>>>
>>>>> When CONFIG_BUG is not selected, WARN() does nothing so skip it.
>>>>>
>>>>> This reduces the size of vmlinux by almost 4kbytes.
>>>>
>>>
>>>>> +void __copy_overflow(int size, unsigned long count);
>>>>> +
>>>>>  static inline void copy_overflow(int size, unsigned long count)
>>>>>  {
>>>>> -	WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count);
>>>>> +	if (IS_ENABLED(CONFIG_BUG))
>>>>> +		__copy_overflow(size, count);
>>>>>  }
>>>
>>>> Just wondering, is this the only such scenario which results in
>>>> an avoidable bloated vmlinux image ?
>>>
>>> The more interesting question is whether the call to __copy_overflow()
>>> is actually significantly smaller than the one to WARN()?
>>> And if so why.
>>>
>> unsigned long tst_copy_to_user(void __user *to, unsigned long n)
>> {
>> 	return copy_to_user(to, &jiffies_64, n);
>> }
>>
>> With the patch:
>>
>> 00003c78 <tst_copy_to_user>:
>>     3c78:  28 04 00 08   cmplwi  r4,8
>>     3c7c:  7c 85 23 78   mr      r5,r4
>>     3c80:  41 81 00 10   bgt     3c90 <tst_copy_to_user+0x18>
>>     3c84:  3c 80 00 00   lis     r4,0
>>             3c86: R_PPC_ADDR16_HA  jiffies_64
>>     3c88:  38 84 00 00   addi    r4,r4,0
>>             3c8a: R_PPC_ADDR16_LO  jiffies_64
>>     3c8c:  48 00 00 00   b       3c8c <tst_copy_to_user+0x14>
>>             3c8c: R_PPC_REL24      _copy_to_user
>>
>>     3c90:  94 21 ff f0   stwu    r1,-16(r1)
>>     3c94:  7c 08 02 a6   mflr    r0
>>     3c98:  38 60 00 08   li      r3,8
>>     3c9c:  90 01 00 14   stw     r0,20(r1)
>>     3ca0:  90 81 00 08   stw     r4,8(r1)
>>     3ca4:  48 00 00 01   bl      3ca4 <tst_copy_to_user+0x2c>
>>             3ca4: R_PPC_REL24      __copy_overflow
>>     3ca8:  80 a1 00 08   lwz     r5,8(r1)
>>     3cac:  80 01 00 14   lwz     r0,20(r1)
>>     3cb0:  7c a3 2b 78   mr      r3,r5
>>     3cb4:  7c 08 03 a6   mtlr    r0
>>     3cb8:  38 21 00 10   addi    r1,r1,16
>>     3cbc:  4e 80 00 20   blr
>>
>>
>> Without the patch:
>>
>> 00003c88 <tst_copy_to_user>:
>>     3c88:  28 04 00 08   cmplwi  r4,8
>>     3c8c:  7c 85 23 78   mr      r5,r4
>>     3c90:  41 81 00 10   bgt     3ca0 <tst_copy_to_user+0x18>
>>     3c94:  3c 80 00 00   lis     r4,0
>>             3c96: R_PPC_ADDR16_HA  jiffies_64
>>     3c98:  38 84 00 00   addi    r4,r4,0
>>             3c9a: R_PPC_ADDR16_LO  jiffies_64
>>     3c9c:  48 00 00 00   b       3c9c <tst_copy_to_user+0x14>
>>             3c9c: R_PPC_REL24      _copy_to_user
>>
>>     3ca0:  94 21 ff f0   stwu    r1,-16(r1)
>>     3ca4:  3c 60 00 00   lis     r3,0
>>             3ca6: R_PPC_ADDR16_HA  .rodata.str1.4+0x30
>>     3ca8:  90 81 00 08   stw     r4,8(r1)
>>     3cac:  7c 08 02 a6   mflr    r0
>>     3cb0:  38 63 00 00   addi    r3,r3,0
>>             3cb2: R_PPC_ADDR16_LO  .rodata.str1.4+0x30
>>     3cb4:  38 80 00 08   li      r4,8
>>     3cb8:  90 01 00 14   stw     r0,20(r1)
>>     3cbc:  48 00 00 01   bl      3cbc <tst_copy_to_user+0x34>
>>             3cbc: R_PPC_REL24      __warn_printk
>>     3cc0:  80 a1 00 08   lwz     r5,8(r1)
>>     3cc4:  0f e0 00 00   twui    r0,0
>>     3cc8:  80 01 00 14   lwz     r0,20(r1)
>>     3ccc:  7c a3 2b 78   mr      r3,r5
>>     3cd0:  7c 08 03 a6   mtlr    r0
>>     3cd4:  38 21 00 10   addi    r1,r1,16
>>     3cd8:  4e 80 00 20   blr
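
For anyone wanting to reproduce the comparison outside the kernel, here
is a minimal userspace sketch of the pattern the patch uses: one
out-of-line helper that owns the format string, and a thin inline
wrapper at each call site. All sketch_* names and SKETCH_CONFIG_BUG are
hypothetical stand-ins (for __copy_overflow(), copy_overflow(),
check_copy_size() and CONFIG_BUG). fprintf() replaces WARN(), so the
trap instruction is not modelled, and the noinline attribute assumes
GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_BUG; the kernel gets this from Kconfig. */
#define SKETCH_CONFIG_BUG 1

/* Counts emitted warnings; exists only so the sketch can be checked. */
static int sketch_overflow_reports;

/*
 * Out-of-line slow path (stand-in for __copy_overflow()).
 * The format string is referenced from this one place only.
 */
static __attribute__((noinline))
void sketch_copy_overflow_slow(int size, unsigned long count)
{
	sketch_overflow_reports++;
	fprintf(stderr, "Buffer overflow detected (%d < %lu)!\n", size, count);
}

/* Inline wrapper: a call site only has to emit a plain function call. */
static inline void sketch_copy_overflow(int size, unsigned long count)
{
	if (SKETCH_CONFIG_BUG)
		sketch_copy_overflow_slow(size, count);
}

/* Simplified size check: returns 1 if count bytes fit in size bytes. */
static inline int sketch_check_copy_size(size_t size, size_t count)
{
	if (count > size) {
		sketch_copy_overflow((int)size, (unsigned long)count);
		return 0;
	}
	return 1;
}

/* Small self-check showing the intended behaviour. */
static void sketch_self_check(void)
{
	assert(sketch_check_copy_size(8, 8) == 1);	/* fits, no warning */
	assert(sketch_check_copy_size(8, 9) == 0);	/* too big, warns once */
}
```

Setting SKETCH_CONFIG_BUG to 0 makes the wrapper an empty function,
which mirrors the "WARN() does nothing so skip it" case.
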
>
> I make that 3 extra instructions.
> Two are needed to load the format string.
> Not sure why the third gets added.

The third instruction is 'twui', which traps to raise the warning oops.

>
> Not really significant in the 12-15 the error call actually takes.
> Although a lot of those are just generating the stack frame
> in order to call the error function - and wouldn't be there in
> a less trivial example.

Yes, after looking once more, maybe making it __always_inline would be
enough.

The starting point was that I got almost 50 copies of copy_overflow()
in my vmlinux, each with its own format string as well.

So my patch reduced vmlinux size by 3908 bytes. But with
__always_inline I get a reduction of 3560 bytes, which is almost the
same.

So if you prefer, I can just make copy_overflow() __always_inline and
voila.

>
> More interesting would be changing copy_overflow() to return the size.
> So copy_to_user() becomes:
>
> 	if (size_valid())
> 		return _copy_to_user();
> 	return copy_overflow()

Yes, that's something to try, although it means changing all callers of
check_copy_size().

Christophe
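
A rough userspace sketch of the "return the size" idea quoted above, to
show the shape of the caller once the warning helper returns a value.
It rests on several assumptions: "the size" is read here as the number
of bytes left uncopied (what copy_to_user() reports on failure), every
sketch_* name is hypothetical, memcpy() stands in for _copy_to_user(),
and fprintf() stands in for WARN(). The noinline attribute assumes GCC
or Clang.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for _copy_to_user().
 * Returns the number of bytes NOT copied (always 0 in this sketch).
 */
static unsigned long sketch_raw_copy(void *to, const void *from,
				     unsigned long n)
{
	memcpy(to, from, n);
	return 0;
}

/*
 * Out-of-line warning helper that returns count, i.e. "nothing was
 * copied", so the caller can simply return its result.
 */
static __attribute__((noinline))
unsigned long sketch_copy_overflow_ret(int size, unsigned long count)
{
	fprintf(stderr, "Buffer overflow detected (%d < %lu)!\n", size, count);
	return count;
}

/*
 * size: size of the source object, n: requested length.
 * Both branches end in a call whose result is returned directly.
 */
static inline unsigned long sketch_copy_out(void *to, const void *from,
					    unsigned long size,
					    unsigned long n)
{
	if (n <= size)
		return sketch_raw_copy(to, from, n);
	return sketch_copy_overflow_ret((int)size, n);
}

/* Small self-check: a valid copy and a rejected oversized request. */
static void sketch_ret_self_check(void)
{
	const char src[8] = "abcdefg";
	char dst[8] = { 0 };

	assert(sketch_copy_out(dst, src, sizeof(src), 8) == 0);
	assert(memcmp(dst, src, sizeof(src)) == 0);
	assert(sketch_copy_out(dst, src, sizeof(src), 12) == 12);
}
```

Because the error path is now a call whose result is returned as-is,
the compiler may be able to emit it as a tail call and skip the stack
frame seen in the listings above. That is something to confirm with
objdump rather than take from this sketch.
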