On Jun 15, 2022, at 1:56 PM, Peter Xu <peterx@xxxxxxxxxx> wrote: > On Wed, Jun 15, 2022 at 12:42:13PM -0700, Nadav Amit wrote: >>>> (3) also has the downside of stack-protector that would be added due to >>>> stack-protector strong, which is not-that-bad, but I hate it. >>> >>> Any short (but further) explanations? >> >> Sure, but it might not be short. >> >> So one time Mathew Wilcox tried to convert some function arguments into a >> struct that would hold the arguments and transfer it as a single argument, >> and he found out that the binary size actually increased. Since I like to >> waste my time, I analyzed why. >> >> IIRC, there were two main reasons. >> >> The first one is that the kernel by default uses the “strong” >> stack-protector. It means that not all functions would have a stack >> protector, and actually only quite few would. One main reason that you have >> a stack protector is that you provide dereference a local variable. So if >> you have a local struct that hold the arguments - you get a stack protector, >> and it does introduce slight overhead (~10ns IIRC). There may be some pragma >> to prevent the stack protector, but clearly I will be shot if I used it. >> >> The second reason is that the compiler either reloads data from the struct >> you use to hold the arguments or might spill it to the stack if you try to >> cache it. >> >> Consider the following two scenarios: >> >> A. You access an argument multiple times: >> >> local1 = args->arg1; >> another_fn(); // Or some store to the heap >> local2 = args->arg1; >> >> // You use local1 and local2 >> >> In this case the compiler would reload args->arg1 from memory, even if there >> is a register that holds the value. The compiler is concerned that another_fn() >> might have overwritten args->arg1 or - in the case of a store - that the value >> was overwritten. The reload might prevent further optimizations of the compiler. >> >> B. You cache the argument locally (aka, you decided to be “smart”): >> >> arg1 = args->arg1; >> local1 = arg1; >> another_fn(); >> local2 = arg1; >> >> You may think that this prevents the reload. But this might even be worse. >> The compiler might run out of registers spill arg1 to the stack and then >> access it from the stack. Or it might need to spill something else, or >> shuffle registers around. >> >> So what can you do? You can mark another_fn() as pure if it is so, which >> does help in very certain cases. There are various limitations though. IIRC, >> gcc (or is it clang?) ignores it for inline functions. So if you have an >> inline function which does some write that you don’t care about you cannot >> force the compiler to ignore it. >> >> Note that at least gcc (IIRC) regards inline assembly as something that might >> write to arbitrary memory address. So having BUG_ON() would require a reload >> of the argument from the struct. > > Ah, I never knew that side of BUG_ON().. I was wrong about this part. Actually BUG_ON() does not have this effect. Sorry for misleading you.