On Wed, Jun 15, 2022 at 12:42:13PM -0700, Nadav Amit wrote: > >> (3) also has the downside of stack-protector that would be added due to > >> stack-protector strong, which is not-that-bad, but I hate it. > > > > Any short (but further) explanations? > > Sure, but it might not be short. > > So one time Mathew Wilcox tried to convert some function arguments into a > struct that would hold the arguments and transfer it as a single argument, > and he found out that the binary size actually increased. Since I like to > waste my time, I analyzed why. > > IIRC, there were two main reasons. > > The first one is that the kernel by default uses the “strong” > stack-protector. It means that not all functions would have a stack > protector, and actually only quite few would. One main reason that you have > a stack protector is that you provide dereference a local variable. So if > you have a local struct that hold the arguments - you get a stack protector, > and it does introduce slight overhead (~10ns IIRC). There may be some pragma > to prevent the stack protector, but clearly I will be shot if I used it. > > The second reason is that the compiler either reloads data from the struct > you use to hold the arguments or might spill it to the stack if you try to > cache it. > > Consider the following two scenarios: > > A. You access an argument multiple times: > > local1 = args->arg1; > another_fn(); // Or some store to the heap > local2 = args->arg1; > > // You use local1 and local2 > > In this case the compiler would reload args->arg1 from memory, even if there > is a register that holds the value. The compiler is concerned that another_fn() > might have overwritten args->arg1 or - in the case of a store - that the value > was overwritten. The reload might prevent further optimizations of the compiler. > > B. You cache the argument locally (aka, you decided to be “smart”): > > arg1 = args->arg1; > local1 = arg1; > another_fn(); > local2 = arg1; > > You may think that this prevents the reload. But this might even be worse. > The compiler might run out of registers spill arg1 to the stack and then > access it from the stack. Or it might need to spill something else, or > shuffle registers around. > > So what can you do? You can mark another_fn() as pure if it is so, which > does help in very certain cases. There are various limitations though. IIRC, > gcc (or is it clang?) ignores it for inline functions. So if you have an > inline function which does some write that you don’t care about you cannot > force the compiler to ignore it. > > Note that at least gcc (IIRC) regards inline assembly as something that might > write to arbitrary memory address. So having BUG_ON() would require a reload > of the argument from the struct. Ah, I never knew that side of BUG_ON().. > > You may say: “what about the restrict keyword?” Well, this is nice in > theory, but it does not always help and the implementation of gcc and clang > are not even the same in this regard. IIRC, in one of them (I forgot which), > the restrict keyword will prevent the reload if instead another_fn() there > was a store, but once you call a different function, that compiler says “all > bets are off, I ignore the restrict” and would reload/spill arg1 (depending > on A or B). Thanks, I understand now. I did a check-up and my most-used distro (fedora) doesn't really have the strong stack protector enabled.. centos/rhel has. It seems to me besides the two points you mentioned above to increase the obj being generated (and also the performance impact), afaiu it also shrinks the number of vars to push/pop, so even if the lump-sum would be similar to before there's still a valid reason to have it since it provides more safety on stack checks. But I really am not an expert on this so I'd not be able to make any appropriate conclusion or provide any useful input.. It's just that it'll not be a question to answer to uffd code but a more general question, as using a struct pointer to pass over things are really common, afaict, in not only mm but the Linux repository as a whole. -- Peter Xu