On 2021-09-08 12:41 a.m., Linus Torvalds wrote:
> On Tue, Sep 7, 2021 at 8:52 PM Harry Wentland <harry.wentland@xxxxxxx> wrote:
>>
>> Attached patches fix these x86_64 ones reported by Nick:
>
> Hmm.
>
> You didn't seem to fix up the calling convention for print__xyz(),
> which still take those xyz structs as pass-by-value.
>
> Obviously it would be good to do things incrementally, so if that
> attached patch was just [1/N] I won't complain..
>

You're right. I was focussed on the stack frame limit but fixed up the
rest as well now and sent the series out.

https://lkml.org/lkml/2021/9/8/933

>> I'm also seeing one more that might be more challenging to fix but is nearly at 1024:
>>
>> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3397:6: error: stack frame size of 1064 bytes in function 'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than=]
>
> Oh Gods, that function is truly something else..
>
> Is there some reason why it's one humongous function, with the
> occasional single-line comment?
>
> Because it really looks to me like pretty much everywhere I see one of
> those rare comments, I would go "this part should be a function of its
> own", and then there would be one caller function that just calls
> each of those sub-functions one after the other.
>

Yeah, that's what I'm thinking as well. It would likely fix the stack
size, even without dynamically allocating the two structs you mention
below.

> That would - I think - make the code easier to read, and then it would
> also make it very obvious where it magically uses a lot of stack.
>
> My suspicion is actually "nowhere". The stack use is just hugely
> spread out, and the compiler has just kept accumulating more spill
> variables on the frame with no single big reason.
>
> Yes, I see a couple of local structures:
>
>         Pipe myPipe;
>         HostVM myHostVM;
>
> but more than that I see several function calls that have basically 62
> arguments.
> And I wish I was making that number up. I'm not. That
> "CalculatePrefetchSchedule()" call literally has 62 arguments.
>
> But *all* of the top-level loops in that function literally look like
> they could - and should - be functions in their own right. Some of
> them would be fairly complex even so (ie that code under the comment
>
>         //Prefetch Check
>
> would be quite the big function all of its own.
>
> We have a coding style thing:
>
>         Documentation/process/coding-style.rst
>
> that says that you should strive to have functions that are "short and
> sweet" and fit on one or two screenfuls of text.
>
> That one function from hell is 1832 lines of code.
>
> It really could be improved upon.
>

Absolutely. The file comes with an (easy to miss) disclaimer that it
is "gcc-parseable HW gospel":
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/display/dc/dml/dcn20/display_mode_vba_20.c?h=v5.14#n32

The display_mode_vba* stuff deals with bandwidth formulas from HW
designers. At some point in the past we attempted to convert them to
something more readable and elegant but would often run into
difficulties getting support from the right people when things wouldn't
work. Using the HW designer's code directly tends to short circuit any
arguments about SW correctness.

In short, I don't really like this code but it works. It helps prevent
black screens and underflows on the display.

We try to follow the coding-style.rst for the most part elsewhere,
though there are still plenty of areas where we can improve.

Harry

> Linus
>
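
For illustration only, here is a rough sketch of the shape discussed above: one top-level caller that runs small stage functions in order, with shared state handed around by pointer instead of by value or as dozens of separate arguments. Every name in it (pipe_params, mode_ctx, calc_total_bandwidth, check_prefetch, mode_support_full) is hypothetical and the arithmetic is a placeholder; none of it is taken from the real display_mode_vba code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real DML types or fields. */
struct pipe_params {
	double dpp_clk_mhz;
	unsigned int num_dpp;
};

/* Shared state, so stages take one pointer rather than many arguments. */
struct mode_ctx {
	struct pipe_params pipes[4];
	size_t num_pipes;
	double total_bw;	/* filled in by the bandwidth stage */
	bool prefetch_ok;	/* filled in by the prefetch stage */
};

/* Pointer to const: the struct is not copied for the call. */
static double pipe_bandwidth(const struct pipe_params *p)
{
	return p->dpp_clk_mhz * p->num_dpp;	/* placeholder formula */
}

/* Stage 1: one former top-level loop, now a function of its own. */
static void calc_total_bandwidth(struct mode_ctx *ctx)
{
	size_t i;

	ctx->total_bw = 0.0;
	for (i = 0; i < ctx->num_pipes; i++)
		ctx->total_bw += pipe_bandwidth(&ctx->pipes[i]);
}

/* Stage 2: stands in for a block such as the one under "//Prefetch Check". */
static void check_prefetch(struct mode_ctx *ctx, double bw_limit)
{
	ctx->prefetch_ok = ctx->total_bw <= bw_limit;
}

/* Top-level caller: just calls each stage one after the other. */
bool mode_support_full(struct mode_ctx *ctx, double bw_limit)
{
	calc_total_bandwidth(ctx);
	check_prefetch(ctx, bw_limit);
	return ctx->prefetch_ok;
}
```

With this layout each stage only keeps its own temporaries live in its own frame, so a per-function stack report would make it easier to see which stage, if any, is the expensive one.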