On Mon, 2019-10-28 at 11:53 +0100, Arnd Bergmann wrote:
> On Mon, Oct 28, 2019 at 10:52 AM Johannes Berg
> <johannes@xxxxxxxxxxxxxxxx> wrote:
> > > It looks like one of the last additions pushed the stack usage over
> > > the 1024 byte limit
> > > for 32-bit architectures:
> > >
> > > net/mac80211/mlme.c:4063:6: error: stack frame size of 1032 bytes in
> > > function 'ieee80211_sta_rx_queued_mgmt' [-Werror,-Wframe-larger-than=]
> > >
> > > struct ieee802_11_elems is fairly large, and just grew another two pointers.
> > > When ieee80211_rx_mgmt_assoc_resp() and ieee80211_assoc_success()
> > > are inlined into ieee80211_sta_rx_queued_mgmt(), there are three copies
> > > of this structure, which is slightly too much.
> >
> > Hmm. I guess that means the compiler isn't smart enough to make the
> > copies from the inlined sub-functions alias each other? I mean, if I
> > have
> >
> > fn1(...) { struct ... elems1; ... }
> > fn2(...) { struct ... elems2; ... }
> >
> > fn(...)
> > {
> >     fn1();
> >     fn2();
> > }
> >
> > then it could reasonably use the same stack memory for elems1 and
> > elems2, at least theoretically, but you're saying it doesn't do that I
> > guess?
>
> No, that's not the problem here (it can happen if the compiler is
> unable to prove the object lifetimes are non-overlapping).
>
> What we have here are multiple functions that are in the same call chain:
>
> fn1()
> {
>     struct ieee802_11_elems e;
> }
>
> fn2()
> {
>     struct ieee802_11_elems e;
>     ...
>     fn1();
> }
>
> fn3()
> {
>     struct ieee802_11_elems e;
>     ...
>     fn2();
> }
>
> Here, the object lifetimes actually do overlap, so the compiler cannot easily
> optimize that away.

Ah, yes, you're right. I didn't look closely enough, sorry.

> > I don't think dynamic allocation would be nice - but we could manually
> > do this by passing the elems pointer into the
> > ieee80211_rx_mgmt_assoc_resp() and ieee80211_assoc_success() functions.
>
> Ah, so you mean we can reuse the objects manually? I think that would
> be great. I could not tell if that's possible when reading the source, but
> if you can show that this works, that would be an easy solution.

Now that I look more closely, I'm not even sure why we parse it again in
ieee80211_assoc_success() after already having done it in
_rx_mgmt_assoc_resp(), they're doing exactly the same thing:

    ieee802_11_parse_elems(pos, len - (pos - (u8 *)mgmt), false, &elems,
                           mgmt->bssid, assoc_data->bss->bssid);

(need to track the variables a bit more closely, but ...)

So I think we can even just avoid duplicate work.

And for the third copy - it's in a different switch case. Do you think
we could rely on the compiler being able to prove non-overlapping
lifetime? Or better to just pass a pointer down to
_rx_mgmt_assoc_resp()?

> > Why do you say 32-bit btw, it should be *bigger* on 64-bit, but I didn't
> > see this ... hmm.
>
> That is correct. For historic reasons, both the total amount of stack space
> per thread and the warning limit on 64 bit are twice the amount that we
> have on 32-bit kernels, so even though the problem is more serious on
> 64-bit architectures, we do not see a warning about it because we remain
> well under the warning limit.

Ah, ok, thanks.

johannes
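
For illustration only, here is a rough user-space sketch of the "pass a
pointer down" idea discussed above. It is not the mac80211 code: struct
elems, parse_elems() and the three handler functions are hypothetical
stand-ins, and the one-element TLV format is made up for the example. The
point is only that the outermost function owns the single on-stack object,
the middle function parses into it once, and the innermost function reads
it without declaring or re-parsing its own copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct ieee802_11_elems: big enough that
 * several copies in one call chain would be noticeable on the stack. */
struct elems {
    const uint8_t *ssid;
    size_t ssid_len;
    uint8_t scratch[256];   /* padding to mimic a large structure */
};

/* Counts parser invocations so the "parse only once" claim is checkable. */
unsigned int parse_calls;

/* Hypothetical parser for a made-up format: id byte, length byte, payload.
 * Only element id 0 (treated as the SSID here) is recognised. */
void parse_elems(const uint8_t *buf, size_t len, struct elems *out)
{
    memset(out, 0, sizeof(*out));
    parse_calls++;
    if (len >= 2 && buf[0] == 0 && (size_t)buf[1] <= len - 2) {
        out->ssid = buf + 2;
        out->ssid_len = buf[1];
    }
}

/* Innermost function: consumes the already-parsed elements.
 * No local struct elems and no second parse. */
size_t assoc_success(const struct elems *elems)
{
    assert(elems != NULL);
    return elems->ssid_len;
}

/* Middle function: parses once into the caller's storage, passes it on. */
size_t rx_mgmt_assoc_resp(const uint8_t *buf, size_t len, struct elems *elems)
{
    parse_elems(buf, len, elems);
    return assoc_success(elems);
}

/* Outermost function: owns the only struct elems in the call chain. */
size_t sta_rx_queued_mgmt(const uint8_t *buf, size_t len)
{
    struct elems elems;

    return rx_mgmt_assoc_resp(buf, len, &elems);
}
```

With this shape, inlining the two inner functions into the outer one leaves
a single struct elems in the frame instead of three. Whether the real
functions can share one parse result still depends on tracking the variables
as noted above.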