On Mon, 12 Jan 2009, Bernd Schmidt wrote:
>
> Something at the back of my mind said "aliasing".
>
>   $ gcc linus.c -O2 -S ; grep subl linus.s
>         subl    $1624, %esp
>   $ gcc linus.c -O2 -S -fno-strict-aliasing; grep subl linus.s
>         subl    $824, %esp
>
> That's with 4.3.2.

Interesting. Nonsensical, but interesting.

Since they have no overlap in lifetime, confusing this with aliasing is
really really broken (if the functions _hadn't_ been inlined, you'd have
gotten the same address for the two variables anyway! So anybody who
thinks that they need different addresses because they are different
types is really really fundamentally confused!).

But your numbers are unambiguous, and I can see the effect of that
compiler flag myself.

The good news is that the kernel obviously already uses
-fno-strict-aliasing for other reasons, so we should see this effect
already, _despite_ it making no sense. And the stack usage still causes
problems.

Oh, and I see why. This test-case shows it clearly.

Note how the max stack usage _should_ be "struct b" + "struct c". Note how
it isn't (it's "struct a" + "struct b/c").

So what seems to be going on is that gcc is able to do some per-slot
sharing, but if you have one function with a single large entity, and
another with a couple of different ones, gcc can't do any smart
allocation.

Put another way: gcc doesn't create a "union of the set of different stack
usages" (which would be optimal given a single frame, and generate the
stack layout of just the maximum possible size), it creates a "set of
unions of different stack usages" (which can be optimal in the trivial
cases, but not nearly optimal in practical cases).

That explains the ioctl behavior - the structure use is usually pretty
complicated (i.e. it's almost never about just _one_ large stack slot, but
the ioctl cases tend to do random stuff with multiple slots).

So it doesn't add up to some horrible maximum of all sizes, but it also
doesn't end up coalescing stack usage very well.
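The difference between the two layouts can be modelled with plain C
unions. This is only an illustrative sketch: the names ideal_frame,
per_slot_frame and the helper functions are made up here, and the code
models the two allocation strategies described above rather than showing
what gcc does internally. The struct definitions are copied from the
test-case.

```c
#include <assert.h>
#include <stddef.h>

struct a { int a; unsigned long array[200]; };
struct b { int b; unsigned long array[100]; };
struct c { int c; unsigned long array[100]; };

/*
 * Hypothetical model of the optimal frame ("union of the set of
 * different stack usages"): the locals of each inlined function form
 * one set, and the sets overlap because their lifetimes never do.
 */
union ideal_frame {
	struct { struct a a; } fn1;              /* locals of fn1() */
	struct { struct b b; struct c c; } fn2;  /* locals of fn2() */
};

/*
 * Hypothetical model of per-slot sharing ("set of unions of different
 * stack usages"): "struct a" shares a slot with one of "struct b/c",
 * and the other one gets a slot of its own.
 */
struct per_slot_frame {
	union { struct a a; struct b b; } slot0;
	struct c slot1;
};

size_t ideal_frame_size(void)
{
	return sizeof(union ideal_frame);
}

size_t per_slot_frame_size(void)
{
	return sizeof(struct per_slot_frame);
}

/* Bytes lost by sharing per slot instead of per set of locals. */
size_t wasted_bytes(void)
{
	assert(per_slot_frame_size() >= ideal_frame_size());
	return per_slot_frame_size() - ideal_frame_size();
}
```

In this model the ideal frame comes out as sizeof(struct b) +
sizeof(struct c), while per-slot sharing costs sizeof(struct a) +
sizeof(struct c), which matches the "struct a" + "struct b/c" pattern
noted above. The real frame also holds spills and alignment padding, so
the subl numbers will not match these sizes exactly.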
		Linus

---
struct a {
	int a;
	unsigned long array[200];
};

struct b {
	int b;
	unsigned long array[100];
};

struct c {
	int c;
	unsigned long array[100];
};

extern int fn3(int, void *);
extern int fn4(int, void *);

static inline __attribute__ ((always_inline))
int fn1(int flag)
{
	struct a a;
	return fn3(flag, &a);
}

static inline __attribute__ ((always_inline))
int fn2(int flag)
{
	struct b b;
	struct c c;
	return fn4(flag, &b) + fn4(flag, &c);
}

int fn(int flag)
{
	fn1(flag);
	if (flag & 1)
		return 0;
	return fn2(flag);
}