* Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:

> Can you post your .config for the test?
> If you have CONFIG_OPTIMIZE_INLINING=y in your -Os test,
> consider re-testing with it turned off.

Yes, I had CONFIG_OPTIMIZE_INLINING=y.

With that turned off, on GCC 4.9.2, I'm seeing:

 fomalhaut:~/linux/linux-____CC_OPTIMIZE_FOR_SIZE=y> size vmlinux.OPTIMIZE_INLINING\=*
     text    data     bss      dec    hex filename
 12150606 2565544 1634304 16350454 f97cf6 vmlinux.OPTIMIZE_INLINING=y
 12354814 2572520 1634304 16561638 fcb5e6 vmlinux.OPTIMIZE_INLINING=n

I.e. forcing the inlining increases the kernel size again, by about 1.7%.

I re-ran the tests on the Intel system, and got these I$ miss rates:

 linux-falign-functions=_64-bytes:                    647,853,942  L1-icache-load-misses   ( +- 0.07% )  (100.00%)
 linux-falign-functions=_16-bytes:                    706,080,917  L1-icache-load-misses   ( +- 0.05% )  (100.00%)
 linux-CC_OPTIMIZE_FOR_SIZE=y+OPTIMIZE_INLINING=y:    921,910,808  L1-icache-load-misses   ( +- 0.05% )  (100.00%)
 linux-CC_OPTIMIZE_FOR_SIZE=y+OPTIMIZE_INLINING=n:    792,395,265  L1-icache-load-misses   ( +- 0.05% )  (100.00%)

So yeah, it got better - but the I$ cache miss rate is still 22.4% higher
than that of the 64-bytes aligned kernel and 12.2% higher than the vanilla
kernel.

Elapsed time had this original OPTIMIZE_FOR_SIZE result:

       8.531418784 seconds time elapsed   ( +- 0.19% )

this now improved to:

       7.686174880 seconds time elapsed   ( +- 0.18% )

but it's still much worse than the 64-byte aligned one:

       7.154816369 seconds time elapsed   ( +- 0.03% )

and the 16-byte aligned one:

       7.333597250 seconds time elapsed   ( +- 0.48% )

> You may be seeing this: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122

Yeah, disabling OPTIMIZE_INLINING made a difference - but it didn't recover
the performance loss, -Os is still 4.8% slower in this workload than the
vanilla kernel.
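
As a sanity check, the percentages quoted above can be recomputed from the raw numbers. The following is only a minimal Python sketch: the helper name `pct_increase` and the dictionary labels are illustrative and do not come from the thread, and "vanilla" is assumed to mean the 16-byte aligned kernel, as the message implies. The results come out at roughly 1.7% (text size), 22.3% and 12.2% (I$ misses) and 4.8% (elapsed time), which matches the quoted figures to within a tenth of a percentage point.

```python
def pct_increase(new, base):
    """Return how much larger `new` is than `base`, in percent."""
    return (new - base) / base * 100.0

# 'text' column of the `size` output (bytes)
text_size = {
    "OPTIMIZE_INLINING=y": 12_150_606,
    "OPTIMIZE_INLINING=n": 12_354_814,
}

# L1-icache-load-misses as reported by perf
icache_misses = {
    "align64": 647_853_942,
    "align16": 706_080_917,        # assumed to be the "vanilla" kernel
    "Os_inlining_y": 921_910_808,
    "Os_inlining_n": 792_395_265,
}

# Elapsed time in seconds
elapsed = {
    "align64": 7.154816369,
    "align16": 7.333597250,        # assumed to be the "vanilla" kernel
    "Os_inlining_y": 8.531418784,
    "Os_inlining_n": 7.686174880,
}

size_growth = pct_increase(text_size["OPTIMIZE_INLINING=n"],
                           text_size["OPTIMIZE_INLINING=y"])
miss_vs_align64 = pct_increase(icache_misses["Os_inlining_n"],
                               icache_misses["align64"])
miss_vs_vanilla = pct_increase(icache_misses["Os_inlining_n"],
                               icache_misses["align16"])
slowdown_vs_vanilla = pct_increase(elapsed["Os_inlining_n"],
                                   elapsed["align16"])

if __name__ == "__main__":
    print(f"text size growth with inlining forced: {size_growth:.1f}%")
    print(f"I$ misses vs 64-byte aligned:          {miss_vs_align64:.1f}%")
    print(f"I$ misses vs vanilla (16-byte):        {miss_vs_vanilla:.1f}%")
    print(f"elapsed time vs vanilla (16-byte):     {slowdown_vs_vanilla:.1f}%")
```

Each comparison uses the kernel being compared against as the denominator, so "X% higher than the vanilla kernel" divides by the vanilla figure.
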
Thanks,

	Ingo