On Thu, Feb 22, 2024 at 5:07 AM Masahiro Yamada <masahiroy@xxxxxxxxxx> wrote:
> On Thu, Feb 22, 2024 at 5:27 AM Jann Horn <jannh@xxxxxxxxxx> wrote:
> >
> > Currently, kallsyms builds a big assembly file (~19M with a normal
> > kernel config), and then the assembler has to turn that big assembly
> > file back into binary data, which takes around a second per kallsyms
> > invocation. (Normally there are two kallsyms invocations per build.)
> >
> > It is much faster to instead directly output binary data, which can
> > be imported in an assembly file using ".incbin". This is also the
> > approach taken by arch/x86/boot/compressed/mkpiggy.c.
>
> Yes, that is a sensible case because it just wraps the binary
> without any modification.
>
> > So this patch switches kallsyms to that approach.
> >
> > A complication with this is that the endianness of numbers between
> > host and target might not match (for example, when cross-compiling);
> > and there seems to be no kconfig symbol that tells us what endianness
> > the target has.
>
> CONFIG_CPU_BIG_ENDIAN is it.
>
> You could do this:
>
> if is_enabled CONFIG_CPU_BIG_ENDIAN; then
>         kallsymopt="${kallsymopt} --big-endian"
> fi
>
> if is_enabled CONFIG_64BIT; then
>         kallsymopt="${kallsymopt} --64bit"
> fi

Aah, nice, thanks, I searched for endianness kconfig flags but somehow
missed that one.

Though actually, I think further optimizations might make it necessary
to directly operate on ELF files anyway, in which case it would
probably be easier to keep using the ELF header...

> > So pass the path to the intermediate vmlinux ELF file to the kallsyms
> > tool, and let it parse the ELF header to figure out the target's
> > endianness.
> >
> > I have verified that running kallsyms without these changes and
> > kallsyms with these changes on the same input System.map results
> > in identical object files.
> >
> > This change reduces the time for an incremental kernel rebuild
> > (touch fs/ioctl.c, then re-run make) from 27.7s to 24.1s (medians
> > over 16 runs each) on my machine - saving around 3.6 seconds.
>
> This reverts bea5b74504742f1b51b815bcaf9a70bddbc49ce3
>
> Somebody might struggle with debugging again, but I am not sure.
>
> Arnd?
>
> If the effort were "I invented a way to do kallsyms in
> one pass instead of three", it would be so much more attractive.

Actually, I was chatting with someone about this yesterday, and I
think I have an idea on how to get rid of two link steps...
I might try out some stuff and then come back with another version of
this series afterwards.

> I am not so sure if this grain of the optimization is exciting,
> but I confirmed that a few seconds were saved for the defconfig.
>
> I am neutral about this.
>
> For the debugging purpose, perhaps we can add --debug option
> in order to leave the possibility for
> outputting the full assembly as comments.

Hm, maybe... though that also involves a lot of duplicate code...
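
For illustration only, the ELF-header check discussed above could be
sketched in shell roughly as follows. This is not what the patch does
(the patch parses the header inside the kallsyms tool itself). The
helper names are hypothetical, and --big-endian / --64bit are merely
the options proposed in the quoted snippet, not existing kallsyms
flags. The sketch relies only on the standard e_ident layout: byte 4
is EI_CLASS (1 = 32-bit, 2 = 64-bit) and byte 5 is EI_DATA
(1 = little-endian, 2 = big-endian).

```shell
#!/bin/sh
# Hypothetical helpers (not from the patch): derive target properties
# from the e_ident bytes at the start of an ELF file.

# Print the unsigned decimal value of the single byte at offset $2 of file $1.
elf_ident_byte() {
	od -An -tu1 -j "$2" -N 1 "$1" | tr -d '[:space:]'
}

# EI_DATA lives at offset 5: 1 = little-endian, 2 = big-endian.
elf_is_big_endian() {
	[ "$(elf_ident_byte "$1" 5)" = "2" ]
}

# EI_CLASS lives at offset 4: 1 = 32-bit, 2 = 64-bit.
elf_is_64bit() {
	[ "$(elf_ident_byte "$1" 4)" = "2" ]
}

# Build an option string from the ELF header of file $1.
# The option names are the ones suggested in the thread (assumption).
kallsyms_opts_from_elf() {
	opts=""
	if elf_is_big_endian "$1"; then
		opts="${opts} --big-endian"
	fi
	if elf_is_64bit "$1"; then
		opts="${opts} --64bit"
	fi
	# Strip the leading space left by the concatenation above.
	printf '%s\n' "${opts# }"
}
```

Something like kallsymopt="$(kallsyms_opts_from_elf <intermediate vmlinux>)"
would then work without consulting kconfig at all, at the cost of
reading the ELF file in one more place.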