On 03/01/19 11:08, Joel Fernandes (Google) wrote:
> Introduce in-kernel headers and other artifacts which are made available
> as an archive through proc (/proc/kheaders.tar.xz file). This archive makes
> it possible to build kernel modules, run eBPF programs, and other
> tracing programs that need to extend the kernel for tracing purposes
> without any dependency on the file system having headers and build
> artifacts.
>
> On Android and embedded systems, it is common to switch kernels but not
> have kernel headers available on the file system. Raw kernel headers
> also cannot be copied into the filesystem like they can be on other
> distros, due to licensing and other issues. There's no linux-headers
> package on Android. Further once a different kernel is booted, any
> headers stored on the file system will no longer be useful. By storing
> the headers as a compressed archive within the kernel, we can avoid these
> issues that have been a hindrance for a long time.
>
> The feature is also buildable as a module just in case the user desires
> it not being part of the kernel image. This makes it possible to load
> and unload the headers on demand. A tracing program, or a kernel module
> builder can load the module, do its operations, and then unload the
> module to save kernel memory. The total memory needed is 3.8MB.
>
> The code to read the headers is based on /proc/config.gz code and uses
> the same technique to embed the headers.
>
> To build a module, the below steps have been tested on an x86 machine:
> modprobe kheaders
> rm -rf $HOME/headers
> mkdir -p $HOME/headers
> tar -xvf /proc/kheaders.tar.xz -C $HOME/headers >/dev/null
> cd my-kernel-module
> make -C $HOME/headers M=$(pwd) modules
> rmmod kheaders
>
> Additional notes:
> (1) external modules must be built on the same arch as the host that
> built vmlinux. This can be done either in a qemu emulated chroot on the
> target, or natively. This is due to host arch dependency of kernel
> scripts.
>
> (2)
> A limitation of module building with this is, since Module.symvers is
> not available in the archive due to a cyclic dependency with building of
> the archive into the kernel or module binaries, the modules built using
> the archive will not contain symbol versioning (modversion). This is
> usually not an issue since the idea of this patch is to build a kernel
> module on the fly and load it into the same kernel. An appropriate
> warning is already printed by the kernel to alert the user of modules
> not having modversions when built using the archive. For building with
> modversions, the user can use traditional header packages. For our
> tracing usecases, we build modules on the fly with this so it is not a
> concern.
>
> (3) I have left IKHD_ST and IKHD_ED markers as is to facilitate
> future patches that would extract the headers from a kernel or module
> image.
>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>

I could get the headers using this patch with both the built-in and module
options. You can add my

Tested-and-reviewed-by: Qais Yousef <qais.yousef@xxxxxxx>

I am not familiar with running kselftests, so I didn't get a chance to try the
next patch.

Thanks

--
Qais Yousef

> ---
>
> Changes since v3:
> - Blank tar was being generated because of a one line I
>   forgot to push. It is updated now.
> - Added module.lds since arm64 needs it to build modules.
>
> Changes since v2:
> (Thanks to Masahiro Yamada for several excellent suggestions)
> - Added support for out of tree builds.
> - Added incremental build support bringing down build time of
>   incremental builds from 50 seconds to 5 seconds.
> - Fixed various small nits / cleanups.
> - clean ups to kheaders.c pointed by Alexey Dobriyan.
> - Fixed MODULE_LICENSE in test module and kheaders.c
> - Dropped Module.symvers from archive due to circular dependency.
>
> Changes since v1:
> - removed IKH_EXTRA variable, not needed (Masahiro Yamada)
> - small fix ups to selftest
> - added target to main Makefile etc
> - added MODULE_LICENSE to test module
> - made selftest more quiet
>
> Changes since RFC:
> Both changes bring size down to 3.8MB:
> - use xz for compression
> - strip comments except SPDX lines
> - Call out the module name in Kconfig
> - Also added selftests in second patch to ensure headers are always
>   working.
>
> Other notes:
> By the way I still see this error (without the patch) when doing a clean
> build: Makefile:594: include/config/auto.conf: No such file or directory
>
> It appears to be because of commit 0a16d2e8cb7e ("kbuild: use 'include'
> directive to load auto.conf from top Makefile")
>
>  Documentation/dontdiff    |  1 +
>  init/Kconfig              | 11 ++++++
>  kernel/.gitignore         |  3 ++
>  kernel/Makefile           | 37 +++++++++++++++++++
>  kernel/kheaders.c         | 72 ++++++++++++++++++++++++++++++++++++
>  scripts/gen_ikh_data.sh   | 78 +++++++++++++++++++++++++++++++++++++++
>  scripts/strip-comments.pl |  8 ++++
>  7 files changed, 210 insertions(+)
>  create mode 100644 kernel/kheaders.c
>  create mode 100755 scripts/gen_ikh_data.sh
>  create mode 100755 scripts/strip-comments.pl
>
> diff --git a/Documentation/dontdiff b/Documentation/dontdiff
> index 2228fcc8e29f..05a2319ee2a2 100644
> --- a/Documentation/dontdiff
> +++ b/Documentation/dontdiff
> @@ -151,6 +151,7 @@ int8.c
>  kallsyms
>  kconfig
>  keywords.c
> +kheaders_data.h*
>  ksym.c*
>  ksym.h*
>  kxgettext
> diff --git a/init/Kconfig b/init/Kconfig
> index c9386a365eea..63ff0990ae55 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -563,6 +563,17 @@ config IKCONFIG_PROC
>  	  This option enables access to the kernel configuration file
>  	  through /proc/config.gz.
>
> +config IKHEADERS_PROC
> +	tristate "Enable kernel header artifacts through /proc/kheaders.tar.xz"
> +	select BUILD_BIN2C
> +	depends on PROC_FS
> +	help
> +	  This option enables access to the kernel header and other artifacts that
> +	  are generated during the build process. These can be used to build kernel
> +	  modules, and other in-kernel programs such as those generated by eBPF
> +	  and systemtap tools. If you build the headers as a module, a module
> +	  called kheaders.ko is built which can be loaded to get access to them.
> +
>  config LOG_BUF_SHIFT
>  	int "Kernel log buffer size (16 => 64KB, 17 => 128KB)"
>  	range 12 25
> diff --git a/kernel/.gitignore b/kernel/.gitignore
> index b3097bde4e9c..484018945e93 100644
> --- a/kernel/.gitignore
> +++ b/kernel/.gitignore
> @@ -3,5 +3,8 @@
>  #
>  config_data.h
>  config_data.gz
> +kheaders.md5
> +kheaders_data.h
> +kheaders_data.tar.xz
>  timeconst.h
>  hz.bc
> diff --git a/kernel/Makefile b/kernel/Makefile
> index 6aa7543bcdb2..240685a6b638 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -70,6 +70,7 @@ obj-$(CONFIG_UTS_NS) += utsname.o
>  obj-$(CONFIG_USER_NS) += user_namespace.o
>  obj-$(CONFIG_PID_NS) += pid_namespace.o
>  obj-$(CONFIG_IKCONFIG) += configs.o
> +obj-$(CONFIG_IKHEADERS_PROC) += kheaders.o
>  obj-$(CONFIG_SMP) += stop_machine.o
>  obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
>  obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
> @@ -130,3 +131,39 @@ filechk_ikconfiggz = \
>  targets += config_data.h
>  $(obj)/config_data.h: $(obj)/config_data.gz FORCE
>  	$(call filechk,ikconfiggz)
> +
> +# Build a list of in-kernel headers for building kernel modules
> +ikh_file_list := include/
> +ikh_file_list += arch/$(SRCARCH)/Makefile
> +ikh_file_list += arch/$(SRCARCH)/include/
> +ikh_file_list += arch/$(SRCARCH)/kernel/module.lds
> +ikh_file_list += scripts/
> +ikh_file_list += Makefile
> +
> +# Things we need from the $objtree. "OBJDIR" is for the gen_ikh_data.sh
> +# script to identify that this comes from the $objtree directory
> +ikh_file_list += OBJDIR/scripts/
> +ikh_file_list += OBJDIR/include/
> +ikh_file_list += OBJDIR/arch/$(SRCARCH)/include/
> +ifeq ($(CONFIG_STACK_VALIDATION), y)
> +ikh_file_list += OBJDIR/tools/objtool/objtool
> +endif
> +
> +$(obj)/kheaders.o: $(obj)/kheaders_data.h
> +
> +targets += kheaders_data.tar.xz
> +
> +quiet_cmd_genikh = GEN     $(obj)/kheaders_data.tar.xz
> +cmd_genikh = $(srctree)/scripts/gen_ikh_data.sh $@ $(ikh_file_list)
> +$(obj)/kheaders_data.tar.xz: FORCE
> +	$(call cmd,genikh)
> +
> +filechk_ikheadersxz = \
> +	echo "static const char kernel_headers_data[] __used = KH_MAGIC_START"; \
> +	cat $< | scripts/bin2c; \
> +	echo "KH_MAGIC_END;"
> +
> +targets += kheaders_data.h
> +targets += kheaders.md5
> +$(obj)/kheaders_data.h: $(obj)/kheaders_data.tar.xz FORCE
> +	$(call filechk,ikheadersxz)
> diff --git a/kernel/kheaders.c b/kernel/kheaders.c
> new file mode 100644
> index 000000000000..46a6358301e5
> --- /dev/null
> +++ b/kernel/kheaders.c
> @@ -0,0 +1,72 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * kernel/kheaders.c
> + * Provide headers and artifacts needed to build kernel modules.
> + * (Borrowed code from kernel/configs.c)
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/proc_fs.h>
> +#include <linux/init.h>
> +#include <linux/uaccess.h>
> +
> +/*
> + * Define kernel_headers_data and kernel_headers_data_size, which contains the
> + * compressed kernel headers. The file is first compressed with xz and then
> + * bounded by two eight byte magic numbers to allow extraction from a binary
> + * kernel image:
> + *
> + *   IKHD_ST
> + *   <image>
> + *   IKHD_ED
> + */
> +#define KH_MAGIC_START	"IKHD_ST"
> +#define KH_MAGIC_END	"IKHD_ED"
> +#include "kheaders_data.h"
> +
> +
> +#define KH_MAGIC_SIZE (sizeof(KH_MAGIC_START) - 1)
> +#define kernel_headers_data_size \
> +	(sizeof(kernel_headers_data) - 1 - KH_MAGIC_SIZE * 2)
> +
> +static ssize_t
> +ikheaders_read_current(struct file *file, char __user *buf,
> +		       size_t len, loff_t *offset)
> +{
> +	return simple_read_from_buffer(buf, len, offset,
> +				       kernel_headers_data + KH_MAGIC_SIZE,
> +				       kernel_headers_data_size);
> +}
> +
> +static const struct file_operations ikheaders_file_ops = {
> +	.read = ikheaders_read_current,
> +	.llseek = default_llseek,
> +};
> +
> +static int __init ikheaders_init(void)
> +{
> +	struct proc_dir_entry *entry;
> +
> +	/* create the current headers file */
> +	entry = proc_create("kheaders.tar.xz", S_IRUGO, NULL,
> +			    &ikheaders_file_ops);
> +	if (!entry)
> +		return -ENOMEM;
> +
> +	proc_set_size(entry, kernel_headers_data_size);
> +
> +	return 0;
> +}
> +
> +static void __exit ikheaders_cleanup(void)
> +{
> +	remove_proc_entry("kheaders.tar.xz", NULL);
> +}
> +
> +module_init(ikheaders_init);
> +module_exit(ikheaders_cleanup);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Joel Fernandes");
> +MODULE_DESCRIPTION("Echo the kernel header artifacts used to build the kernel");
> diff --git a/scripts/gen_ikh_data.sh b/scripts/gen_ikh_data.sh
> new file mode 100755
> index 000000000000..1fa5628fcc30
> --- /dev/null
> +++ b/scripts/gen_ikh_data.sh
> @@ -0,0 +1,78 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +spath="$(dirname "$(readlink -f "$0")")"
> +kroot="$spath/.."
> +outdir="$(pwd)"
> +tarfile=$1
> +cpio_dir=$outdir/$tarfile.tmp
> +
> +file_list=${@:2}
> +
> +src_file_list=""
> +for f in $file_list; do
> +	src_file_list="$src_file_list $(echo $f | grep -v OBJDIR)"
> +done
> +
> +obj_file_list=""
> +for f in $file_list; do
> +	f=$(echo $f | grep OBJDIR | sed -e 's/OBJDIR\///g')
> +	obj_file_list="$obj_file_list $f";
> +done
> +
> +# Support incremental builds by skipping archive generation
> +# if timestamps of files being archived are not changed.
> +
> +# This block is useful for debugging the incremental builds.
> +# Uncomment it for debugging.
> +# iter=1
> +# if [ ! -f /tmp/iter ]; then echo 1 > /tmp/iter;
> +# else; iter=$(($(cat /tmp/iter) + 1)); fi
> +# find $src_file_list -type f | xargs ls -lR > /tmp/src-ls-$iter
> +# find $obj_file_list -type f | xargs ls -lR > /tmp/obj-ls-$iter
> +
> +# modules.order and include/generated/compile.h are ignored because these are
> +# touched even when none of the source files changed. This causes pointless
> +# regeneration, so let us ignore them for md5 calculation.
> +pushd $kroot > /dev/null
> +src_files_md5="$(find $src_file_list -type f ! -name modules.order |
> +		grep -v "include/generated/compile.h" |
> +		xargs ls -lR | md5sum | cut -d ' ' -f1)"
> +popd > /dev/null
> +obj_files_md5="$(find $obj_file_list -type f ! -name modules.order |
> +		grep -v "include/generated/compile.h" |
> +		xargs ls -lR | md5sum | cut -d ' ' -f1)"
> +
> +if [ -f $tarfile ]; then tarfile_md5="$(md5sum $tarfile | cut -d ' ' -f1)"; fi
> +if [ -f kernel/kheaders.md5 ] &&
> +	[ "$(cat kernel/kheaders.md5|head -1)" == "$src_files_md5" ] &&
> +	[ "$(cat kernel/kheaders.md5|head -2|tail -1)" == "$obj_files_md5" ] &&
> +	[ "$(cat kernel/kheaders.md5|tail -1)" == "$tarfile_md5" ]; then
> +		exit
> +fi
> +
> +rm -rf $cpio_dir
> +mkdir $cpio_dir
> +
> +pushd $kroot > /dev/null
> +for f in $src_file_list;
> +	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
> +done | cpio --quiet -pd $cpio_dir
> +popd > /dev/null
> +
> +# The second CPIO can complain if files already exist which can
> +# happen with out of tree builds. Just silence CPIO for now.
> +for f in $obj_file_list;
> +	do find "$f" ! -name "*.c" ! -name "*.o" ! -name "*.cmd" ! -name ".*";
> +done | cpio --quiet -pd $cpio_dir >/dev/null 2>&1
> +
> +find $cpio_dir -type f -print0 |
> +	xargs -0 -P8 -n1 -I {} sh -c "$spath/strip-comments.pl {}"
> +
> +tar -Jcf $tarfile -C $cpio_dir/ . > /dev/null
> +
> +echo "$src_files_md5" > kernel/kheaders.md5
> +echo "$obj_files_md5" >> kernel/kheaders.md5
> +echo "$(md5sum $tarfile | cut -d ' ' -f1)" >> kernel/kheaders.md5
> +
> +rm -rf $cpio_dir
> diff --git a/scripts/strip-comments.pl b/scripts/strip-comments.pl
> new file mode 100755
> index 000000000000..f8ada87c5802
> --- /dev/null
> +++ b/scripts/strip-comments.pl
> @@ -0,0 +1,8 @@
> +#!/usr/bin/perl -pi
> +# SPDX-License-Identifier: GPL-2.0
> +
> +# This script removes /**/ comments from a file, unless such comments
> +# contain "SPDX". It is used when building compressed in-kernel headers.
> +
> +BEGIN {undef $/;}
> +s/\/\*((?!SPDX).)*?\*\///smg;
> --
> 2.21.0.352.gf09ad66450-goog
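
On note (3): the patch leaves extraction from a kernel or module image to future work. As a rough illustration only, and not part of this patch, the marker search could be sketched in Python along the following lines. The helper name extract_kheaders and the xz magic check are assumptions made for the example. It also assumes the image is uncompressed (for instance vmlinux or kheaders.ko rather than a compressed boot image) and that IKHD_ST occurs nowhere before the archive. As defined in kheaders.c, each marker is seven bytes long, since KH_MAGIC_SIZE excludes the terminating NUL.

```python
"""Sketch: carve the header archive out of an uncompressed kernel or module image."""

MAGIC_START = b"IKHD_ST"    # KH_MAGIC_START in kernel/kheaders.c
MAGIC_END = b"IKHD_ED"      # KH_MAGIC_END in kernel/kheaders.c
XZ_MAGIC = b"\xfd7zXZ\x00"  # first six bytes of an .xz stream


def extract_kheaders(image: bytes) -> bytes:
    """Return the bytes between the two markers (hypothetical helper)."""
    start = image.find(MAGIC_START)
    if start < 0:
        raise ValueError("IKHD_ST marker not found")
    start += len(MAGIC_START)

    # Search backwards from the end of the image so that a chance match
    # inside the compressed data is not mistaken for the end marker.
    end = image.rfind(MAGIC_END)
    if end < start:
        raise ValueError("IKHD_ED marker not found after IKHD_ST")

    payload = image[start:end]
    # Sanity check (an assumption of this sketch): the archive is xz data.
    if not payload.startswith(XZ_MAGIC):
        raise ValueError("data between the markers is not an xz stream")
    return payload
```

With the image read into memory (for example open("vmlinux", "rb").read()), the returned bytes could be written out as kheaders.tar.xz and unpacked with tar as in the build steps quoted above.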