On Tue, 13 Mar 2018, Thomas Lindroth wrote: > I'm guessing you meant these: > $(wildcard include/config/ksym/pci/user/read/config/byte.h) \ > $(wildcard include/config/ksym/pci/user/read/config/word.h) \ > $(wildcard include/config/ksym/pci/user/read/config/dword.h) \ > $(wildcard include/config/ksym/pci/user/write/config/byte.h) \ > $(wildcard include/config/ksym/pci/user/write/config/word.h) \ > $(wildcard include/config/ksym/pci/user/write/config/dword.h) \ > > They exist but most of the corresponding files like > include/config/ksym/pci/user/read/config/byte.h do not. > I guess that's normal since vfio is built into the kernel and no other > module use those exports. They should get trimmed and include/config/ksym > entries don't exist for trimmed symbols? They could exist if those symbols neededto be exported in a previous build. In fact, what those files do is only holding a timestamp. When scripts/adjust_autoksyms.sh determines that some symbol changes state (newly needed or no longer needed) then the corresponding file is touched to update its timestamp to trigger the rebuild of affected source files whose dependency list contains that timestamp files. What those .o.cmd files contain is the actual dependency list. > I don't have the build directory for that failure any more but I ran some more builds > until I got a build failure like this: > ERROR: "clear_user" [drivers/media/v4l2-core/videodev.ko] undefined! > ERROR: "__clear_user" [arch/x86/kvm/kvm.ko] undefined! > > Those missing exports are part of arch/x86/lib/usercopy_64.c and the relevant ksym parts > of arch/x86/lib/.usercopy_64.o.cmd look like this: > $(wildcard include/config/ksym///clear/user.h) \ > $(wildcard include/config/ksym/clear/user.h) \ > $(wildcard include/config/ksym/arch/wb/cache/pmem.h) \ > $(wildcard include/config/ksym/memcpy/flushcache.h) \ So the dependency list is fine. > That's identical to how it looks when the build succeeds. Therefore the problem isn't there. > $ ls --full-time linux-fail/include/config/ksym///clear/user.h > -rw-r--r-- 1 cocobo cocobo 0 2018-03-13 17:16:53.000000000 +0100 linux-fail/include/config/ksym///clear/user.h > > $ ls --full-time linux-fail/vmlinux.o > -rw-r--r-- 1 cocobo cocobo 545613144 2018-03-13 17:20:57.000000000 +0100 linux-fail/vmlinux.o > > Looks fine. > > > If so then verify that include/generated/autoksyms.h has the > > corresponding defines: > > > > $ grep __put_user_ include/generated/autoksyms.h > > #define __KSYM___put_user_1 1 > > #define __KSYM___put_user_2 1 > > #define __KSYM___put_user_4 1 > > #define __KSYM___put_user_8 1 > > linux-fail/include/generated/autoksyms.h: > #define __KSYM___clear_user 1 > #define __KSYM_clear_user 1 > > That file's content is also identical to how it looks when the build succeeds. Now, what's the date of include/generated/autoksyms.h compared to arch/x86/lib/usercopy_64.o? If include/generated/autoksyms.h is older than arch/x86/lib/usercopy_64.o then the presence of __KSYM_clear_user in the former should have instanciated the corresponding EXPORT_SYMBOL() in the later. If it is the other way around then you should compare the time for arch/x86/lib/usercopy_64.o against include/config/ksym///clear/user.h. If the later is newer than the former then something failed to notice that usercopy_64.o wasn't up to date, in which case the makefile or make itself might need some investigation. If not then we'd have to look in the full build log to figure out if __KSYM_clear_user existed in the previous version of include/generated/autoksyms.h before it was refreshed by adjust_autoksyms.sh. If it was then we're back to if #1 above. If not then the timestamp for include/config/ksym///clear/user.h hasn't been updated as it should. > > Also... is the build always failing because of symbols starting with one > > or more underscores? > > The type of failures I've seen so far are: > > ERROR: "clear_user" [drivers/media/v4l2-core/videodev.ko] undefined! > ERROR: "__clear_user" [arch/x86/kvm/kvm.ko] undefined! > > ERROR: "__fill_rsb" [arch/x86/kvm/kvm-intel.ko] undefined! > > ERROR: "__put_user_2" [net/ipv4/netfilter/ip_tables.ko] undefined! > ERROR: "__put_user_2" [net/ipv4/netfilter/arp_tables.ko] undefined! > ERROR: "__put_user_8" [fs/udf/udf.ko] undefined! > ERROR: "__put_user_4" [fs/udf/udf.ko] undefined! > ERROR: "__put_user_8" [fs/fat/fat.ko] undefined! > ERROR: "__put_user_1" [fs/fat/fat.ko] undefined! > ERROR: "__put_user_4" [fs/fat/fat.ko] undefined! > ERROR: "__put_user_2" [fs/fat/fat.ko] undefined! > ERROR: "__put_user_4" [drivers/net/tap.ko] undefined! > ERROR: "__put_user_2" [drivers/net/tap.ko] undefined! > ERROR: "__put_user_8" [drivers/media/v4l2-core/videodev.ko] undefined! > ERROR: "__put_user_1" [drivers/media/v4l2-core/videodev.ko] undefined! > ERROR: "__put_user_4" [drivers/media/v4l2-core/videodev.ko] undefined! > ERROR: "__put_user_8" [drivers/input/joydev.ko] undefined! > ERROR: "__put_user_1" [drivers/input/joydev.ko] undefined! > ERROR: "__put_user_4" [drivers/input/joydev.ko] undefined! > ERROR: "__fill_rsb" [arch/x86/kvm/kvm-intel.ko] undefined! > > There might have been others but I didn't save every error message. Maybe it is just a coincidence, but there is a lot of underscore prefixed symbols in that list, except for one case. This translates to successive / in the path for the timestamp file. And that one case that doesn't fit the pattern does actually aliases a path that does. I wonder if the filesystem cache could get confused by successive / in paths here, given the non deterministic nature of the build failure you get. Could you please test with the following patch to validate this hypothesis: diff --git a/scripts/adjust_autoksyms.sh b/scripts/adjust_autoksyms.sh index 513da1a4a2..2205114add 100755 --- a/scripts/adjust_autoksyms.sh +++ b/scripts/adjust_autoksyms.sh @@ -79,6 +79,7 @@ changed=$( count=0 sort "$cur_ksyms_file" "$new_ksyms_file" | uniq -u | sed -n 's/^#define __KSYM_\(.*\) 1/\1/p' | tr "A-Z_" "a-z/" | +sed -e 's|//*|/|g' -e 's|^/||' | while read sympath; do if [ -z "$sympath" ]; then continue; fi depfile="include/config/ksym/${sympath}.h" diff --git a/scripts/basic/fixdep.c b/scripts/basic/fixdep.c index 449b68c4c9..57ae789f91 100644 --- a/scripts/basic/fixdep.c +++ b/scripts/basic/fixdep.c @@ -115,7 +115,7 @@ static void usage(void) */ static void print_config(const char *m, int slen) { - int c, i; + int c, prev_c = '/', i; printf(" $(wildcard include/config/"); for (i = 0; i < slen; i++) { @@ -124,7 +124,9 @@ static void print_config(const char *m, int slen) c = '/'; else c = tolower(c); - putchar(c); + if (c != '/' || prev_c != '/') + putchar(c); + prev_c = c; } printf(".h) \\\n"); } That would be very interesting if this patch fixed your build failures. > I patched the kernel with "kbuild: show the list of exported symbols with V=1 for TRIM_UNUSED_KSYMS" > and ran the build with V=1 until I got a failure. Here is the full output: > > success: https://gist.githubusercontent.com/anonymous/86325ce634568f1e1d31c23b5b9df295/raw/a897c77f34499a990505057f0221b4a935db4a16/gistfile1.txt > fail: https://gist.githubusercontent.com/anonymous/279a7925f1a8cd05dcad78dca0720688/raw/9e407b262fdac284b15ae07f2d7a7824051d56cc/gistfile1.txt > > The error was: > ERROR: "clear_user" [drivers/media/v4l2-core/videodev.ko] undefined! > ERROR: "__clear_user" [arch/x86/kvm/kvm.ko] undefined! > > and the relevant part of the diff of the build: > > --- kernel_build_debug_patch_success 2018-03-13 14:47:24.000000000 +0100 > +++ kernel_build_debug_patch_fail 2018-03-13 15:12:33.000000000 +0100 > @@ -45298,13 +45298,6 @@ > KSYM___x86_indirect_thunk_r15 > KSYM___fill_rsb > KSYM___clear_rsb > - gcc -Wp,-MD,arch/x86/lib/.usercopy_64.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -fno-PIE -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup -march=core2 -mno-red-zone -mcmodel=kernel -funit-at-a-time -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_AVX512=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -Wno-frame-address -O2 --param=allow-store-data-races=0 -DCC_HAVE_ASM_GOTO -fplugin=./scripts/gcc-plugins/latent_entropy_plugin.so -fplugin=./scripts/gcc-plugins/structleak_plugin.so -DLATENT_ENTROPY_PLUGIN -fplugin-arg-structleak_plugin-byref-all -DSTRUCTLEAK_PLUGIN -Wframe-larger-than=1024 -fstack-protector-strong -Wno-unused-but-set-variable -Wno-unused-const-variable -fno-var-tracking-assignments -g -pg -mfentry -DCC_USING_FENTRY -fno-inline-functions-called-once -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fno-stack-check -fconserve-stack -Werror=implicit-int -Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -DKBUILD_BASENAME='"usercopy_64"' -DKBUILD_MODNAME='"usercopy_64"' -c -o arch/x86/lib/usercopy_64.o arch/x86/lib/usercopy_64.c > -KSYM___clear_user > -KSYM_clear_user > -KSYM_arch_wb_cache_pmem > -KSYM_memcpy_flushcache > - ./tools/objtool/objtool orc generate --no-fp "arch/x86/lib/usercopy_64.o"; > - if [ "-pg" = "-pg" ]; then if [ arch/x86/lib/usercopy_64.o != "scripts/mod/empty.o" ]; then ./scripts/recordmcount "arch/x86/lib/usercopy_64.o"; fi; fi; > rm -f arch/x86/lib/lib.a; ar rcsTPD arch/x86/lib/lib.a arch/x86/lib/clear_page_64.o arch/x86/lib/cmdline.o arch/x86/lib/cmpxchg16b_emu.o arch/x86/lib/copy_page_64.o arch/x86/lib/copy_user_64.o arch/x86/lib/cpu.o arch/x86/lib/csum-copy_64.o arch/x86/lib/csum-partial_64.o arch/x86/lib/csum-wrappers_64.o arch/x86/lib/delay.o arch/x86/lib/getuser.o arch/x86/lib/inat.o arch/x86/lib/insn.o arch/x86/lib/kaslr.o arch/x86/lib/memcpy_64.o arch/x86/lib/memmove_64.o arch/x86/lib/memset_64.o arch/x86/lib/misc.o arch/x86/lib/putuser.o arch/x86/lib/retpoline.o arch/x86/lib/rwsem.o arch/x86/lib/usercopy.o arch/x86/lib/usercopy_64.o > objdump -h arch/x86/lib/lib.a | sed -ne '/___ksymtab/s/.*+\([^ ]*\).*/EXTERN(\1)/p' >arch/x86/lib/.lib-ksyms.o.lds; rm -f arch/x86/lib/.lib_exports.o; echo | gcc -Wp,-MD,arch/x86/lib/.lib-ksyms.o.d -nostdinc -isystem /usr/lib/gcc/x86_64-pc-linux-gnu/6.4.0/include -I./arch/x86/include -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -D__ASSEMBLY__ -fno-PIE -m64 -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_AVX512=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -DCC_HAVE_ASM_GOTO -Wa,-gdwarf-2 -mfentry -DCC_USING_FENTRY -c -o arch/x86/lib/.lib_exports.o -x assembler -; ld -m elf_x86_64 -r -o arch/x86/lib/lib-ksyms.o -T arch/x86/lib/.lib-ksyms.o.lds arch/x86/lib/.lib_exports.o; rm arch/x86/lib/.lib_exports.o arch/x86/lib/.lib-ksyms.o.lds > rm -f arch/x86/lib/built-in.o; ar rcSTPD arch/x86/lib/built-in.o arch/x86/lib/msr-smp.o arch/x86/lib/cache-smp.o arch/x86/lib/msr.o arch/x86/lib/msr-reg.o arch/x86/lib/msr-reg-export.o arch/x86/lib/hweight.o arch/x86/lib/iomap_copy_64.o arch/x86/lib/lib-ksyms.o ; scripts/mod/modpost arch/x86/lib/built-in.o That's it: usercopy_64.o not being rebuilt as it should, despite this on line 25369 of your log: ++ touch include/config/ksym///clear/user.h Or this on line 28057: ++ touch include/config/ksym/clear/user.h But since all the other failure cases implied symbols with "__" prefixes, I bet that the "clear_user" shows up only because it aliases "__clear_user". Nicolas