Hi Luis,

On Mon, Oct 21, 2024 at 9:33 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> We lack find_symbol() selftests, so add one. This lets us stress test
> improvements easily on find_symbol() or optimizations. It also inherently
> allows us to test the limits of kallsyms on Linux today.
>
> We test a pathological use case for kallsyms by introducing modules
> which are automatically written for us with a larger number of symbols.
> We have 4 kallsyms test modules:
>
>   A: has KALLSYSMS_NUMSYMS exported symbols
>   B: uses one of A's symbols
>   C: adds KALLSYMS_SCALE_FACTOR * KALLSYSMS_NUMSYMS exported
>   D: adds 2 * the symbols than C
>
> By using anything much larger than KALLSYSMS_NUMSYMS as 10,000 and
> KALLSYMS_SCALE_FACTOR of 8 we segfault today. So we're capped at
> around 160000 symbols somehow today. We can inspect that issue at
> our leisure later, but for now the real value to this test is that
> this will easily allow us to test improvements on find_symbol().
>
> We want to enable this test on allyesmodconfig builds so we can't
> use this combination, so instead just use a safe value for now and
> be informative on the Kconfig symbol documentation about where our
> thresholds are for testers. We default then to KALLSYSMS_NUMSYMS of
> just 100 and KALLSYMS_SCALE_FACTOR of 8.
>
> On x86_64 we can use perf, for other architectures we just use 'time'
> and allow for customizations. For example a future enhancement could
> be done for parisc to check for unaligned accesses which triggers a
> special exception handler assembler code inside the kernel.
> The negative impact on performance is so large on parisc that it
> keeps track of its accesses on /proc/cpuinfo as UAH:
>
> IRQ:       CPU0       CPU1
>   3:       1332          0   SuperIO  ttyS0
>   7:    1270013          0   SuperIO  pata_ns87415
>  64:  320023012  320021431       CPU  timer
>  65:   17080507   20624423       CPU  IPI
> UAH:   10948640      58104   Unaligned access handler traps
>
> While at it, this tidies up lib/ test modules to allow us to have
> a new directory for them. The amount of test modules under lib/
> is insane.
>
> This should also hopefully showcase how to start doing basic
> self module writing code, which may be more useful for more complex
> cases later in the future.
>
> Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>

Thanks for your patch, which is now commit 84b4a51fce4ccc66
("selftests: add new kallsyms selftests") upstream.

> @@ -2903,6 +2903,111 @@ config TEST_KMOD
>
>           If unsure, say N.
>
> +config TEST_RUNTIME
> +       bool
> +
> +config TEST_RUNTIME_MODULE
> +       bool
> +
> +config TEST_KALLSYMS
> +       tristate "module kallsyms find_symbol() test"
> +       depends on m
> +       select TEST_RUNTIME
> +       select TEST_RUNTIME_MODULE
> +       select TEST_KALLSYMS_A
> +       select TEST_KALLSYMS_B
> +       select TEST_KALLSYMS_C
> +       select TEST_KALLSYMS_D
> +       help
> +         This allows us to stress test find_symbol() through the kallsyms
> +         used to place symbols on the kernel ELF kallsyms and modules kallsyms
> +         where we place kernel symbols such as exported symbols.
> +
> +         We have four test modules:
> +
> +         A: has KALLSYSMS_NUMSYMS exported symbols
> +         B: uses one of A's symbols
> +         C: adds KALLSYMS_SCALE_FACTOR * KALLSYSMS_NUMSYMS exported
> +         D: adds 2 * the symbols than C
> +
> +         We stress test find_symbol() through two means:
> +
> +         1) Upon load of B it will trigger simplify_symbols() to look for the
> +            one symbol it uses from the module A with tons of symbols. This is an
> +            indirect way for us to have B call resolve_symbol_wait() upon module
> +            load.
> +            This will eventually call find_symbol() which will eventually
> +            try to find the symbols used with find_exported_symbol_in_section().
> +            find_exported_symbol_in_section() uses bsearch() so a binary search
> +            for each symbol. Binary search will at worst be O(log(n)) so the
> +            larger TEST_MODULE_KALLSYSMS the worse the search.
> +
> +         2) The selftests should load C first, before B. Upon B's load towards
> +            the end right before we call module B's init routine we get
> +            complete_formation() called on the module. That will first check
> +            for duplicate symbols with the call to verify_exported_symbols().
> +            That is when we'll force iteration on module C's insane symbol list.
> +            Since it has 10 * KALLSYMS_NUMSYMS it means we can first test
> +            just loading B without C. The amount of time it takes to load C Vs
> +            B can give us an idea of the impact growth of the symbol space and
> +            give us projection. Module A only uses one symbol from B so to allow
> +            this scaling in module C to be proportional, if it used more symbols
> +            then the first test would be doing more and increasing just the
> +            search space would be slightly different. The last module, module D
> +            will just increase the search space by twice the number of symbols in
> +            C so to allow for full projects.
> +
> +         tools/testing/selftests/module/find_symbol.sh
> +
> +         The current defaults will incur a build delay of about 7 minutes
> +         on an x86_64 with only 8 cores. Enable this only if you want to
> +         stress test find_symbol() with thousands of symbols. At the same
> +         time this is also useful to test building modules with thousands of
> +         symbols, and if BTF is enabled this also stress tests adding BTF
> +         information for each module. Currently enabling many more symbols
> +         will segfault the build system.

Despite the warning, I gave this a try on m68k (cross-compiled on i7 ;-).
However, I didn't notice any extraordinary build times.
Also, when running the test manually on ARAnyM, everything runs in the
blink of an eye. I didn't use the script, but ran all commands manually.
I tried insmodding a/b/c/d, c/a/b, a/c/d/b. Is this expected?

Thanks!

$ wc -l lib/tests/module/test_kallsyms_*.c
   233 lib/tests/module/test_kallsyms_a.c
    22 lib/tests/module/test_kallsyms_a.mod.c
    35 lib/tests/module/test_kallsyms_b.c
    21 lib/tests/module/test_kallsyms_b.mod.c
  1633 lib/tests/module/test_kallsyms_c.c
    21 lib/tests/module/test_kallsyms_c.mod.c
  3233 lib/tests/module/test_kallsyms_d.c
    21 lib/tests/module/test_kallsyms_d.mod.c
  5219 total

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
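
For reference, here is a rough back-of-the-envelope sketch of the numbers
discussed above. It assumes the build used the defaults quoted in the commit
message (KALLSYSMS_NUMSYMS of 100, KALLSYMS_SCALE_FACTOR of 8) and that the
lookup behaves like a textbook binary search; the function names are made up
for illustration and are not part of the kernel or the selftest.

```python
# Defaults quoted in the commit message (assumed to be what the build used).
NUMSYMS = 100
SCALE_FACTOR = 8

# Source line counts taken from the wc -l output above.
LINES = {"a": 233, "c": 1633, "d": 3233}


def symbol_counts(numsyms, scale_factor):
    """Exported symbols per module: A has N, C has SCALE * N, D has twice C."""
    c = scale_factor * numsyms
    return {"a": numsyms, "c": c, "d": 2 * c}


def worst_case_probes(n):
    """Worst-case comparisons for a textbook binary search over n sorted entries."""
    # int.bit_length() equals floor(log2(n)) + 1 for n > 0, and 0 for n == 0.
    return n.bit_length()


def lines_per_extra_symbol(small, big, counts, lines):
    """Growth in source lines divided by growth in symbol count between two modules."""
    return (lines[big] - lines[small]) / (counts[big] - counts[small])


counts = symbol_counts(NUMSYMS, SCALE_FACTOR)
for name in ("a", "c", "d"):
    print(name, counts[name], "symbols,",
          worst_case_probes(counts[name]), "probes worst case")
print("a -> c:", lines_per_extra_symbol("a", "c", counts, LINES), "lines per extra symbol")
print("c -> d:", lines_per_extra_symbol("c", "d", counts, LINES), "lines per extra symbol")
```

Under these assumptions the largest module exports 1600 symbols, which a
binary search resolves in at most 11 comparisons, and the generated sources
grow by two lines per additional symbol. That would be consistent with the
line counts above matching the default settings, though it does not by itself
explain the load times.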