Re: [PING PATCH v7] kallsyms: new /proc/kallmodsyms with builtin modules

Nick Alcock <nick.alcock@xxxxxxxxxx> · Thu, 03 Feb 2022 14:11:41 +0000

On 2 Feb 2022, Daniel Xu told this:
> I took a quick look at the v7 cover letter (I'll take a look at
> discussion from previous versions later if I get time) and it's not
> immediately obvious to me why a stable mapping is beneficial.

(FYI: I'm updating these patches for 5.17-rc2 right now, and will be
mailing them out once I've given them a spin. There are a couple of
bugfixes too.)

> Nick, could you elaborate why it's beneficial for dtrace to have a
> stable mapping?

Simply because when a symbol name appears in both a module and the core
kernel, users need to be able to specify which symbol they mean via the
module`symbol syntax (the core kernel is of course called vmlinux`).
There are thousands of duplicates, so this can and does come up.

DTrace goes to some lengths to make D scripts portable not just across
.configs but across kernel releases (the whole translator mechanism
exists to translate kernel data structures into a release-independent
and, as far as possible, operating-system-independent form). It would be
rather silly if we could handle task_struct changing (which we can) but
could not handle someone changing .config so that ext4 was built in
rather than built as ext4.ko, without making them review all their D
scripts for references to ext4. It would be even sillier if a symbol
they were referencing in D scripts when ext4 was built as a module
suddenly became un-referenceable when it was built in, because there are
already symbols with that name in other built-in modules. In
/proc/kallsyms you can't tell such symbols apart, since they differ only
by address; in /proc/kallmodsyms you can at least tell that they came
from different modules even when those modules were built into the core
kernel.
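
To make the ambiguity concrete, here is a minimal sketch in Python (purely
illustrative, not part of the patch series). It parses text in the usual
/proc/kallsyms layout (address, type, name, optional [module]) and reports
every module`symbol pair that maps to more than one address. The sample
addresses and symbol names are invented, and treating a built-in module
annotation exactly like a loadable module's [module] field is an assumption
made for the example, not a description of the real /proc/kallmodsyms
output format.

```python
from collections import defaultdict

# Invented sample in the /proc/kallsyms layout: address, type, name and an
# optional [module]. Two built-in copies of init_once carry no module name.
SAMPLE = """\
ffffffff81000000 T _text
ffffffff81234560 t init_once
ffffffff81345670 t init_once
ffffffffc0001230 t init_once\t[ext4]
ffffffff81456780 T vfs_read
"""

def parse(text):
    """Yield (address, type, name, module); module is 'vmlinux' if absent."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        addr, symtype, name = fields[:3]
        module = fields[3].strip("[]") if len(fields) > 3 else "vmlinux"
        yield int(addr, 16), symtype, name, module

def ambiguous(text):
    """Map each (module, symbol) seen more than once to its addresses."""
    seen = defaultdict(list)
    for addr, _symtype, name, module in parse(text):
        seen[(module, name)].append(addr)
    return {key: addrs for key, addrs in seen.items() if len(addrs) > 1}

if __name__ == "__main__":
    for (module, name), addrs in ambiguous(SAMPLE).items():
        print(f"{module}`{name}: {len(addrs)} candidates")
```

With the sample as given, vmlinux`init_once has two candidates that differ
only by address. If one of them carried a (hypothetical) built-in module
annotation such as [jbd2], each module`symbol pair would be unique again.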

(In fact this isn't even going far enough: in the long term, I'd like to
arrange to have *no duplicates at all* in /proc/kallmodsyms, but that
would mean that clashing symbols in different TUs in the same module
would need some sort of per-translation-unit markup, and I'm not sure
what syntax to use for that yet. It would be very cheap if we used the
same approach we're using here, literally one copy of each TU name and
one pointer for each.)

> For what it's worth, bpftrace uses /proc/kallsyms rather rarely.
> bpftrace relies on perf_event_open()'s config1 parameter to resolve
> kernel symbol name to address for kprobe attachment. /proc/kallsyms is
> mostly used to resolve kaddr() calls in bpftrace scripts.
>
> Kernel symbol size information would be useful, though. bpftrace
> currently uses the vmlinux ELF to acquire that information.

Yeah, that's a perfectly reasonable place to get that from. I'll have to
see if we can do the same thing, since courtesy of /proc/kall(mod)syms
we have access to the symbol index. This would obviate the symbol size
patch in this series, which is the only one with a nontrivial space cost
and the only one I'm unhappy with (it needs 4 bytes/symbol rather than a
few bytes per translation unit full of symbols).

I don't see any way to get the kallmodsyms per-builtin-module information
the same way (also, it seems to me it would be much less convenient than
having it available directly in /proc almost for free).

-- 
NULL && (void)