Re: [POC][RFC][PATCH] build: Make weak functions visible in kallsyms

Steven Rostedt <rostedt@xxxxxxxxxxx> · Fri, 27 Dec 2024 14:19:22 -0500

On Fri, 27 Dec 2024 10:09:40 -0800
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Thu, 26 Dec 2024 at 19:45, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >  
> > >
> > > Btw, does this actually happen when the compiler does the mcount thing for us?  
> >
> > Yes.  
> 
> Ok, that's actually good.
> 
> I'm not really worried about the "unused symbols aren't in kallsyms"
> issue, even if it confuses the mcount logic. THAT confusion is easy to
> deal with by either adding symbol size information (which I think
> would be a good thing in general, although perhaps not worth it).
> 
> Even without the symbol size, the mcount issue can be dealt with by
> just knowing that the mcount location has to be at the very beginning
> of the function and just taking the offset - that we already do have -
> into account.

This is actually what we do today. A check is made against the location of
the fentry/mcount, and if it's too far away from the function entry, it is
considered a weak function. The distance is architecture dependent, as well
as options (adding ENDBR and such needs to be accounted for).

Now we still need to list them in the available_filter_functions, as the
order of what's in mcount_loc is used by the set_filter_function, and if we
"hide" any, it can screw up the accounting, because you can enable
functions via the index into that file. Using an index is an O(1)
operation, where as by name it needs to search all addresses and call
kallsyms for a name look up and then compare to what was passed in. Tooling
uses the index, as it's much faster to enable several functions (which by
name turns into an O(n^2) operation). Enabling a thousand functions by name
can take over a minute to complete, whereas by index takes less than a
second.

For now, we have in available_filter_functions:

  trace_initcall_finish_cb
  initcall_blacklisted
  do_one_initcall
  __ftrace_invalid_address___708
  rootfs_init_fs_context
  wait_for_initramfs
  __ftrace_invalid_address___68
  calibration_delay_done
  calibrate_delay

Where those "__ftrace_invalid_address_*" is detected as a weak function.
This is ugly and a hack, but we still need a place holder for them :-(
I would love to find a better way to handle this.

Now, the .mcount_loc is sorted at build time. If there's a way to find
which addresses are no longer visible, then that code could possibly remove
them from the mcount_loc. Perhaps, it could do a 'nm -S' and check to make
sure that all addresses are within a listed size, and if not, remove it.
That way, we could get rid of those place holders.

> 
> I was more worried that there might also be some much deeper confusion
> with the linker actually garbage collecting the unused weak function
> code away, and now an unused symbol that kallsyms doesn't know about
> wouldn't just have an unexpected mcount pointer to it, but the mcount
> pointer would actually be stale and point to some unrelated code.
> 
> So as long as *that* isn't what is happening, this all seems fairly benign.

For now, the mcount_loc is just an annoyance as we have those ugly place
holders. The real bug is with live kernel patching, where there's a
mismatch in the function size. IIUC, the building of the module for the
live patching finds the size of the function it's patching with 'nm'. And
then before patching, asks kallsyms for the size of the function. Because
kallsyms uses the next function to determine the size, it returns the size
of the function plus the size of the weak function. There's a mismatch and
live kernel patching refuses to patch the function.

By adding the actual size of the function to kallsyms, that would fix the
problem. The size doesn't need to be exported to /proc/kallsyms, as you
mentioned before, all users happen to be in-kernel.

The one downsize of storing the numbers, is that we will need to store a
size for every function, and that can add up.

 $ wc -l /proc/kallsyms 
 208784 /proc/kallsyms

That's 208 thousand numbers to store!

Perhaps this is the one advantage of having a weak placeholder in kallsyms
as well. It may save memory.

-- Steve