Re: [PATCH 2/4] modules: Ensure 64-bit alignment on __ksymtab_* sections

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Luis,

On 12/22/23 06:59, Luis Chamberlain wrote:
On Wed, Nov 22, 2023 at 11:18:12PM +0100, deller@xxxxxxxxxx wrote:
From: Helge Deller <deller@xxxxxx>

On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
(e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
64-bit pointers into the __ksymtab* sections.
Make sure that those sections will be correctly aligned at module link time,
otherwise unaligned memory accesses may happen at runtime.

The ramifications are not explained there.

What do you mean exactly by this?

You keep sending me patches with this and we keep doing a nose dive
on this. It means I have to do more work.
Sorry about that. Since you are mentioned as maintainer for modules I
thought you are the right contact.

Furthermore, this patch is pretty small and trivial.
And I had the impression that people understand that having unaligned
structures is generally a bad idea as it usually always impacts performance
(although the performance penalty in this case isn't critical at all,
as we are not on hot paths).

So as I had suggested with your patch which I merged in
commit 87c482bdfa79 ("modules: Ensure natural alignment for
.altinstructions and __bug_table sections") please clarify the
impact of not merging this patch. Also last time you noticed the
misalignment due to a faulty exception handler, please mention how
you found this out now.

Sure.

And since this is not your first patch on the exact related subject
I'd appreciate if you can show me perf stat results differences between
having and not having this patch merged. Why? Because we talk about
a performance penalthy, but we are not saying how much, and since this
is an ongoing thing, we might as well have a tool belt with ways to
measure such performance impact to bring clarity and value to this
and future related patches.

The __kcrctab* sections store 32-bit entities, so use ALIGN(4) for those.

I've given some thought about how to test this. Sadly perf kallsysms
just opens the /proc/kallsysms file, but that's fine, we need our own
test.

I think a 3 new simple modules selftest would do it and running perf
stat on it. One module, let us call it module A which constructs its own
name space prefix for its exported symbols and has tons of silly symbols
for arbitrary data, whatever. We then have module B which refers to a
few arbitrary symbols from module A, hopefully spread out linearly, so
if module A had 10,000 symbols, we'd have module A refer to a symbol
ever 1,000 symbols. Finally we want a symbol C which has say, 50,000
symbols all of which will not be used at all by the first two modules,
but the selftest will load module C first, prior to calling modprobe B.

We'll stress test this way two calls which use find_symbol():

1) Upon load of B it will trigger simplify_symbols() to look for the
symbol it uses from the module A with tons of symbols. That's an
indirect way for us to call resolve_symbol_wait() from module A without
having to use symbol_get() which want to remove as exported as it is
just a hack which should go away. Our goal is for us to test
resolve_symbol() which will call find_symbol() and that will eventually
look for the symbol on module A with:

   find_exported_symbol_in_section()

That uses bsearch() so a binary search for the symbol and we'd end up
hitting the misalignments here. Binary search will at worst be O(log(n))
and so the only way to aggreviate the search will be to add tons of
symbols to A, and have B use a few of them.

2) When you load B, userspace will at first load A as depmod will inform
userspace A goes before B. Upon B's load towards the end right before
we call module B's init routine we get complete_formation() called on
the module. That will first check for duplicate symbols with the call
to verify_exported_symbols(). That is when we'll force iteration on
module C's insane symbol list.

The selftests just runs

perf stat -e pick-your-poison-for-misalignments tools/testing/selftests/kmod/ksymtab.sh

Where ksymtab.sh is your new script which calls:

modprobe C
modprobe B

I say pick-your-poison-for-misalignments because I am not sure what is
best here.

Thoughts?

I really appreciate your thoughts here!

But given the triviality of this patch, I personally think you are
proposing a huge academic investment in time and resources with no real benefit.
On which architecture would you suggest to test this?
What would be the effective (really new) outcome?
If the performance penalty is zero or close to zero, will that imply
that such a patch isn't useful?
Please also keep in mind that not everyone gets paid to work on the kernel,
so I have neither the time nor the various architectures to test on.

Then, as Masahiro already mentioned, the real solution is
probably to add ".balign" to the various inline assembler parts.
With this the alignment gets recorded in the sh_addralign field
in the object files, and the linker will correctly lay out the executable
even without this patch here.
And, this is exactly what you would get if C-initializers would have
been used instead of inline assembly.

But independend of the correct way to fix it, I think the linker
script ideally should mention all sections it expects with the right
alignments. Just for completeness.

I'll leave it up to you if you will apply this patch.
Currently it's not absolutely needed any longer, but let's think about
the pros/cons of this patch:
Pro:
- It can prevent unnecessary unaligned memory accesses on all arches.
- It provides the linker with correct alignment for the various sections.
- It mimics what you would get if the structs were coded with C-initializers.
- A correct section alignment can address upcoming inline assembly patches
  which may have missed to add the .balign
Cons:
- For this specific patch the worst case scenario is that we add a total of
  up to 7 bytes to kernel image (__ksymtab gets aligned to 8 bytes and the
  following sections are then already aligned).

So, honestly I don't see a real reason why it shouldn't be applied...

Signed-off-by: Helge Deller <deller@xxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx> # v6.0+

That's a stretch without any data, don't you think?

Yes. No need to push such a patch to stable series.

Helge





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux