Hi Luis,
On 12/22/23 21:10, Luis Chamberlain wrote:
On Fri, Dec 22, 2023 at 01:13:26PM +0100, Helge Deller wrote:
On 12/22/23 06:59, Luis Chamberlain wrote:
On Wed, Nov 22, 2023 at 11:18:12PM +0100, deller@xxxxxxxxxx wrote:
On 64-bit architectures without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
(e.g. ppc64, ppc64le, parisc, s390x,...) the __KSYM_REF() macro stores
64-bit pointers into the __ksymtab* sections.
Make sure that those sections will be correctly aligned at module link time,
otherwise unaligned memory accesses may happen at runtime.
...
...
So, honestly I don't see a real reason why it shouldn't be applied...
Like I said, you Cc'd stable as a fix,
I added "Cc: stable@xxxxxxxxxxxxxxx" to the patch itself, so *if* you had
applied the patch, it would later have ended up in the stable kernel series too.
But I did not CC the stable mailing list directly, so my patch was never
sent to that mailing list.
as a maintainer it is my job to
verify how critical this is and ask for more details about how you found
it and evaluate the real impact. Even if it was not a stable fix I tend
to ask this for patches, even if they are trivial.
...
OK, can you extend the patch below with something like:
perf stat --repeat 100 --pre 'modprobe -r b a b c' -- ./tools/testing/selftests/module/find_symbol.sh
And test before and after?
I ran a simple test as-is and the data I get is within noise, and so
I think we need the --repeat 100 thing.
Your selftest code is based on perf.
AFAICS we don't have perf on parisc/hppa, so I can't test your selftest code
on that architecture.
I assume you tested on x86, where the CPU will transparently take care of
unaligned accesses. This is probably why the results are within
the noise.
But on some platforms the CPU raises an exception on unaligned accesses
and jumps into special exception handler assembler code inside the kernel.
This is much more expensive than on x86, which is why on parisc we keep
counters in /proc/interrupts that track how often this exception handler is called:
IRQ:       CPU0       CPU1
  3:       1332          0   SuperIO  ttyS0
  7:    1270013          0   SuperIO  pata_ns87415
 64:  320023012  320021431   CPU      timer
 65:   17080507   20624423   CPU      IPI
UAH:   10948640      58104   Unaligned access handler traps
This "UAH" field could theoretically be used to extend your selftest.
But is it really worth it? The outcome is very much architecture- and
CPU-specific; maybe it's just within the noise, as you measured.
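
For illustration only, here is a minimal sketch of how the "UAH" line could be sampled around a module load. It assumes the counters appear in /proc/interrupts in the layout shown above (label, one count per CPU, then a description); the function name uah_total is hypothetical and not part of any existing selftest.

```shell
# Sum the per-CPU "UAH" counters.
# Reads /proc/interrupts-style text on stdin and prints one total.
uah_total() {
    awk '$1 == "UAH:" {
        total = 0
        # Per-CPU counts follow the label; stop at the first non-numeric field.
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
            total += $i
        printf "%.0f\n", total
    }'
}

# Hypothetical usage on parisc (requires root for modprobe):
#   before=$(uah_total < /proc/interrupts)
#   modprobe test_kallsyms_b
#   after=$(uah_total < /proc/interrupts)
#   echo "unaligned traps during load: $((after - before))"
```

The delta covers every unaligned trap taken on the system during that window, not only those caused by the module loader, so it would need an otherwise idle machine to be meaningful.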
IMHO we should always try to natively align structures, and if we see
we got it wrong in kernel code, we should fix it.
My patches just fix those memory sections where we use inline
assembly (instead of C) and thus failed to provide the correct alignment.
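
One way to inspect this on a built module, sketched under the assumption that binutils' readelf is available: `readelf -SW` lists the section headers with the alignment in the last column, so the __ksymtab* entries can be filtered out. The helper name ksymtab_align is hypothetical.

```shell
# Print "<section> <alignment>" for every __ksymtab* section.
# Reads `readelf -SW` output on stdin; the last column is the alignment.
ksymtab_align() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^__ksymtab/) { print $i, $NF; break }
    }'
}

# Hypothetical usage (the module path is a placeholder):
#   readelf -SW path/to/module.ko | ksymtab_align
```

On a 64-bit architecture without CONFIG_HAVE_ARCH_PREL32_RELOCATIONS, an alignment smaller than 8 for these sections would indicate the problem described above.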
Helge
-----------------------------------------------------------------------------------
before:
sudo ./tools/testing/selftests/module/find_symbol.sh
Performance counter stats for '/sbin/modprobe test_kallsyms_b':
81,956,206 ns duration_time
81,883,000 ns system_time
210 page-faults
0.081956206 seconds time elapsed
0.000000000 seconds user
0.081883000 seconds sys
Performance counter stats for '/sbin/modprobe test_kallsyms_b':
85,960,863 ns duration_time
84,679,000 ns system_time
212 page-faults
0.085960863 seconds time elapsed
0.000000000 seconds user
0.084679000 seconds sys
Performance counter stats for '/sbin/modprobe test_kallsyms_b':
86,484,868 ns duration_time
86,541,000 ns system_time
213 page-faults
0.086484868 seconds time elapsed
0.000000000 seconds user
0.086541000 seconds sys
-----------------------------------------------------------------------------------
After your module alignment fix:
sudo ./tools/testing/selftests/module/find_symbol.sh
Performance counter stats for '/sbin/modprobe test_kallsyms_b':
83,579,980 ns duration_time
83,530,000 ns system_time
212 page-faults
0.083579980 seconds time elapsed
0.000000000 seconds user
0.083530000 seconds sys
Performance counter stats for '/sbin/modprobe test_kallsyms_b':
70,721,786 ns duration_time
69,289,000 ns system_time
211 page-faults
0.070721786 seconds time elapsed
0.000000000 seconds user
0.069289000 seconds sys
Performance counter stats for '/sbin/modprobe test_kallsyms_b':
76,513,219 ns duration_time
76,381,000 ns system_time
214 page-faults
0.076513219 seconds time elapsed
0.000000000 seconds user
0.076381000 seconds sys
-----------------------------------------------------------------------------------
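To put "within noise" into numbers, the three duration_time samples per case quoted above can be summarized with a small helper. This is only a sketch; the function name mean_stddev is hypothetical and it uses nothing beyond tr and awk.

```shell
# Print "mean stddev" (sample standard deviation) of whitespace-separated
# numbers read from stdin; thousands separators are stripped first.
mean_stddev() {
    tr -d ',' | awk '{
        # Welford update: running mean and sum of squared deviations.
        for (i = 1; i <= NF; i++) {
            n++
            d = $i - mean
            mean += d / n
            m2 += d * ($i - mean)
        }
    } END {
        sd = (n > 1) ? sqrt(m2 / (n - 1)) : 0
        printf "%.0f %.0f\n", mean, sd
    }'
}

# duration_time in ns, taken from the perf output above
echo "before: $(echo '81,956,206 85,960,863 86,484,868' | mean_stddev)"
echo "after:  $(echo '83,579,980 70,721,786 76,513,219' | mean_stddev)"
```

For these samples the means are roughly 84.8 ms before and 76.9 ms after, with sample standard deviations of roughly 2.5 ms and 6.4 ms. The gap between the means is about the size of the spread in the "after" runs, so three runs per case cannot settle the question, which is consistent with the request for --repeat 100.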
[perf-based selftest patch from Luis stripped]