Heiko Carstens <hca@xxxxxxxxxxxxx> writes:

> On Fri, Jan 21, 2022 at 10:46:36AM +0100, Sven Schnelle wrote:
>> Hi Yinan,
>>
>> Yinan Liu <yinan@xxxxxxxxxxxxxxxxx> writes:
>>
>> > When the kernel starts, the initialization of ftrace takes
>> > up a portion of the time (approximately 6~8ms) to sort mcount
>> > addresses. We can save this time by moving mcount-sorting to
>> > compile time.
>> >
>> > Signed-off-by: Yinan Liu <yinan@xxxxxxxxxxxxxxxxx>
>> > Reported-by: kernel test robot <lkp@xxxxxxxxx>
>> > Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
>> > ---
>> >  kernel/trace/ftrace.c   |  11 +++-
>> >  scripts/Makefile        |   6 +-
>> >  scripts/link-vmlinux.sh |   6 +-
>> >  scripts/sorttable.c     |   2 +
>> >  scripts/sorttable.h     | 120 +++++++++++++++++++++++++++++++++++++++-
>> >  5 files changed, 137 insertions(+), 8 deletions(-)
>>
>> while i like the idea, this unfortunately breaks ftrace on s390. The
>> reason for that is that the compiler generates relocation entries for
>> all the addresses in __mcount_loc. During boot, the s390 decompressor
>> iterates through all the relocations and overwrites the nicely
>> sorted list between __start_mcount_loc and __stop_mcount_loc with
>> the unsorted list because the relocations entries are not adjusted.
>>
>> Of course we could just disable that option, but that would make us
>> different compared to x86 which i don't like. Adding code to sort the
>> relocation would of course also fix that, but i don't think it is a good
>> idea to rely on the order of relocations.
>>
>> Any thoughts how a fix could look like, and whether that could also be a
>> problem on other architectures?
>
> Sven, thanks for figuring this out. Can you confirm that reverting
> commit 72b3942a173c ("scripts: ftrace - move the sort-processing in
> ftrace_init") fixes the problem?

Yes, reverting this commit fixes it.

> This really should be addressed before rc1 is out, otherwise s390 is
> broken if somebody enables ftrace.
> Where "broken" translates to random crashes as soon as ftrace is
> enabled, which again is nowadays quite common.

I wasn't able to reproduce these crashes on my systems so far. For the
readers here, we're seeing about 10-15 systems crashing every night,
usually in the 00basic/ ftrace testcases. In most of the cases it looks
like register corruption, where some random register is or'd or parts are
overwritten with 0x0004000000000000, sometimes 0x00f4000000000000. I
haven't yet found a commit that might cause this.

/Sven
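
As a side note on the __mcount_loc problem quoted above: the effect can
be modelled in a few lines of Python. This is only a toy sketch, not
kernel code. The function names, the addresses, the load offset and the
(slot, value) relocation layout are all made up for illustration, and it
assumes each relocation entry carries the link-time value of its slot.

```python
# Toy model only: names, addresses and the relocation layout are
# hypothetical and do not match the real s390 decompressor or sorttable.

def sort_table_at_build_time(table):
    """Sort the address table in place, as a build-time sort would."""
    table.sort()


def apply_relocations(table, relocs, load_offset):
    """Write (link-time value + load_offset) into each relocated slot.

    Each relocation is (slot_index, link_time_value) and was recorded
    before the table was sorted, so it still describes the old order.
    """
    for slot, value in relocs:
        table[slot] = value + load_offset


def is_sorted(seq):
    """Return True if seq is in non-decreasing order."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


# Link-time table contents (unsorted), with one relocation per slot.
mcount_loc = [0x3000, 0x1000, 0x2000]
relocs = [(slot, value) for slot, value in enumerate(mcount_loc)]

# The build sorts the table but leaves the relocation entries untouched.
sort_table_at_build_time(mcount_loc)
sorted_before_boot = is_sorted(mcount_loc)

# At boot the relocations are applied and restore the link-time order.
apply_relocations(mcount_loc, relocs, load_offset=0x100000)
sorted_after_boot = is_sorted(mcount_loc)
```

In this model the table is sorted after the build step and unsorted again
once the stale relocations have been applied, which is the situation
described for the range between __start_mcount_loc and __stop_mcount_loc.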