Hi Philipp

On Fri, 24 Mar 2023 at 17:00, Philipp Rudo <prudo@xxxxxxxxxx> wrote:
>
> Hi Ricardo,
>
> On Wed, 22 Mar 2023 20:09:21 +0100
> Ricardo Ribalda <ribalda@xxxxxxxxxxxx> wrote:
>
> > Clang16 links the purgatory text in two sections:
> >
> >   [ 1] .text             PROGBITS         0000000000000000  00000040
> >        00000000000011a1  0000000000000000  AX       0     0     16
> >   [ 2] .rela.text        RELA             0000000000000000  00003498
> >        0000000000000648  0000000000000018   I      24     1     8
> >   ...
> >   [17] .text.hot.        PROGBITS         0000000000000000  00003220
> >        000000000000020b  0000000000000000  AX       0     0     1
> >   [18] .rela.text.hot.   RELA             0000000000000000  00004428
> >        0000000000000078  0000000000000018   I      24    17     8
> >
> > And both of them have their range [sh_addr ... sh_addr+sh_size] on the
> > area pointed by `e_entry`.
> >
> > This causes that image->start is calculated twice, once for .text and
> > another time for .text.hot. The second calculation leaves image->start
> > in a random location.
> >
> > Because of this, the system crashes inmediatly after:
> >
> > kexec_core: Starting new kernel
>
> Great analysis!
>
> > Signed-off-by: Ricardo Ribalda <ribalda@xxxxxxxxxxxx>
> > ---
> > kexec: Fix kexec_file_load for llvm16
> >
> > When upreving llvm I realised that kexec stopped working on my test
> > platform. This patch fixes it.
> >
> > To: Eric Biederman <ebiederm@xxxxxxxxxxxx>
> > Cc: Baoquan He <bhe@xxxxxxxxxx>
> > Cc: Philipp Rudo <prudo@xxxxxxxxxx>
> > Cc: kexec@xxxxxxxxxxxxxxxxxxx
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > ---
> > Changes in v3:
> > - Fix initial value. Thanks Ross!
> > - Link to v2: https://lore.kernel.org/r/20230321-kexec_clang16-v2-0-d10e5d517869@xxxxxxxxxxxx
> >
> > Changes in v2:
> > - Fix if condition. Thanks Steven!.
> > - Update Philipp email. Thanks Baoquan.
> > - Link to v1: https://lore.kernel.org/r/20230321-kexec_clang16-v1-0-a768fc2c7c4d@xxxxxxxxxxxx
> > ---
> >  kernel/kexec_file.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> > index f1a0e4e3fb5c..25a37d8f113a 100644
> > --- a/kernel/kexec_file.c
> > +++ b/kernel/kexec_file.c
> > @@ -901,10 +901,21 @@ static int kexec_purgatory_setup_sechdrs(struct purgatory_info *pi,
> >  		}
> >
> >  		offset = ALIGN(offset, align);
> > +
> > +		/*
> > +		 * Check if the segment contains the entry point, if so,
> > +		 * calculate the value of image->start based on it.
> > +		 * If the compiler has produced more than one .text sections
> > +		 * (Eg: .text.hot), they are generally after the main .text
> > +		 * section, and they shall not be used to calculate
> > +		 * image->start. So do not re-calculate image->start if it
> > +		 * is not set to the initial value.
> > +		 */
> >  		if (sechdrs[i].sh_flags & SHF_EXECINSTR &&
> >  		    pi->ehdr->e_entry >= sechdrs[i].sh_addr &&
> >  		    pi->ehdr->e_entry < (sechdrs[i].sh_addr
> > -					 + sechdrs[i].sh_size)) {
> > +					 + sechdrs[i].sh_size) &&
> > +		    kbuf->image->start == pi->ehdr->e_entry) {
>
> I'm not entirely sure if this is the solution to go with. As you state
> in the comment above this solution assumes that the .text section comes
> before any other .text.* section. But this assumption isn't much
> stronger than the assumption that there is only a single .text section,
> which is used nowadays.
>
> The best solution I can come up with right now is to introduce a linker
> script for the purgatory that simply merges the .text sections into
> one. Similar to what I did for s390 in
> arch/s390/purgatory/purgatory.lds.S (although for a different reason).
> But that would require every architecture to get one. An alternative
> would be to find a way to get rid of the -r option on the LD_FLAGS,
> which IIRC is the reason why both section overlap in the first place.

I tried removing the -r from arch/x86/purgatory/Makefile and that
resulted in:

[ 115.631578] BUG: unable to handle page fault for address: ffff93224d5c8e20
[ 115.631583] #PF: supervisor write access in kernel mode
[ 115.631585] #PF: error_code(0x0002) - not-present page
[ 115.631586] PGD 100000067 P4D 100000067 PUD 1001ed067 PMD 132b58067 PTE 0
[ 115.631589] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 115.631592] CPU: 0 PID: 5291 Comm: kexec-lite Tainted: G U 5.15.103-17399-g852a928df601-dirty #19 cd159e0d6a91f03e06035a0a8eb7fc984a8f3e82
[ 115.631594] Hardware name: Google Crota/Crota, BIOS Google_Crota.14505.288.0 11/08/2022
[ 115.631595] RIP: 0010:memcpy_erms+0x6/0x10
[ 115.631599] Code: 5d 00 eb bd eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 cc cc cc cc 66 90 48 89 f8 48 89 d1 <f3> a4 c3 cc cc cc cc 0f 1f 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
[ 115.631601] RSP: 0018:ffff93224f65fe50 EFLAGS: 00010246
[ 115.631602] RAX: ffff93224d5c8e20 RBX: 00000000ffffffea RCX: 0000000000000100
[ 115.631603] RDX: 0000000000000100 RSI: ffff9322407bd000 RDI: ffff93224d5c8e20
[ 115.631604] RBP: ffff93224f65fe88 R08: 0000000000000000 R09: ffff92133cd3ef08
[ 115.631605] R10: ffff9322407be000 R11: ffffffffa1b4f2e0 R12: 0000000000000000
[ 115.631606] R13: ffff92133cee4c00 R14: 0000000000000100 R15: ffffffffa2b6f14f
[ 115.631607] FS: 000078e8b9dbf7c0(0000) GS:ffff921437800000(0000) knlGS:0000000000000000
[ 115.631609] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 115.631610] CR2: ffff93224d5c8e20 CR3: 000000015be26001 CR4: 0000000000770ef0
[ 115.631611] PKRU: 55555554
[ 115.631612] Call Trace:
[ 115.631614] <TASK>
[ 115.631615] kexec_purgatory_get_set_symbol+0x82/0xd3
[ 115.631619] __se_sys_kexec_file_load+0x523/0x644
[ 115.631621] do_syscall_64+0x58/0xa5
[ 115.631623] entry_SYSCALL_64_after_hwframe+0x61/0xcb

So I did not continue in that direction.

I also tried to find an llvm flag that would avoid splitting .text, but
had no luck either.

I will look into making a linker script for x86. We could combine it
with something like:

 		if (sechdrs[i].sh_flags & SHF_EXECINSTR &&
 		    pi->ehdr->e_entry >= sechdrs[i].sh_addr &&
 		    pi->ehdr->e_entry < (sechdrs[i].sh_addr
-					 + sechdrs[i].sh_size) &&
-		    kbuf->image->start == pi->ehdr->e_entry) {
-			kbuf->image->start -= sechdrs[i].sh_addr;
-			kbuf->image->start += kbuf->mem + offset;
+					 + sechdrs[i].sh_size)) {
+			if (!WARN_ON(kbuf->image->start != pi->ehdr->e_entry)) {
+				kbuf->image->start -= sechdrs[i].sh_addr;
+				kbuf->image->start += kbuf->mem + offset;
+			}
 		}

That way developers get a hint about where to look.

Thanks!

>
> Thanks
> Philipp
>
> >  			kbuf->image->start -= sechdrs[i].sh_addr;
> >  			kbuf->image->start += kbuf->mem + offset;
> >  		}
> >
> > ---
> > base-commit: 17214b70a159c6547df9ae204a6275d983146f6b
> > change-id: 20230321-kexec_clang16-4510c23d129c
> >
> > Best regards,
>

-- 
Ricardo Ribalda

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec