Hi Dave, On 11/4/21 18:57, John David Anglin wrote: > On 2021-11-03 5:08 p.m., Helge Deller wrote: >> On 11/3/21 21:12, John David Anglin wrote: >>> I think the real problem is that neither flush_kernel_vmap_range() or >>> invalidate_kernel_vmap_range() flush the icache. They only operate >>> on the data cache. flush_icache_range will flush both caches. >> Yes. >> But we write the new instructions to a congruently memory are (same >> physical memory like the kernel code), then flush/invalidate the >> D-Cache, and finally flush the I-cache of kernel code memory. >> See last function call of __patch_text_multiple(). >> >> So, logically I think it should work (and it does on PA2.x). > > I still believe it is incorrect to use invalidate_kernel_vmap_range() to flush the writes in > __patch_text_multiple() to memory. Both the PA 1.1 and 2.0 architecture documents state that > it is implementation dependent whether pdc writes back dirty lines to memory at priority 0. > If the writes are not flushed to memory, the patch won't install. > >> Or do you mean to flush the I-Cache of both mappings? > > I reviewed the flush operations in __patch_text_multiple(). I believe the current code is more or > less correct, but not optimal. I believe the final call to flush_icache_range() is unnecessary. We > can also eliminate one range flush in the calls to make sure we don't have any aliases in the cache. > See change attached below. > > The big problem with __patch_text_multiple() is can only patch code that the patch code doesn't depend > on. This line in __patch_text_multiple() will cause a TLB fault on the first execution after a new > patch_map is setup. We do alternative patching to the TLB handler when ALT_COND_NO_SMP is true. I believe the alternative code patching isn't critical. It just patches in NOPs, so even if the newly patched NOPs aren't visible yet (when the TLB handler is executed), it uses the old instructions which shouldn't harm anything. They were executed before the whole time. I do agree, that patching other code paths (with non-NOPs) could be critial. By the way, to narrow down the problem, I do boot those tests here with the "no-alternatives" kernel parameter, which effectively disables the alternatives-patching. > For example, > > .macro ptl_lock spc,ptp,pte,tmp,tmp1,fault > #ifdef CONFIG_TLB_PTLOCK > 98: cmpib,COND(=),n 0,\spc,2f > get_ptl \tmp > 1: LDCW 0(\tmp),\tmp1 > cmpib,COND(=) 0,\tmp1,1b > nop > LDREG 0(\ptp),\pte > bb,<,n \pte,_PAGE_PRESENT_BIT,3f > b \fault > stw \spc,0(\tmp) > 99: ALTERNATIVE(98b, 99b, ALT_COND_NO_SMP, INSN_NOP) > #endif > 2: LDREG 0(\ptp),\pte > bb,>=,n \pte,_PAGE_PRESENT_BIT,\fault > 3: > .endm > > If the region being patched spans a page boundary, we will get a TLB fault with partially patched TLB > code. I suspect something like this is the real issue with the fixmap code. > > Maybe this could be avoided by patching in a "b,n 99f" instruction at 98b. That would avoid patching > in multiple nop instructions. It's already being done that way. See alternative.c: /* * Replace instruction with NOPs? * For long distance insert a branch instruction instead. */ if (replacement == INSN_NOP && len > 1) replacement = 0xe8000002 + (len-2)*8; /* "b,n .+8" */ >>> On 2021-10-31 5:14 p.m., Helge Deller wrote: >>>> On PA1.x machines it's not sufficient to just flush the data and >>>> instruction caches when we have written new instruction codes into the >>>> parallel mapped memory segment, but we really need to invalidate (purge) >>>> the cache too. Otherwise the processor will still execute the old >>>> instructions which are still in the data/instruction cache. >>>> >>>> Signed-off-by: Helge Deller <deller@xxxxxx> >>>> Fixes: 4e87ace902cf ("parisc: add support for patching multiple words") >>>> Cc: stable@xxxxxxxxxxxxxxx # v5.3+ >>>> >>>> diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c >>>> index 80a0ab372802..8cbb7e1d5a2b 100644 >>>> --- a/arch/parisc/kernel/patch.c >>>> +++ b/arch/parisc/kernel/patch.c >>>> @@ -81,7 +81,7 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >>>> * We're crossing a page boundary, so >>>> * need to remap >>>> */ >>>> - flush_kernel_vmap_range((void *)fixmap, >>>> + invalidate_kernel_vmap_range((void *)fixmap, >>>> (p-fixmap) * sizeof(*p)); >>>> if (mapped) >>>> patch_unmap(FIX_TEXT_POKE0, &flags); >>>> @@ -90,9 +90,10 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) >>>> } >>>> } >>>> >>>> - flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >>>> + invalidate_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); >>>> if (mapped) >>>> patch_unmap(FIX_TEXT_POKE0, &flags); >>>> + invalidate_kernel_vmap_range((void *)start, end - start); >>>> flush_icache_range(start, end); >> > > --- > diff --git a/arch/parisc/kernel/patch.c b/arch/parisc/kernel/patch.c > index 80a0ab372802..799795bc4210 100644 > --- a/arch/parisc/kernel/patch.c > +++ b/arch/parisc/kernel/patch.c > @@ -67,8 +67,8 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) > int mapped; > > /* Make sure we don't have any aliases in cache */ > - flush_kernel_vmap_range(addr, len); > flush_icache_range(start, end); > + flush_tlb_kernel_range(start, end); > > p = fixmap = patch_map(addr, FIX_TEXT_POKE0, &flags, &mapped); > > @@ -93,7 +93,6 @@ void __kprobes __patch_text_multiple(void *addr, u32 *insn, unsigned int len) > flush_kernel_vmap_range((void *)fixmap, (p-fixmap) * sizeof(*p)); > if (mapped) > patch_unmap(FIX_TEXT_POKE0, &flags); > - flush_icache_range(start, end); > } > > void __kprobes __patch_text(void *addr, u32 insn) Here is the syslog with your patch on the 715/64: ... Inode-cache hash table entries: 16384 (order: 4, 65536 bytes, linear) mem auto-init: stack:off, heap alloc:off, heap free:off Memory: 148832K/163840K available (8452K kernel code, 1461K rwdata, 2379K rodata, 696K init, 444K bss, 15008K reserved, 0K cma-reserved) ftrace: allocating 26912 entries in 53 pages Backtrace: [<1097d588>] __patch_text+0x20/0x30 [<101c5128>] ftrace_make_nop+0x3c/0x68 [<102c98a8>] ftrace_process_locs.isra.0+0x208/0x2b0 [<10113ba0>] ftrace_init+0xa8/0x154 [<10101284>] start_kernel+0x3d0/0x708 [<10105244>] start_parisc+0xb8/0xec Bad Address (null pointer deref?): Code=15 (Data TLB miss fault) at addr 0effe310 CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-32bit+ #1017 Hardware name: 9000/715 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00000000000011000000001100001110 Not tainted r00-03 000c030e 10c7cda8 1097d474 10c67340 r04-07 10100314 00000000 10c672dc 10c7d328 r08-11 00000000 00000000 10cbe5a8 00000000 r12-15 10c75da8 10b2f800 10d7d5a8 00000001 r16-19 00000000 00000002 00000000 08000240 r20-23 10c67000 00000200 10100500 00000020 r24-27 0efff000 0efff000 0effe310 10b915a8 r28-31 0effe310 00000000 10c673c0 10279a68 sr00-03 00000000 00000000 00000000 00000000 sr04-07 00000000 00000000 00000000 00000000 IASQ: 00000000 00000000 IAOQ: 1097d4e4 1097d480 IIR: 0f9312a8 ISR: 00000000 IOR: 0effe310 CPU: 0 CR30: 10c67000 CR31: f00effff ORIG_R28: 00000000 IAOQ[0]: __patch_text_multiple+0xdc/0x12c IAOQ[1]: __patch_text_multiple+0x78/0x12c RP(r2): __patch_text_multiple+0x6c/0x12c Backtrace: [<1097d588>] __patch_text+0x20/0x30 [<101c5128>] ftrace_make_nop+0x3c/0x68 [<102c98a8>] ftrace_process_locs.isra.0+0x208/0x2b0 [<10113ba0>] ftrace_init+0xa8/0x154 [<10101284>] start_kernel+0x3d0/0x708 [<10105244>] start_parisc+0xb8/0xec Kernel panic - not syncing: Bad Address (null pointer deref?) Helge