Nam Cao <namcaov@xxxxxxxxx> writes: > On Sun, Aug 27, 2023 at 11:04:34AM +0200, Björn Töpel wrote: >> Nam Cao <namcaov@xxxxxxxxx> writes: >> >> > On Sun, Aug 27, 2023 at 10:11:25AM +0200, Björn Töpel wrote: >> >> The default implementation of is_trap_insn() which RISC-V is using calls >> >> is_swbp_insn(), which is doing what your patch does. Your patch does not >> >> address the issue. >> > >> > is_swbp_insn() does this: >> > >> > #ifdef CONFIG_RISCV_ISA_C >> > return (*insn & 0xffff) == UPROBE_SWBP_INSN; >> > #else >> > return *insn == UPROBE_SWBP_INSN; >> > #endif >> > >> > ...so it doesn't even check for 32-bit ebreak if C extension is on. My patch >> > is not the same. >> >> Ah, was too quick. >> >> AFAIU uprobes *always* uses c.ebreak when CONFIG_RISCV_ISA_C is set, and >> ebreak otherwise. That's the reason is_swbp_insn() is implemented like >> that. > > That's what I understand too. > >> If that's not the case, there's a another bug, that your patches >> addresses. > > I think it's a bug regardless. is_trap_insn() is used by uprobes to figure out > if there is an instruction that generates trap exception, not just instructions > that are "SWBP". The reason is because when there is a trap, but uprobes doesn't > see a probe installed here, it needs is_trap_insn() to figure out if the trap > is generated by ebreak from something else, or because the probe is just removed. > In the latter case, uprobes will return back, because probe has already been removed, > so it should be safe to do so. That's why I think the incorrect is_swbp_insn() > would cause a hang, because uprobes incorrectly thinks there is no ebreak there, > so it should be okay to go back, but there actually is. > > So, from my understanding, if uprobes encounter a 32-bit ebreak for any reason, > the kernel would hang. I think your patch is a great addition nonetheless, but I > am guessing that it only masks the problem by preventing uprobes from seeing the > 32-bit ebreak in the specific test, not really solve it. So, if there is a 32-bit > ebreak in userspace, the bug still causes the kernel to hang. > > I am still quite confident of my logic, so I would be very suprised if my fix > doesn't solve the reported hang. Do you mind testing my patch? My potato of a > laptop unfortunately cannot run the test :( Maybe I wasn't clear, sorry for that! I did take the patch for a spin, and it did not solve this particular problem. When we're taking a trap from *kernel*mode, we should never deal with uprobes at all. Have a look at uprobe_pre_sstep_notifier(), this function returns 1, which then means that the trap handler exit premature. The code you're referring to (called from uprobe_notify_resume()), and will never be entered, because we're not exiting the trap to userland. Have a look in kernel/entry/common.c (search for e.g. TIF_UPROBE). Now, for your concern, which I see as a potential different bug. Not at all related to my issue "trap from kernelmode touches uprobe incorrectly"; A "random" ebreak from *userland* is trapped, when uprobes is enabled will set the kernel in a hang. I suggest you construct try to write a simple program to reproduce this! I had a quick look in the uprobe handling code, and AFAIU the was used when installing the uprobe as an additional check, and when searching for an active uprobe. I'm still a bit puzzled how the issue you're describing could trigger. A reproducer would help! Cheers, Björn