On Wed, 10 Mar 2021 17:28:12 -0800
Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

> On Wed, Mar 10, 2021 at 5:19 PM Aili Yao <yaoaili@xxxxxxxxxxxx> wrote:
> >
> > On Mon, 8 Mar 2021 11:00:28 -0800
> > Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> >
> > > > On Mar 8, 2021, at 10:31 AM, Luck, Tony <tony.luck@xxxxxxxxx> wrote:
> > > >
> > > >>
> > > >> Can you point me at that SIGBUS code in a current kernel?
> > > >
> > > > It is in kill_me_maybe(). mce_vaddr is setup when we disassemble whatever get_user()
> > > > or copy from user variant was in use in the kernel when the poison memory was consumed.
> > > >
> > > >         if (p->mce_vaddr != (void __user *)-1l) {
> > > >                 force_sig_mceerr(BUS_MCEERR_AR, p->mce_vaddr, PAGE_SHIFT);
> > >
> > > Hmm. On the one hand, no one has complained yet. On the other hand, hardware that supports this isn’t exactly common.
> > >
> > > We may need some actual ABI design here. We also need to make sure that things like io_uring accesses or, more generally, anything using the use_mm / use_temporary_mm ends up either sending no signal or sending a signal to the right target.
> > >
> > > >
> > > > Would it be any better if we used the BUS_MCEERR_AO code that goes into siginfo?
> > >
> > > Dunno.
> >
> > I have one thought here but don't know if it's proper:
> >
> > Previous patch use force_sig_mceerr to the user process for such a scenario; with this method
> > The SIGBUS can't be ignored as force_sig_mceerr() was designed to.
> >
> > If the user process don't want this signal, will it set signal config to ignore?
> > Maybe we can use a send_sig_mceerr() instead of force_sig_mceerr(), if process want to
> > ignore the SIGBUS, then it will ignore that, or it can also process the SIGBUS?
>
> I don't think the signal blocking mechanism makes sense for this.
> Blocking a signal is for saying that, if another process sends the
> signal (or an async event like ctrl-C), then the process doesn't want
> it.
> Blocking doesn't block synchronous things like faults.
>
> I think we need to at least fix the existing bug before we add more
> signals. AFAICS the MCE_IN_KERNEL_COPYIN code is busted for kernel
> threads.

Got this, thanks!

I read https://man7.org/linux/man-pages/man2/write.2.html, and it seems the
write syscall is not expecting a signal, so maybe a specific error code for this
scenario is enough.

--
Thanks!
Aili Yao
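To illustrate the error-code idea from the caller's side, here is a minimal
user-space sketch. It assumes the kernel would report the failure through
write()'s normal return value and errno; which errno value would be used is not
settled in this thread, so the code just passes back whatever write() reports.
write_all() is a hypothetical helper name, not an existing API.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write the whole buffer and
 * report failure as a negative errno value instead of relying on a signal.
 * The errno for consumed poison is an open question in this discussion,
 * so the caller simply gets whatever write() reported.
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any data was written: retry */
			return -errno;		/* the error code reaches the caller directly */
		}
		if (n == 0)
			return -EIO;		/* defensive: no progress, avoid looping forever */
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

With this shape, a caller that already checks write() for errors needs no
SIGBUS handler for this case; for example, write_all() on an invalid
descriptor returns -EBADF.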