Re: [PATCH v5 3/3] mm,hwpoison: Send SIGBUS with error virutal address

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 21, 2021 at 12:01:56PM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> 
> Now an action required MCE in already hwpoisoned address surely sends a
> SIGBUS to current process, but the SIGBUS doesn't convey error virtual
> address.  That's not optimal for hwpoison-aware applications.
> 
> To fix the issue, make memory_failure() call kill_accessing_process(),
> that does pagetable walk to find the error virtual address.  It could
> find multiple virtual addresses for the same error page, and it seems
> hard to tell which virtual address is correct one.  But that's rare
> and sending incorrect virtual address could be better than no address.
> So let's report the first found virtual address for now.
> 
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> ---
> change log v4 -> v5:
> - switched to first found approach,
> - introduced check_hwpoisoned_pmd_entry() to fix build failure on arch
>   without thp support.
> 
> change log v3 -> v4:
> - refactored hwpoison_pte_range to save indentation,
> - updated patch description
> 
> change log v1 -> v2:
> - initialize local variables in check_hwpoisoned_entry() and
>   hwpoison_pte_range()
> - fix and improve logic to calculate error address offset.
> ---
...
> +static int kill_accessing_process(struct task_struct *p, unsigned long pfn,
> +				  int flags)
> +{
> +	int ret;
> +	struct hwp_walk priv = {
> +		.pfn = pfn,
> +	};
> +	priv.tk.tsk = p;
> +
> +	mmap_read_lock(p->mm);
> +	ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops,
> +			      (void *)&priv);
> +	if (!ret && priv.tk.addr)

Sorry, I found a silly mistake, the walk_page_range() got to return 1 when it
found at least error virtual address since v5, so this if-condition should be
like this.

@@ -691,7 +691,8 @@ static int kill_accessing_process(struct task_struct *p, unsigned long pfn,
        mmap_read_lock(p->mm);
        ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops,
                              (void *)&priv);
-       if (!ret && priv.tk.addr)
+       if (ret == 1 && priv.tk.addr)
                kill_proc(&priv.tk, pfn, flags);
        mmap_read_unlock(p->mm);
        return ret ? -EFAULT : -EHWPOISON;

Andrew, this patch is now in linux-mm, so could you apply this fix onto
mmhwpoison-send-sigbus-with-error-virutal-address.patch ?
Or if it's better to resend a whole patch, please let me know.

Thanks,
Naoya Horiguchi




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux