Re: [PATCH] mm: migrate: fix getting incorrect page mapping during page migration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 15, 2023 at 08:07:52PM +0800, Baolin Wang wrote:
> When running stress-ng testing, we found below kernel crash after a few hours:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> pc : dentry_name+0xd8/0x224
> lr : pointer+0x22c/0x370
> sp : ffff800025f134c0
> ......
> Call trace:
>   dentry_name+0xd8/0x224
>   pointer+0x22c/0x370
>   vsnprintf+0x1ec/0x730
>   vscnprintf+0x2c/0x60
>   vprintk_store+0x70/0x234
>   vprintk_emit+0xe0/0x24c
>   vprintk_default+0x3c/0x44
>   vprintk_func+0x84/0x2d0
>   printk+0x64/0x88
>   __dump_page+0x52c/0x530
>   dump_page+0x14/0x20

[...]

> There are seveval ways to fix this issue:
> (1) Setting the PAGE_MAPPING_ANON flag for target page's ->mapping when saving
> 'anon_vma', but this can confuse PageAnon() for PFN walkers, since the target
> page has not built mappings yet.
> (2) Getting the page lock to call page_mapping() in __dump_page() to avoid crashing
> the system, however, there are still some PFN walkers that call page_mapping()
> without holding the page lock, such as compaction.
> (3) Using target page->private field to save the 'anon_vma' pointer and 2 bits
> page state, just as page->mapping records an anonymous page, which can remove
> the page_mapping() impact for PFN walkers and also seems a simple way.
> 
> So I choose option 3 to fix this issue, and this can also fix other potential
> issues for PFN walkers, such as compaction.

I'm not saying no to this fix, but dump_mapping() is supposed to be
resilient against this.  Is the issue that 'dentry' is NULL, or is it
some field within dentry that is NULL?  eg, would this fix your
case?

+++ b/fs/inode.c
@@ -588,7 +588,7 @@ void dump_mapping(const struct address_space *mapping)
        }

        dentry_ptr = container_of(dentry_first, struct dentry, d_u.d_alias);
-       if (get_kernel_nofault(dentry, dentry_ptr)) {
+       if (get_kernel_nofault(dentry, dentry_ptr) || !dentry) {
                pr_warn("aops:%ps ino:%lx invalid dentry:%px\n",
                                a_ops, ino, dentry_ptr);
                return;

Just to be clear, I think we should fix both the dumping and the migration
code.




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux