Re: [PATCH next] mm/khugepaged: fix conflicting mods to collapse_file()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, Apr 23, 2023 at 1:47 PM Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> Inserting Ivan Orlov's syzbot fix commit 2ce0bdfebc74
> ("mm: khugepaged: fix kernel BUG in hpage_collapse_scan_file()")
> ahead of Jiaqi Yan's and David Stevens's commits
> 12904d953364 ("mm/khugepaged: recover from poisoned file-backed memory")
> cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow")
> ac492b9c70ca ("mm/khugepaged: skip shmem with userfaultfd")
> (all of which restructure collapse_file()) did not work out well.
>
> xfstests generic/086 on huge tmpfs (with accelerated khugepaged) freezes
> (if not on the first attempt, then the 2nd or 3rd) in find_lock_entries()
> while doing drop_caches: the file's xarray seems to have been corrupted,
> with find_get_entry() returning nonsense which makes no progress.
>
> Bisection led to ac492b9c70ca; and diff against earlier working linux-next
> suggested that it's probably down to an errant xas_store(), which does not
> belong with the later changes (and nor does the positioning of warnings).
> The later changes look as if they fix the syzbot issue independently.
>
> Remove most of what's left of 2ce0bdfebc74: just leave one WARN_ON_ONCE
> (xas_error) after the final xas_store() of the multi-index entry.
>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> ---
>
>  mm/khugepaged.c | 23 +----------------------
>  1 file changed, 1 insertion(+), 22 deletions(-)
>
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1941,16 +1941,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
>                                         result = SCAN_FAIL;
>                                         goto xa_locked;
>                                 }
> -                               xas_store(&xas, hpage);
> -                               if (xas_error(&xas)) {
> -                                       /* revert shmem_charge performed
> -                                        * in the previous condition
> -                                        */
> -                                       mapping->nrpages--;
> -                                       shmem_uncharge(mapping->host, 1);
> -                                       result = SCAN_STORE_FAILED;

With this being removed, SCAN_STORE_FAILED should also be removed from
the scan_result enum and trace event definitions.

-David

> -                                       goto xa_locked;
> -                               }
>                                 nr_none++;
>                                 continue;
>                         }
> @@ -2105,13 +2095,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
>                  * Accumulate the pages that are being collapsed.
>                  */
>                 list_add_tail(&page->lru, &pagelist);
> -
> -               /*
> -                * We can't get an ENOMEM here (because the allocation happened
> -                * before) but let's check for errors (XArray implementation
> -                * can be changed in the future)
> -                */
> -               WARN_ON_ONCE(xas_error(&xas));
>                 continue;
>  out_unlock:
>                 unlock_page(page);
> @@ -2134,11 +2117,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
>                 }
>         }
>
> -       /* Here we can't get an ENOMEM (because entries were
> -        * previously allocated) But let's check for errors
> -        * (XArray implementation can be changed in the future)
> -        */
> -       WARN_ON_ONCE(xas_error(&xas));
>  xa_locked:
>         xas_unlock_irq(&xas);
>  xa_unlocked:
> @@ -2259,6 +2237,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
>         /* Join all the small entries into a single multi-index entry. */
>         xas_set_order(&xas, start, HPAGE_PMD_ORDER);
>         xas_store(&xas, hpage);
> +       WARN_ON_ONCE(xas_error(&xas));
>         xas_unlock_irq(&xas);
>
>         /*





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux