Re: [PATCH v9 2/2] mm/khugepaged: recover from poisoned file-backed memory

On Mon, Dec 05, 2022 at 03:40:59PM -0800, Jiaqi Yan wrote:
> Make collapse_file roll back when copying pages fails. More concretely:
> - extract copying operations into a separate loop
> - postpone the updates for nr_none until both scanning and copying
>   succeeded
> - postpone joining small xarray entries until both scanning and copying
>   succeeded
> - postpone the update operations to NR_XXX_THPS until both scanning and
>   copying succeeded
> - for non-SHMEM files, roll back filemap_nr_thps_inc if scanning succeeded
>   but copying failed
> 
> Tested manually:
> 0. Enable khugepaged on system under test. Mount tmpfs at /mnt/ramdisk.
> 1. Start a two-thread application. Each thread allocates a chunk of
>    non-huge memory buffer from /mnt/ramdisk.
> 2. Pick 4 random buffer addresses (2 in each thread) and inject
>    uncorrectable memory errors at the corresponding physical addresses.
> 3. Signal both threads to make their memory buffer collapsible, i.e.
>    calling madvise(MADV_HUGEPAGE).
> 4. Wait and then check kernel log: khugepaged is able to recover from
>    poisoned pages by skipping them.
> 5. Signal both threads to inspect their buffer contents and make sure
>    there is no data corruption.
> 
> Signed-off-by: Jiaqi Yan <jiaqiyan@xxxxxxxxxx>
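
For readers following the thread, the ordering described above (scan, then
copy, then commit, with only the non-SHMEM counter needing an explicit undo)
can be modeled in userspace roughly as below. This is a sketch, not the
kernel code: every name in it (model_page, model_mapping, model_collapse,
the MODEL_* constants and all fields) is hypothetical and only stands in for
the page cache state, nr_none, NR_XXX_THPS and filemap_nr_thps_inc mentioned
in the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, deliberately tiny stand-ins for HPAGE_PMD_NR and PAGE_SIZE. */
#define MODEL_NR_PAGES  8
#define MODEL_PAGE_SIZE 16

struct model_page {
	bool present;   /* false models a hole in the page cache */
	bool poisoned;  /* true models an uncorrectable memory error */
	unsigned char data[MODEL_PAGE_SIZE];
};

struct model_mapping {
	struct model_page pages[MODEL_NR_PAGES];
	unsigned char huge[MODEL_NR_PAGES * MODEL_PAGE_SIZE];
	bool is_shmem;
	bool collapsed;
	long nr_filled_holes;  /* models the accounting driven by nr_none */
	long nr_thps;          /* models the NR_XXX_THPS statistics */
	long filemap_nr_thps;  /* models filemap_nr_thps_inc() bookkeeping */
};

/*
 * Returns true if the collapse was committed. On failure the mapping is
 * left exactly as it was before the call.
 */
bool model_collapse(struct model_mapping *m)
{
	unsigned char staging[MODEL_NR_PAGES * MODEL_PAGE_SIZE];
	long nr_none = 0;
	size_t i;

	assert(m != NULL);
	if (m->collapsed)
		return false;

	/* Phase 1: scan. Only count holes, do not publish anything yet. */
	for (i = 0; i < MODEL_NR_PAGES; i++)
		if (!m->pages[i].present)
			nr_none++;

	/* The one update made before copying, for non-SHMEM files only. */
	if (!m->is_shmem)
		m->filemap_nr_thps++;

	/* Phase 2: copy into a private staging buffer; holes stay zeroed. */
	memset(staging, 0, sizeof(staging));
	for (i = 0; i < MODEL_NR_PAGES; i++) {
		const struct model_page *p = &m->pages[i];

		if (!p->present)
			continue;
		if (p->poisoned) {
			/* Roll back the only state touched so far. */
			if (!m->is_shmem)
				m->filemap_nr_thps--;
			return false;
		}
		memcpy(staging + i * MODEL_PAGE_SIZE, p->data, MODEL_PAGE_SIZE);
	}

	/* Phase 3: commit. All postponed updates happen only here. */
	memcpy(m->huge, staging, sizeof(staging));
	m->nr_filled_holes += nr_none;
	m->nr_thps += 1;
	m->collapsed = true;
	return true;
}
```

Because the counters are only touched in the commit phase, the failure path
has nothing to undo beyond the non-SHMEM increment, which is the property
the postponed updates in the patch description are after.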

Okay, looks sane.

Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>

-- 
  Kiryl Shutsemau / Kirill A. Shutemov