Re: [PATCH v8 1/2] mm/khugepaged: recover from poisoned anonymous memory

On Thu, Dec 1, 2022 at 6:25 PM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
>
>
> On 2022/12/2 7:09, Andrew Morton wrote:
> > On Wed, 30 Nov 2022 16:59:30 -0800 Jiaqi Yan <jiaqiyan@xxxxxxxxxx> wrote:
> >
> >> Make __collapse_huge_page_copy return whether copying anonymous pages
> >> succeeded, and make collapse_huge_page handle the return status.
> >>
> >> Break existing PTE scan loop into two for-loops. The first loop copies
> >> source pages into target huge page, and can fail gracefully when running
> >> into memory errors in source pages. If copying all pages succeeds, the
> >> second loop releases and clears up these normal pages. Otherwise, the
> >> second loop rolls back the page table and page states by:
> >> - re-establishing the original PTEs-to-PMD connection.
> >> - releasing source pages back to their LRU list.
> >>
> >> Tested manually:
> >> 0. Enable khugepaged on system under test.
> >> 1. Start a two-thread application. Each thread allocates a chunk of
> >>     non-huge anonymous memory buffer.
> >> 2. Pick 4 random buffer locations (2 in each thread) and inject
> >>     uncorrectable memory errors at corresponding physical addresses.
> >> 3. Signal both threads to make their memory buffer collapsible, i.e.
> >>     calling madvise(MADV_HUGEPAGE).
> >> 4. Wait and check kernel log: khugepaged is able to recover from poisoned
> >>     pages and skips collapsing them.
> >> 5. Signal both threads to inspect their buffer contents and make sure no
> >>     data corruption.
> > Looks like a nice patchset.  I'd like to give it a run in linux-next
> > but we're at -rc7 and we have no review/ack tags.  So it should be a
> > post-6.2-rc1 thing.
> >
> > I have a quibble.
> >
> >> --- a/include/linux/highmem.h
> >> +++ b/include/linux/highmem.h
> >> @@ -361,6 +361,27 @@ static inline void copy_highpage(struct page *to, struct page *from)
> >>
> >>   #endif
> >>
> >> +/*
> >> + * Machine check exception handled version of copy_highpage. Return number
> >> + * of bytes not copied if there was an exception; otherwise 0 for success.
> >> + * Note handling #MC requires arch opt-in.
> >> + */
> >> +static inline int copy_mc_highpage(struct page *to, struct page *from)
> >> +{
> >> +    char *vfrom, *vto;
> >> +    unsigned long ret;
> >> +
> >> +    vfrom = kmap_local_page(from);
> >> +    vto = kmap_local_page(to);
> >> +    ret = copy_mc_to_kernel(vto, vfrom, PAGE_SIZE);
> >> +    if (ret == 0)
> >> +            kmsan_copy_page_meta(to, from);
> >> +    kunmap_local(vto);
> >> +    kunmap_local(vfrom);
> >> +
> >> +    return ret;
> >> +}
> > Why inlined?  It's large, it's slow, it's called only from
> > khugepaged.c.  A regular out-of-line function which is static to
> > khugepaged.c seems more appropriate?
>
> There is a similar function, copy_mc_user_highpage(). Could we reuse
> it? See commit a873dfe1032a ("mm, hwpoison: try to recover from
> copy-on write faults").
>
>

To Kefeng: As I explained in v7, besides the `to` and `from` pages,
copy_mc_user_highpage also takes `struct vm_area_struct *vma` and
`vaddr`. While it fits __collapse_huge_page_copy, it doesn't fit well
for collapse_file (needed for the 2nd commit). When Shi Yang reviewed
my patches, we agreed that we should take this opportunity to unify
the copying routines in khugepaged.c (for both file-backed and anon
memory), and copy_highpage fits both (alternatively we could use
copy_user_highpage and pass vaddr=0 and vma=NULL, but I don't
like that). So I chose to add an MC-recoverable version of
copy_highpage. Does this make sense to you?
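
To make the (to, from) contract and the two-loop flow concrete, here is
a minimal userspace sketch. It is not kernel code: model_page,
model_copy_mc_page and model_collapse are hypothetical names made up
for illustration, and a "poisoned" flag stands in for a real machine
check. It only mirrors the shape described in the commit message: copy
everything first, then either release the sources or roll back.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define MODEL_NR_PAGES  8

/* Hypothetical stand-in for struct page: a buffer plus a poison flag. */
struct model_page {
	unsigned char data[MODEL_PAGE_SIZE];
	bool poisoned;
};

/*
 * Models the return contract of copy_mc_highpage in the patch:
 * 0 on success, otherwise the number of bytes not copied.
 * Only `to` and `from` are needed; no vma or vaddr.
 */
static int model_copy_mc_page(struct model_page *to,
			      const struct model_page *from)
{
	if (from->poisoned)
		return MODEL_PAGE_SIZE;	/* nothing copied */
	memcpy(to->data, from->data, MODEL_PAGE_SIZE);
	return 0;
}

/*
 * First loop: copy every source page, stopping at the first failure.
 * Second loop: on success mark every source as released; on failure
 * leave every source in place (the rollback case).
 */
static bool model_collapse(struct model_page *huge,
			   const struct model_page *src,
			   bool *src_released, size_t nr)
{
	bool ok = true;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (model_copy_mc_page(&huge[i], &src[i]) > 0) {
			ok = false;
			break;
		}
	}

	for (i = 0; i < nr; i++)
		src_released[i] = ok;

	return ok;
}
```

Because the helper needs nothing beyond the two pages, the same call
would serve a file-backed caller as well as the anonymous one, which is
the point of preferring the copy_highpage shape here.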

To Andrew: I think it is a reasonable "quibble". I will prepare the
update in v9 while waiting for more reviews on v8, if there are any.



