Re: [PATCH v3 00/15] HWPOISON: soft offline rework

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2020-06-30 at 01:08 -0400, Qian Cai wrote:
> On Wed, Jun 24, 2020 at 03:01:22PM +0000, nao.horiguchi@xxxxxxxxx
> wrote:
> > I rebased soft-offline rework patchset [1][2] onto the latest
> > mmotm.  The
> > rebasing required some non-trivial changes to adjust, but mainly
> > that was
> > straightforward.  I confirmed that the reported problem doesn't
> > reproduce on
> > compaction after soft offline.  For more precise description of the
> > problem
> > and the motivation of this patchset, please see [2].
> > 
> > I think that the following two patches in v2 are better to be done
> > with
> > separate work of hard-offline rework, so it's not included in this
> > series.
> > 
> >   - mm,hwpoison: Take pages off the buddy when hard-offlining
> >   - mm/hwpoison-inject: Rip off duplicated checks
> > 
> > These two are not directly related to the reported problem, so they
> > seems
> > not urgent.  And the first one breaks num_poisoned_pages counting
> > in some
> > testcases, and The second patch needs more consideration about
> > commented point.
> > 
> > Any comment/suggestion/help would be appreciated.
> 
> Even after applied the compling fix,
> 
> https://lore.kernel.org/linux-mm/20200628065409.GA546944@u2004/
> 
> madvise(MADV_SOFT_OFFLINE) will fail with EIO with hugetlb where it
> would succeed without this series. Steps:
> 
> # git clone https://github.com/cailca/linux-mm
> # cd linux-mm; make
> # ./random 1 (Need at least two NUMA memory nodes)
>  start: migrate_huge_offline
> - use NUMA nodes 0,4.
> - mmap and free 8388608 bytes hugepages on node 0
> - mmap and free 8388608 bytes hugepages on node 4
> madvise: Input/output error

I think I know why.
It's been a while since I took a look, but I compared the posted
patchset with my newest patchset I had ready and I saw I made some
changes with regard of hugetlb pages.

I will be taking a look, although it might be better to re-post the
patchset instead of adding a fix on top since the changes are a bit
substantial.

Thanks for reporting.

-- 
Oscar Salvador
SUSE L3




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux