On Tue, 2020-06-30 at 01:08 -0400, Qian Cai wrote:
> On Wed, Jun 24, 2020 at 03:01:22PM +0000, nao.horiguchi@xxxxxxxxx wrote:
> > I rebased soft-offline rework patchset [1][2] onto the latest mmotm. The
> > rebasing required some non-trivial changes to adjust, but mainly that was
> > straightforward. I confirmed that the reported problem doesn't reproduce on
> > compaction after soft offline. For more precise description of the problem
> > and the motivation of this patchset, please see [2].
> >
> > I think that the following two patches in v2 are better to be done with
> > separate work of hard-offline rework, so it's not included in this series.
> >
> >   - mm,hwpoison: Take pages off the buddy when hard-offlining
> >   - mm/hwpoison-inject: Rip off duplicated checks
> >
> > These two are not directly related to the reported problem, so they seem
> > not urgent. And the first one breaks num_poisoned_pages counting in some
> > testcases, and the second patch needs more consideration about commented
> > point.
> >
> > Any comment/suggestion/help would be appreciated.
>
> Even after applying the compiling fix,
>
> https://lore.kernel.org/linux-mm/20200628065409.GA546944@u2004/
>
> madvise(MADV_SOFT_OFFLINE) will fail with EIO with hugetlb where it
> would succeed without this series. Steps:
>
> # git clone https://github.com/cailca/linux-mm
> # cd linux-mm; make
> # ./random 1 (Need at least two NUMA memory nodes)
> start: migrate_huge_offline
> - use NUMA nodes 0,4.
> - mmap and free 8388608 bytes hugepages on node 0
> - mmap and free 8388608 bytes hugepages on node 4
> madvise: Input/output error

I think I know why.
It's been a while since I took a look, but I compared the posted
patchset with the newest patchset I had ready, and I saw I had made some
changes with regard to hugetlb pages.

I will be taking a look, although it might be better to re-post the
patchset instead of adding a fix on top, since the changes are a bit
substantial.

Thanks for reporting.

-- 
Oscar Salvador
SUSE L3