Re: + revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch added to mm-hotfixes-unstable branch

On 14. 01. 23, 5:11, Andrew Morton wrote:
The patch titled
      Subject: Revert "mm/compaction: fix set skip in fast_find_migrateblock"
has been added to the -mm mm-hotfixes-unstable branch.  Its filename is
      revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch

This patch will shortly appear at
      https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch

This patch will later appear in the mm-hotfixes-unstable branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
    a) Consider who else should be cc'ed
    b) Prefer to cc a suitable mailing list as well
    c) Ideally: find the original patch on the mailing list and do a
       reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Vlastimil Babka <vbabka@xxxxxxx>
Subject: Revert "mm/compaction: fix set skip in fast_find_migrateblock"
Date: Fri, 13 Jan 2023 18:33:45 +0100

This reverts commit 7efc3b7261030da79001c00d92bc3392fd6c664c.

We have got openSUSE reports (Link 1) for 6.1 kernel with khugepaged
stalling CPU for long periods of time.  Investigation of tracepoint data
shows that compaction is stuck in repeating fast_find_migrateblock() based
migrate page isolation, and then fails to migrate all isolated pages.
Commit 7efc3b726103 ("mm/compaction: fix set skip in
fast_find_migrateblock") was suspected as it was merged in 6.1 and in
theory can indeed remove a termination condition for
fast_find_migrateblock() under certain conditions, as it removes a place
that always marks a scanned pageblock to be skipped on re-scan.  There are
other such places, but those can be bypassed under certain conditions,
which seems to match the tracepoint data.
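
The termination argument above can be illustrated with a small userspace
model.  This is only a sketch and not kernel code: struct toy_block,
toy_fast_find() and toy_compact() are hypothetical names, and the model
assumes that migration of the isolated pages always fails, so that nothing
other than the fast search itself ever marks a block as skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a pageblock; only the skip bit is modelled. */
struct toy_block {
	bool skip;
};

/*
 * Toy model of the fast search: return the index of the first block
 * whose skip bit is clear, or -1 if there is none.  With mark_skip set,
 * the chosen block is marked so that a later search will not pick it again.
 */
static int toy_fast_find(struct toy_block *blocks, size_t n, bool mark_skip)
{
	for (size_t i = 0; i < n; i++) {
		if (blocks[i].skip)
			continue;
		if (mark_skip)
			blocks[i].skip = true;
		return (int)i;
	}
	return -1;
}

/*
 * Repeat the search until it runs out of candidates or max_rounds is
 * reached.  Migration is assumed to fail every time, so no other path
 * sets the skip bit.  Returns the number of successful searches.
 */
static int toy_compact(struct toy_block *blocks, size_t n, bool mark_skip,
		       int max_rounds)
{
	int rounds = 0;

	assert(max_rounds > 0);
	while (rounds < max_rounds) {
		if (toy_fast_find(blocks, n, mark_skip) < 0)
			break;
		rounds++;
	}
	return rounds;
}
```

In this model, three blocks searched with mark_skip set are exhausted after
three rounds, while with mark_skip clear the first block is returned every
time and the loop only ends at the max_rounds cap.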

Testing of the revert also appears to have resolved the issue, so revert
the commit until a more robust solution for the original problem is developed.

It's also likely this will fix qemu stalls with 6.1 kernel reported in
Link 2, but that is not yet confirmed.

Preliminary tests suggest this is also fixed:
Tested-by: Jiri Slaby <jirislaby@xxxxxxxxxx>

Link: https://bugzilla.suse.com/show_bug.cgi?id=1206848
Link: https://lore.kernel.org/kvm/b8017e09-f336-3035-8344-c549086c2340@xxxxxxxxxx/
Link: https://lkml.kernel.org/r/20230113173345.9692-1-vbabka@xxxxxxx
Fixes: 7efc3b726103 ("mm/compaction: fix set skip in fast_find_migrateblock")
Cc: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
Cc: Jiri Slaby <jirislaby@xxxxxxxxxx>
Cc: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Thorsten Leemhuis <regressions@xxxxxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---


--- a/mm/compaction.c~revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock
+++ a/mm/compaction.c
@@ -1839,6 +1839,7 @@ static unsigned long fast_find_migratebl
  					pfn = cc->zone->zone_start_pfn;
  				cc->fast_search_fail = 0;
  				found_block = true;
+				set_pageblock_skip(freepage);
  				break;
  			}
  		}
_

Patches currently in -mm which might be from vbabka@xxxxxxx are

revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch


thanks,
--
js
suse labs



