On 14. 01. 23, 5:11, Andrew Morton wrote:
The patch titled Subject: Revert "mm/compaction: fix set skip in fast_find_migrateblock" has been added to the -mm mm-hotfixes-unstable branch. Its filename is revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Vlastimil Babka <vbabka@xxxxxxx> Subject: Revert "mm/compaction: fix set skip in fast_find_migrateblock" Date: Fri, 13 Jan 2023 18:33:45 +0100 This reverts commit 7efc3b7261030da79001c00d92bc3392fd6c664c. We have got openSUSE reports (Link 1) for 6.1 kernel with khugepaged stalling CPU for long periods of time. Investigation of tracepoint data shows that compaction is stuck in repeating fast_find_migrateblock() based migrate page isolation, and then fails to migrate all isolated pages. Commit 7efc3b726103 ("mm/compaction: fix set skip in fast_find_migrateblock") was suspected as it was merged in 6.1 and in theory can indeed remove a termination condition for fast_find_migrateblock() under certain conditions, as it removes a place that always marks a scanned pageblock from being re-scanned. There are other such places, but those can be skipped under certain conditions, which seems to match the tracepoint data. Testing of revert also appears to have resolved the issue, thus revert the commit until a more robust solution for the original problem is developed.
It's also likely this will fix qemu stalls with 6.1 kernel reported in Link 2, but that is not yet confirmed.
Preliminary tests suggest this is also fixed: Tested-by: Jiri Slaby <jirislaby@xxxxxxxxxx>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1206848 Link: https://lore.kernel.org/kvm/b8017e09-f336-3035-8344-c549086c2340@xxxxxxxxxx/ Link: https://lkml.kernel.org/r/20230113173345.9692-1-vbabka@xxxxxxx Fixes: 7efc3b726103 ("mm/compaction: fix set skip in fast_find_migrateblock") Cc: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> Cc: Jiri Slaby <jirislaby@xxxxxxxxxx> Cc: Maxim Levitsky <mlevitsk@xxxxxxxxxx> Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxxxx> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx> Cc: Thorsten Leemhuis <regressions@xxxxxxxxxxxxx> Cc: <stable@xxxxxxxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- --- a/mm/compaction.c~revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock +++ a/mm/compaction.c @@ -1839,6 +1839,7 @@ static unsigned long fast_find_migratebl pfn = cc->zone->zone_start_pfn; cc->fast_search_fail = 0; found_block = true; + set_pageblock_skip(freepage); break; } } _ Patches currently in -mm which might be from vbabka@xxxxxxx are revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch
thanks, -- js suse labs