The patch titled mm: compaction: avoid potential deadlock for readahead pages and direct compaction has been added to the -mm tree. Its filename is mm-vmscan-reclaim-order-0-and-use-compaction-instead-of-lumpy-reclaim-avoid-potential-deadlock-for-readahead-pages-and-direct-compaction.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/SubmitChecklist when testing your code *** See http://userweb.kernel.org/~akpm/stuff/added-to-mm.txt to find out what to do about this The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/ ------------------------------------------------------ Subject: mm: compaction: avoid potential deadlock for readahead pages and direct compaction From: Mel Gorman <mel@xxxxxxxxx> Hugh Dickins reported that two instances of cp were locking up when running on ppc64 in a memory constrained environment. The deadlock appears to apply to readahead pages. When reading ahead, the pages are added locked to the LRU and queued for IO. The process is also inserting pages into the page cache and so is calling radix_preload and entering the page allocator. When SLUB is used, this can result in direct compaction finding the page that was just added to the LRU but still locked by the current process leading to deadlock. This patch avoids locking pages for migration that might already be locked by the current process. Ideally it would only apply for direct compaction but compaction does not set PF_MEMALLOC so there is no way currently of identifying a process in direct compaction. A process-flag could be added but is likely to be overkill. Signed-off-by: Mel Gorman <mel@xxxxxxxxx> Reported-by: Hugh Dickins <hughd@xxxxxxxxxx> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx> Cc: Andy Whitcroft <apw@xxxxxxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx> Cc: Mel Gorman <mel@xxxxxxxxx> Cc: Rik van Riel <riel@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/migrate.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff -puN mm/migrate.c~mm-vmscan-reclaim-order-0-and-use-compaction-instead-of-lumpy-reclaim-avoid-potential-deadlock-for-readahead-pages-and-direct-compaction mm/migrate.c --- a/mm/migrate.c~mm-vmscan-reclaim-order-0-and-use-compaction-instead-of-lumpy-reclaim-avoid-potential-deadlock-for-readahead-pages-and-direct-compaction +++ a/mm/migrate.c @@ -639,6 +639,19 @@ static int unmap_and_move(new_page_t get if (!trylock_page(page)) { if (!force) goto move_newpage; + + /* + * During page readahead, pages are added locked to the LRU + * and marked up to date when the IO completes. As part of + * readahead, a process can reload the radix tree, enter + * page reclamation, direct compaction and migrate page. + * Hence, during readahead, a process can end up trying to + * lock the same page twice leading to deadlock. Avoid this + * situation. + */ + if (PageMappedToDisk(page) && !PageUptodate(page)) + goto move_newpage; + lock_page(page); } _ Patches currently in -mm which might be from mel@xxxxxxxxx are linux-next.patch mm-page-allocator-adjust-the-per-cpu-counter-threshold-when-memory-is-low.patch mm-vmstat-use-a-single-setter-function-and-callback-for-adjusting-percpu-thresholds.patch mm-vmstat-use-a-single-setter-function-and-callback-for-adjusting-percpu-thresholds-fix.patch mm-vmstat-use-a-single-setter-function-and-callback-for-adjusting-percpu-thresholds-update.patch mm-vmstat-use-a-single-setter-function-and-callback-for-adjusting-percpu-thresholds-fix-set_pgdat_percpu_threshold-dont-use-for_each_online_cpu.patch writeback-io-less-balance_dirty_pages.patch writeback-consolidate-variable-names-in-balance_dirty_pages.patch writeback-per-task-rate-limit-on-balance_dirty_pages.patch writeback-per-task-rate-limit-on-balance_dirty_pages-fix.patch writeback-prevent-duplicate-balance_dirty_pages_ratelimited-calls.patch writeback-account-per-bdi-accumulated-written-pages.patch writeback-bdi-write-bandwidth-estimation.patch writeback-bdi-write-bandwidth-estimation-fix.patch writeback-show-bdi-write-bandwidth-in-debugfs.patch writeback-quit-throttling-when-bdi-dirty-pages-dropped-low.patch writeback-reduce-per-bdi-dirty-threshold-ramp-up-time.patch writeback-make-reasonable-gap-between-the-dirty-background-thresholds.patch writeback-scale-down-max-throttle-bandwidth-on-concurrent-dirtiers.patch writeback-add-trace-event-for-balance_dirty_pages.patch writeback-make-nr_to_write-a-per-file-limit.patch writeback-make-nr_to_write-a-per-file-limit-fix.patch vmscan-factor-out-kswapd-sleeping-logic-from-kswapd.patch mm-compaction-add-trace-events-for-memory-compaction-activity.patch mm-vmscan-convert-lumpy_mode-into-a-bitmask.patch mm-vmscan-reclaim-order-0-and-use-compaction-instead-of-lumpy-reclaim.patch mm-vmscan-reclaim-order-0-and-use-compaction-instead-of-lumpy-reclaim-fix.patch mm-vmscan-reclaim-order-0-and-use-compaction-instead-of-lumpy-reclaim-avoid-potential-deadlock-for-readahead-pages-and-direct-compaction.patch mm-migration-allow-migration-to-operate-asynchronously-and-avoid-synchronous-compaction-in-the-faster-path.patch mm-migration-allow-migration-to-operate-asynchronously-and-avoid-synchronous-compaction-in-the-faster-path-fix.patch mm-migration-cleanup-migrate_pages-api-by-matching-types-for-offlining-and-sync.patch mm-compaction-perform-a-faster-migration-scan-when-migrating-asynchronously.patch mm-vmscan-rename-lumpy_mode-to-reclaim_mode.patch mm-vmscan-rename-lumpy_mode-to-reclaim_mode-fix.patch mm-kswapd-stop-high-order-balancing-when-any-suitable-zone-is-balanced.patch mm-kswapd-keep-kswapd-awake-for-high-order-allocations-until-a-percentage-of-the-node-is-balanced.patch mm-kswapd-use-the-order-that-kswapd-was-reclaiming-at-for-sleeping_prematurely.patch mm-kswapd-reset-kswapd_max_order-and-classzone_idx-after-reading.patch mm-kswapd-treat-zone-all_unreclaimable-in-sleeping_prematurely-similar-to-balance_pgdat.patch mm-kswapd-use-the-classzone-idx-that-kswapd-was-using-for-sleeping_prematurely.patch thp-fix-bad_page-to-show-the-real-reason-the-page-is-bad.patch thp-mm-define-madv_hugepage.patch thp-alter-compound-get_page-put_page.patch thp-update-futex-compound-knowledge.patch thp-clear-compound-mapping.patch thp-add-native_set_pmd_at.patch thp-add-pmd-paravirt-ops.patch thp-no-paravirt-version-of-pmd-ops.patch thp-export-maybe_mkwrite.patch thp-comment-reminder-in-destroy_compound_page.patch thp-config_transparent_hugepage.patch thp-config_transparent_hugepage-fix.patch thp-special-pmd_trans_-functions.patch thp-add-pmd-mangling-generic-functions.patch thp-add-pmd-mangling-generic-functions-fix-pgtableh-build-for-um.patch thp-bail-out-gup_fast-on-splitting-pmd.patch thp-pte-alloc-trans-splitting.patch thp-pte-alloc-trans-splitting-fix.patch thp-clear-page-compound.patch thp-add-pmd_huge_pte-to-mm_struct.patch thp-split_huge_page_mm-vma.patch thp-split_huge_page-paging.patch thp-clear_copy_huge_page.patch thp-_gfp_no_kswapd.patch thp-dont-alloc-harder-for-gfp-nomemalloc-even-if-nowait.patch thp-split_huge_page-anon_vma-ordering-dependency.patch thp-verify-pmd_trans_huge-isnt-leaking.patch thp-add-pagetranscompound.patch thp-skip-transhuge-pages-in-ksm-for-now.patch thp-enable-direct-defrag.patch thp-select-config_compaction-if-transparent_hugepage-enabled.patch thp-transhuge-isolate_migratepages.patch thp-disable-transparent-hugepages-by-default-on-small-systems.patch mm-migration-use-rcu_dereference_protected-when-dereferencing-the-radix-tree-slot-during-file-page-migration.patch mm-migration-use-rcu_dereference_protected-when-dereferencing-the-radix-tree-slot-during-file-page-migration-fix.patch add-debugging-aid-for-memory-initialisation-problems.patch -- To unsubscribe from this list: send the line "unsubscribe mm-commits" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html