[CC Mel and Vlastimil] On Thu 09-01-20 14:56:46, Cong Wang wrote: > We observed kcompactd hung at __lock_page(): > > INFO: task kcompactd0:57 blocked for more than 120 seconds. > Not tainted 4.19.56.x86_64 #1 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > kcompactd0 D 0 57 2 0x80000000 > Call Trace: > ? __schedule+0x236/0x860 > schedule+0x28/0x80 > io_schedule+0x12/0x40 > __lock_page+0xf9/0x120 > ? page_cache_tree_insert+0xb0/0xb0 > ? update_pageblock_skip+0xb0/0xb0 > migrate_pages+0x88c/0xb90 > ? isolate_freepages_block+0x3b0/0x3b0 > compact_zone+0x5f1/0x870 > kcompactd_do_work+0x130/0x2c0 > ? __switch_to_asm+0x35/0x70 > ? __switch_to_asm+0x41/0x70 > ? kcompactd_do_work+0x2c0/0x2c0 > ? kcompactd+0x73/0x180 > kcompactd+0x73/0x180 > ? finish_wait+0x80/0x80 > kthread+0x113/0x130 > ? kthread_create_worker_on_cpu+0x50/0x50 > ret_from_fork+0x35/0x40 > > which faddr2line maps to: > > migrate_pages+0x88c/0xb90: > lock_page at include/linux/pagemap.h:483 > (inlined by) __unmap_and_move at mm/migrate.c:1024 > (inlined by) unmap_and_move at mm/migrate.c:1189 > (inlined by) migrate_pages at mm/migrate.c:1419 > > Sometimes kcompactd eventually got out of this situation, sometimes not. What does this mean exactly? Who is holding the page lock? > I think for memory compaction, it is a best effort to migrate the pages, > so it doesn't have to wait for I/O to complete. It is fine to call > trylock_page() here, which is pretty much similar to > buffer_migrate_lock_buffers(). > > Given MIGRATE_SYNC_LIGHT is used on compaction path, just relax the > check for it. The exact definition of MIGRATE_SYNC_LIGHT is a bit fuzzy but bailing out on the page lock sounds too early to me. So far we have tried to block on everything but the writeback which can take a lot of time. > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Cc: Michal Hocko <mhocko@xxxxxxxx> > Cc: linux-mm@xxxxxxxxx > Signed-off-by: Cong Wang <xiyou.wangcong@xxxxxxxxx> > --- > mm/migrate.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 86873b6f38a7..df60026779d2 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1010,7 +1010,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage, > bool is_lru = !__PageMovable(page); > > if (!trylock_page(page)) { > - if (!force || mode == MIGRATE_ASYNC) > + if (!force || mode == MIGRATE_ASYNC > + || mode == MIGRATE_SYNC_LIGHT) > goto out; > > /* > -- > 2.21.1 > -- Michal Hocko SUSE Labs