On Thu, Dec 29, 2022 at 01:48:42AM -0800, syzbot wrote:
> INFO: task kcompactd1:32 blocked for more than 143 seconds.
>       Not tainted 6.1.0-syzkaller-14594-g72a85e2b0a1e #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kcompactd1      state:D stack:26360 pid:32    ppid:2      flags:0x00004000
> Call Trace:
>  <TASK>
>  context_switch kernel/sched/core.c:5244 [inline]
>  __schedule+0x9d1/0xe40 kernel/sched/core.c:6555
>  schedule+0xcb/0x190 kernel/sched/core.c:6631
>  io_schedule+0x83/0x100 kernel/sched/core.c:8811
>  folio_wait_bit_common+0x8ca/0x1390 mm/filemap.c:1297
>  folio_lock include/linux/pagemap.h:938 [inline]
>  __unmap_and_move+0x835/0x12a0 mm/migrate.c:1040
>  unmap_and_move+0x28f/0xd80 mm/migrate.c:1194
>  migrate_pages+0x50f/0x14d0 mm/migrate.c:1477
>  compact_zone+0x2893/0x37a0 mm/compaction.c:2413
>  proactive_compact_node mm/compaction.c:2665 [inline]
>  kcompactd+0x1b46/0x2750 mm/compaction.c:2975

OK, so kcompactd is trying to compact a zone, has called folio_lock(),
and whoever has the folio locked has had it locked for 143 seconds.
That seems like quite a long time.  Probably it is locked waiting for
I/O.

> NMI backtrace for cpu 1
[...]
>  lock_release+0x81/0x870 kernel/locking/lockdep.c:5679
>  rcu_read_unlock include/linux/rcupdate.h:797 [inline]
>  folio_evictable+0x1df/0x2d0 mm/internal.h:140
>  move_folios_to_lru+0x324/0x25c0 mm/vmscan.c:2413
>  shrink_inactive_list+0x60b/0xca0 mm/vmscan.c:2529
>  shrink_list mm/vmscan.c:2767 [inline]
>  shrink_lruvec+0x449/0xc50 mm/vmscan.c:5951
>  shrink_node_memcgs+0x35c/0x780 mm/vmscan.c:6138
>  shrink_node+0x299/0x1050 mm/vmscan.c:6169
>  shrink_zones+0x4fb/0xc40 mm/vmscan.c:6407
>  do_try_to_free_pages+0x215/0xcd0 mm/vmscan.c:6469
>  try_to_free_pages+0x3e8/0xc60 mm/vmscan.c:6704
>  __perform_reclaim mm/page_alloc.c:4750 [inline]
>  __alloc_pages_direct_reclaim mm/page_alloc.c:4772 [inline]
>  __alloc_pages_slowpath+0xd5c/0x2120 mm/page_alloc.c:5178
>  __alloc_pages+0x3d4/0x560 mm/page_alloc.c:5562
>  folio_alloc+0x1a/0x50 mm/mempolicy.c:2296
>  filemap_alloc_folio+0xca/0x2c0 mm/filemap.c:972
>  page_cache_ra_unbounded+0x212/0x820 mm/readahead.c:248
>  do_sync_mmap_readahead+0x786/0x950 mm/filemap.c:3062
>  filemap_fault+0x38d/0x1060 mm/filemap.c:3154

So dhcpd has taken a page fault, missed in the page cache and called
readahead.  It is presumably partway through the readahead, ie it has
folios locked in the page cache which are not uptodate and on which
I/O hasn't been submitted yet.  It's trying to allocate more pages,
but has fallen into reclaim, where it's trying to shrink the inactive
list without much luck -- for one thing, it's a GFP_NOFS allocation.
So it was probably the one who woke kcompactd.

Should readahead be trying less hard to allocate memory?  It's
already using __GFP_NORETRY.
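For reference, page_cache_ra_unbounded() allocates its folios with
readahead_gfp_mask(), which (going from 6.1's include/linux/pagemap.h)
is just:

    static inline gfp_t readahead_gfp_mask(struct address_space *x)
    {
            return mapping_gfp_mask(x) | __GFP_NORETRY | __GFP_NOWARN;
    }

__GFP_NORETRY still permits a single pass of direct reclaim and
compaction, which is where dhcpd is stuck in the trace above.  Trying
less hard would presumably mean something closer to GFP_NOWAIT once
the readahead batch already has a few folios in flight -- just a
thought, not something I've tested.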
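Coming back to the kcompactd side for a moment: the reason it can
sleep in folio_lock() at all is the sync-mode path in
__unmap_and_move().  From memory, the 6.1 logic in mm/migrate.c is
roughly this (comment paraphrased):

    if (!folio_trylock(src)) {
            if (!force || mode == MIGRATE_ASYNC)
                    goto out;

            /*
             * The in-tree comment here warns that direct compaction
             * must not sleep on the folio lock precisely because
             * readahead adds folios to the LRU locked and only
             * unlocks them once the I/O completes.
             */
            if (current->flags & PF_MEMALLOC)
                    goto out;

            /* kcompactd is not PF_MEMALLOC, so it sleeps here */
            folio_lock(src);
    }

If I'm reading it right, that is exactly the scenario we're looking
at: kcompactd, running proactive compaction in sync mode, sleeps on a
folio that readahead locked and whose I/O hasn't been submitted yet.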