On Tue, Jun 01, 2021 at 09:55:56AM -0700, Hugh Dickins wrote:
> On Wed, 2 Jun 2021, Xu Yu wrote:
>
> > We notice that hung task happens in a corner but practical scenario when
> > CONFIG_PREEMPT_NONE is enabled, as follows.
> >
> > Process 0                       Process 1                       Process 2..Inf
> > split_huge_page_to_list
> >     unmap_page
> >         split_huge_pmd_address
> >                                 __migration_entry_wait(head)
> >                                                                 __migration_entry_wait(tail)
> >     remap_page (roll back)
> >         remove_migration_ptes
> >             rmap_walk_anon
> >                 cond_resched
> >
> > Where __migration_entry_wait(tail) occurs in kernel space, e.g.,
> > copy_to_user, which will immediately fault again without rescheduling,
> > and thus occupy the cpu fully.
> >
> > When there are too many processes performing __migration_entry_wait on
> > tail page, remap_page will never be done after cond_resched.
> >
> > This relaxes __migration_entry_wait on tail page, thus gives remap_page
> > a chance to complete.
> >
> > Signed-off-by: Gang Deng <gavin.dg@xxxxxxxxxxxxxxxxx>
> > Signed-off-by: Xu Yu <xuyu@xxxxxxxxxxxxxxxxx>
>
> Well caught: you're absolutely right that there's a bug there.
> But isn't cond_resched() just papering over the real bug, and
> what it should do is a "page = compound_head(page);" before the
> get_page_unless_zero()?  How does that work out in your testing?

You do realise you're strengthening my case for folios by suggesting
that, don't you?  ;-)

I was going to suggest that it won't make any difference because the
page reference count is frozen, but the freezing happens after the call
to unmap_page(), so it may make a difference.
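
To make the compound_head() point concrete, here is a rough userspace
sketch, not kernel code.  struct toy_page and every toy_* helper are
hypothetical stand-ins invented for illustration.  It assumes what I
believe holds for THP: a tail page's own refcount is zero and
references are counted on the head.  It also models a frozen head as
refcount zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; this is NOT the kernel's layout. */
struct toy_page {
	int refcount;           /* models page->_refcount */
	struct toy_page *head;  /* NULL for a head page, else its head */
};

/* Models compound_head(): map a tail page to its head page. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
	return page->head ? page->head : page;
}

/* Models get_page_unless_zero(): take a reference only if non-zero. */
static bool toy_get_page_unless_zero(struct toy_page *page)
{
	if (page->refcount == 0)
		return false;
	page->refcount++;
	return true;
}

/*
 * Behaviour as described in the report: the waiter tries to pin the
 * tail page itself.  With a zero tail refcount this always fails, so
 * the waiter returns at once and faults again.
 */
static bool toy_wait_try_get_tail(struct toy_page *page)
{
	return toy_get_page_unless_zero(page);
}

/*
 * Hugh's suggestion: resolve to the head first.  This succeeds while
 * the head still holds references, and fails once the head is frozen.
 */
static bool toy_wait_try_get_head(struct toy_page *page)
{
	page = toy_compound_head(page);
	return toy_get_page_unless_zero(page);
}
```

In this model the tail variant can never take a reference, while the
head variant can take one in the window between unmap_page() and the
freeze, which is the window referred to above.  Whether the real
__migration_entry_wait() behaves the same way would still need testing.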