On Mon, 13 Mar 2023 10:08:38 -0700 Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:

> I suspect holding the lru lock when calling isolate_or_dissolve_huge_page was
> not considered.  However, I wonder if this can really happen in practice?
>
> Before the code below, there is this:
>
> 		/*
> 		 * Periodically drop the lock (if held) regardless of its
> 		 * contention, to give chance to IRQs. Abort completely if
> 		 * a fatal signal is pending.
> 		 */
> 		if (!(low_pfn % COMPACT_CLUSTER_MAX)) {
> 			if (locked) {
> 				unlock_page_lruvec_irqrestore(locked, flags);
> 				locked = NULL;
> 			}
> 			...
> 		}
>
> It would seem that the pfn of a hugetlb page would always be a multiple of
> COMPACT_CLUSTER_MAX so we would drop the lock.  However, I am not sure if
> that is ALWAYS true and would prefer something like the code you suggested.
>
> Did you actually see this deadlock in practice?

Presumably the lack of lockdep reports about this tells us something?