On Mon, Sep 07, 2020 at 02:33:24PM +0300, Kirill A. Shutemov wrote: > On Fri, Sep 04, 2020 at 04:37:24AM +0100, Matthew Wilcox wrote: > > Kirill, do I have the handling of split_huge_page() failure correct? > > It seems reasonable to me -- unlock the page and drop the reference, > > hoping that somebody else will not have a reference to the page by the > > next time we try to split it. Or they will split it for us. There's a > > livelock opportunity here, but I'm not sure it's worse than the one in > > a holepunch scenario. > > The worst case scenario is when the page is referenced (directly or > indirectly) by the caller. It this case we would end up with endless loop. > I'm not sure how we can guarantee that this will never happen. I don't see a way for that to happen at the moment. We're pretty careful not to take references on multiple pages at once in these paths. I've fixed the truncate paths to only take one reference per THP too. I was thinking that if livelock becomes a problem, we could (ab)use the THP destructor mechanism somewhat like this: Add [TRANSHUGE_PAGE_SPLIT] = split_transhuge_page, to the compound_page_dtors array. New function split_huge_page_wait() which, if !can_split_huge_page() first checks if the dtor is already set to TRANSHUGE_PAGE_SPLIT. If so, it returns to its caller, reporting failure (so that it will put its reference to the page). Then it sets the dtor to TRANSHUGE_PAGE_SPLIT and sets page refcount to 1. It goes to sleep on the page. split_transhuge_page() first has to check if the refcount went to 0 due to mapcount being reduced. If so, it resets the refcount to 1 and returns to the caller. If not, it freezes the page and wakes the task above which is sleeping in split_huge_page_wait(). It's complicated and I don't love it. But it might solve livelock, should we need to do it. It wouldn't prevent us from an indefinite wait if the caller of split_huge_page_wait() has more than one reference to this page. That's better than a livelock though.