On 4 Mar 2025, at 6:49, Hugh Dickins wrote:

> On Wed, 26 Feb 2025, Zi Yan wrote:
>
>> This is a preparation patch, both added functions are not used yet.
>>
>> The added __split_unmapped_folio() is able to split a folio with its
>> mapping removed in two manners: 1) uniform split (the existing way), and
>> 2) buddy allocator like split.
>>
>> The added __split_folio_to_order() can split a folio into any lower order.
>> For uniform split, __split_unmapped_folio() calls it once to split the
>> given folio to the new order. For buddy allocator split,
>> __split_unmapped_folio() calls it (folio_order - new_order) times and each
>> time splits the folio containing the given page to one lower order.
>>
>> Signed-off-by: Zi Yan <ziy@xxxxxxxxxx>
>
> Sorry, I'm tired and don't really want to be writing this yet, but the
> migrate "hotfix" has tipped my hand, and I need to get this out to you
> before more days pass.
>
> I'd been unable to complete even a single iteration of my "kernel builds
> on huge tmpfs while swapping to SSD" testing during this current 6.14-rc
> mm.git cycle (6.14-rc itself fine) - until the last week, when some
> important fixes have come in, so I'm no longer getting I/O errors from
> ext4-on-loop0-on-huge-tmpfs, and "Huh VM_FAULT_OOM leaked" warnings: good.
>
> But I still can't get beyond a few iterations, a few minutes: there's
> some corruption of user data, which usually manifests as a kernel build
> failing because fixdep couldn't find some truncated-on-the-left pathname.
>
> While it definitely bisected to your folio_split() series, it's quite
> possible that you're merely exposing an existing bug to wider use.
>
> I've spent the last few days trying to track this down, but still not
> succeeded: I'm still getting much the same corruption. But have been
> folding in various fixes as I found them, even though they have not
> solved the main problem at all. I'll return to trying to debug the
> corruption "tomorrow".
>
> I think (might be wrong, I'm in a rush) my mods are all to this
> "add two new (not yet used) functions for folio_split()" patch:
> please merge them in if you agree.
>
> 1. From source inspection, it looks like a folio_set_order() was missed.

Actually no. folio_set_order(folio, new_order) is called multiple times
in the for loop above. It is duplicated, but not missing.

>
> 2. Why is swapcache only checked when folio_test_anon? I can see that
> you've just copied that over from the old __split_huge_page(), but
> it seems wrong to me here and there - I guess a relic from before
> shmem could swap out a huge page.

Yes, it is a relic, but it remains correct until I change another relic
in __folio_split() (or split_huge_page_to_list_to_order() in mainline):

	if (!mapping) {
		ret = -EBUSY;
		goto out;
	}

That check excludes the shmem-in-swap-cache case. I will probably leave
the swapcache check as is in my next folio_split() version, to avoid
adding more potential bugs, and come back to it later in a separate patch.

Best Regards,
Yan, Zi
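
For reference, the two split modes described in the quoted commit message
can be modeled in userspace. This is only a minimal sketch, not the kernel
code: split_orders() and its parameters are hypothetical names made up for
illustration, and it assumes the "buddy allocator like" split leaves one
piece of each order from old_order - 1 down to new_order, plus the
new_order piece that holds the target page. It ignores any restrictions
the real code may place on particular orders.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model, not kernel code: list the orders of the
 * pieces left after splitting one order-old_order folio down to new_order.
 * Returns the number of pieces written to out, or 0 on invalid input or
 * if out (capacity cap) is too small.
 */
static size_t split_orders(unsigned int old_order, unsigned int new_order,
                           int uniform, unsigned int *out, size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    if (new_order > old_order || old_order - new_order >= 8 * sizeof(size_t))
        return 0;

    if (uniform) {
        /* One pass: every piece ends up at new_order. */
        size_t pieces = (size_t)1 << (old_order - new_order);

        if (pieces > cap)
            return 0;
        for (n = 0; n < pieces; n++)
            out[n] = new_order;
        return n;
    }

    /*
     * Buddy-like: (old_order - new_order) passes, each halving only the
     * piece that holds the target page and setting the other half aside.
     */
    if ((size_t)(old_order - new_order) + 1 > cap)
        return 0;
    for (unsigned int order = old_order; order > new_order; order--)
        out[n++] = order - 1;   /* the half without the target page */
    out[n++] = new_order;       /* the piece holding the target page */
    return n;
}
```

Under these assumptions, splitting an order-3 folio to order 0 gives eight
order-0 pieces in the uniform case, but only four pieces (orders 2, 1, 0, 0)
in the buddy-like case, after three one-level passes.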