On 5 Mar 2025, at 15:50, Hugh Dickins wrote:

> On Wed, 5 Mar 2025, Zi Yan wrote:
>> On 4 Mar 2025, at 6:49, Hugh Dickins wrote:
>>>
>>> I think (might be wrong, I'm in a rush) my mods are all to this
>>> "add two new (not yet used) functions for folio_split()" patch:
>>> please merge them in if you agree.
>>>
>>> 1. From source inspection, it looks like a folio_set_order() was missed.
>>
>> Actually no. folio_set_order(folio, new_order) is called multiple times
>> in the for loop above. It is duplicated but not missing.
>
> I was about to disagree with you, when at last I saw that, yes,
> it is doing that on "folio" at the time of setting up "new_folio".
>
> That is confusing: in all other respects, that loop is reading folio
> to set up new_folio. Do you have a reason for doing it there?

No. I agree your fix is better. Just point out folio_set_order() should not
trigger a bug.

>
> The transient "nested folio" situation is anomalous either way.
> I'd certainly prefer it to be done at the point where you
> ClearPageCompound when !new_order; but if you think there's an issue
> with racing isolate_migratepages_block() or something like that, which
> your current placement handles better, then please add a line of comment
> both where you do it and where I expected to find it - thanks.

Sure. I will use your patch unless I find some racing issue.

>
> (Historically, there was quite a lot of difficulty in getting the order
> of events in __split_huge_page_tail() to be safe: I wonder whether we
> shall see a crop of new weird bugs from these changes. I note that your
> loops advance forwards, whereas the old ones went backwards: but I don't
> have anything to say you're wrong. I think it's mainly a matter of how
> the first tail or two gets handled: which might be why you want to
> folio_set_order(folio, new_order) at the earliest opportunity.)

I am worried about that too.
In addition, in __split_huge_page_tail(),
page refcount is restored right after new tail folio split is done,
whereas I needed to delay them until all new after-split folios
are done, since non-uniform split is iterative and only the after-split
folios NOT containing the split_at page will be released. These
folios are locked and frozen after __split_folio_to_order() like
the original folio. Maybe because there are more such locked frozen
folios than before?

>>
>>>
>>> 2. Why is swapcache only checked when folio_test_anon? I can see that
>>> you've just copied that over from the old __split_huge_page(), but
>>> it seems wrong to me here and there - I guess a relic from before
>>> shmem could swap out a huge page.
>>
>> Yes, it is a relic, but it is still right before I change another relic
>> in __folio_split() or split_huge_page_to_list_to_order() from mainline,
>> if (!mapping) { ret = -EBUSY; goto out; }. It excludes the shmem in swap
>> cache case. I probably will leave it as is in my next folio_split() version
>> to avoid adding more potential bugs, but will come back later in another
>> patch.
>
> I agree. The "Truncated ?" check. Good. But I do prefer that you use
> that part of my patch, referring to mapping and swap_cache instead of anon,
> rather than rely on that accident of what's done at the higher level.

Definitely.

Best Regards,
Yan, Zi