Re: [PATCH v9 2/8] mm/huge_memory: add two new (not yet used) functions for folio_split()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06.03.25 17:27, Zi Yan wrote:
On 6 Mar 2025, at 4:19, David Hildenbrand wrote:

On 05.03.25 22:08, Zi Yan wrote:
On 5 Mar 2025, at 15:50, Hugh Dickins wrote:

On Wed, 5 Mar 2025, Zi Yan wrote:
On 4 Mar 2025, at 6:49, Hugh Dickins wrote:

I think (might be wrong, I'm in a rush) my mods are all to this
"add two new (not yet used) functions for folio_split()" patch:
please merge them in if you agree.

1. From source inspection, it looks like a folio_set_order() was missed.

Actually no. folio_set_order(folio, new_order) is called multiple times
in the for loop above. It is duplicated but not missing.

I was about to disagree with you, when at last I saw that, yes,
it is doing that on "folio" at the time of setting up "new_folio".

That is confusing: in all other respects, that loop is reading folio
to set up new_folio.  Do you have a reason for doing it there?

No. I agree your fix is better. Just point out folio_set_order() should
not trigger a bug.


The transient "nested folio" situation is anomalous either way.
I'd certainly prefer it to be done at the point where you
ClearPageCompound when !new_order; but if you think there's an issue
with racing isolate_migratepages_block() or something like that, which
your current placement handles better, then please add a line of comment
both where you do it and where I expected to find it - thanks.

Sure. I will use your patch unless I find some racing issue.


(Historically, there was quite a lot of difficulty in getting the order
of events in __split_huge_page_tail() to be safe: I wonder whether we
shall see a crop of new weird bugs from these changes. I note that your
loops advance forwards, whereas the old ones went backwards: but I don't
have anything to say you're wrong.  I think it's mainly a matter of how
the first tail or two gets handled: which might be why you want to
folio_set_order(folio, new_order) at the earliest opportunity.)

I am worried about that too. In addition, in __split_huge_page_tail(),
page refcount is restored right after new tail folio split is done,
whereas I needed to delay them until all new after-split folios
are done, since non-uniform split is iterative and only the after-split
folios NOT containing the split_at page will be released. These
folios are locked and frozen after __split_folio_to_order() like
the original folio. Maybe because there are more such locked frozen
folios than before?

What's the general concern here?

A frozen folio cannot be referenced and consequently not trusted. For example, if we want to speculatively lookup a folio in the pagecache and find it to be frozen, we'll have to spin (retry) until we find a folio that is unfrozen.

While a folio has a refcount of 0, there are no guarantees. It could change its size, it could be freed + reallocated (changed mapping etc) ...

So whoever wants to grab a speculative reference -- using folio_try_get() -- must re-verify folio properties after grabbing the speculative reference succeeded. Including whether it is small/large, number of pages, mapping, ...

The important part is to unfreeze a folio only once it was fully prepared (e.g., order set, compound pages links to head set up etc).

I am not sure if the sequence in which you process folios during a split matters here when doing a split: only that, whatever new folio  is unfrozen, is properly initialized.

Got it. Thanks for the confirmation.

My worry came from that after I rebased on mm-everything-2025-03-05-03-54,
which does not have folio_split() patches, I see a crash saying a buddy
page is hit in __split_folio_to_order(). It turns out that I did not
add the changes from your “mm: let _folio_nr_pages overlay memcg_data in
first tail page” patch. With that fixed, no crash is observed so far.

Ah that makes sense. Yes, it must look like in my original v3 that was based on your patches. (now it's the other way around :) )

--
Cheers,

David / dhildenb





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux