Re: [PATCH 00/45] hugetlb pagewalk unification

On 11.07.24 06:48, Oscar Salvador wrote:
On Thu, Jul 11, 2024 at 02:15:38AM +0200, David Hildenbrand wrote:
(as a side note, cont-pte/cont-pmd should primarily be a hint from arch code
on how many entries we can batch, like we do in folio_pte_batch(); point is
that we want to batch also on architectures where we don't have such bits,
and prepare for architectures that implement various sizes of batching;
IMHO, having cont-pte/cont-pmd checks in common code is likely the wrong
approach. Again, folio_pte_batch() is where we tackled the problem
differently from the THP perspective)
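To illustrate the batching idea above: a minimal userspace sketch, not
kernel code, of detecting a batch without looking at any cont bit. The
names toy_pte and toy_pte_batch() are hypothetical and only loosely modeled
on what folio_pte_batch() is described as doing here; the real function
takes different arguments and handles more cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy page table entry; an assumption for illustration, not the kernel's pte_t. */
struct toy_pte {
	bool present;
	uint64_t pfn;
	uint32_t prot;	/* attribute bits that must match across a batch */
};

/*
 * Count how many entries, starting at ptes[0], map consecutive PFNs with
 * identical attribute bits. max_nr bounds the scan (for example, the end of
 * the page table or of the folio). Returns 0 if the first entry is not
 * present.
 */
static size_t toy_pte_batch(const struct toy_pte *ptes, size_t max_nr)
{
	size_t nr;

	assert(ptes != NULL);
	if (max_nr == 0 || !ptes[0].present)
		return 0;

	for (nr = 1; nr < max_nr; nr++) {
		if (!ptes[nr].present ||
		    ptes[nr].pfn != ptes[0].pfn + nr ||
		    ptes[nr].prot != ptes[0].prot)
			break;
	}
	return nr;
}
```

The batch size falls out of the entries themselves, so the same loop works
whether or not the architecture has a contiguous hint bit.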

I must say I did not check folio_pte_batch() and I am totally ignorant
of what/how it does things.
I will have a look.

I have an idea for a better page table walker API that would try batching
most entries (under one PTL), and walkers can just register for the types
they want. Hoping I will find some time to at least sketch the user
interface soon.
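
No interface has been posted in this thread, so the following is purely a
hypothetical userspace sketch of what "walkers register for the types they
want" could look like. Every name here (toy_entry_type, toy_walk_ops,
toy_walk()) is invented for illustration, and locking (the PTL) is not
modeled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry types a walker might care about. */
enum toy_entry_type { TOY_NONE, TOY_PRESENT, TOY_MIGRATION, TOY_NR_TYPES };

struct toy_entry {
	enum toy_entry_type type;
};

/* Called once per batch of same-typed entries: [start, start + nr). */
typedef void (*toy_batch_fn)(size_t start, size_t nr, void *priv);

struct toy_walk_ops {
	/* NULL means the walker did not register for that type. */
	toy_batch_fn handlers[TOY_NR_TYPES];
};

/*
 * Walk n entries, group runs of equal type into batches, and invoke the
 * registered handler (if any) once per batch. Returns the number of
 * handler invocations.
 */
static size_t toy_walk(const struct toy_entry *tbl, size_t n,
		       const struct toy_walk_ops *ops, void *priv)
{
	size_t i = 0, calls = 0;

	assert(tbl != NULL && ops != NULL);
	while (i < n) {
		size_t nr = 1;

		while (i + nr < n && tbl[i + nr].type == tbl[i].type)
			nr++;
		if (ops->handlers[tbl[i].type]) {
			ops->handlers[tbl[i].type](i, nr, priv);
			calls++;
		}
		i += nr;
	}
	return calls;
}

/* Example handler: accumulate the number of present entries seen. */
static void toy_count_present(size_t start, size_t nr, void *priv)
{
	(void)start;
	*(size_t *)priv += nr;
}
```

A walker that only registers toy_count_present for TOY_PRESENT is never
called for holes or migration entries, and sees each run of present entries
as a single batch.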

That doesn't mean that this should block your work, but the
cont-pte/cont-pmd hugetlb stuff is really nasty to handle here, and I don't
particularly like where this is going.

Ok, let me take a step back then.
Previous versions of that RFC did not handle cont-{pte,pmd} out in the
open, so let me go back to the drawing board and come up with something
that does not fiddle with cont- stuff in that way.

I might post here a small diff just to see if we are on the same page.

As usual, thanks a lot for your comments David!

Feel free to reach out to discuss ways forward. I think we should

(a) move to the automatic cont-pte setting as done for THPs via
     set_ptes().
(b) batch PTE updates at all relevant places, so we get no change in
     behavior: cont-pte bit will remain set.
(c) Likely remove the use of cont-pte bits in hugetlb code for anything
     that is not a present folio (i.e., where automatic cont-pte bit
     setting would never set it). Migration entries might require
     thought (we can easily batch to achieve the same thing, but the
     behavior of hugetlb likely differs from the generic way of handling
     migration entries on multiple ptes: reference the folio vs.
     the respective subpages of the folio).
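
As a rough illustration of (a) and (b): a userspace toy model, not the
kernel's set_ptes(), in which the caller only hands over a batch and the
setter decides on its own where the contiguous hint can be applied. The
names toy_cpte and toy_set_ptes() and the block size of 16 entries are
assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed number of entries covered by one contiguous hint. */
#define TOY_CONT_PTES 16

/* Toy entry with a contiguous hint bit; not the kernel's pte_t. */
struct toy_cpte {
	bool present;
	bool cont;
	uint64_t pfn;
};

/*
 * Write nr entries starting at table[idx], mapping pfn, pfn + 1, ...
 * The hint is set only on naturally aligned blocks of TOY_CONT_PTES
 * entries that are fully covered by this batch and whose first PFN is
 * aligned as well. Callers never touch the hint themselves.
 */
static void toy_set_ptes(struct toy_cpte *table, size_t idx,
			 uint64_t pfn, size_t nr)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < nr; i++) {
		size_t cur = idx + i;
		size_t block = cur - cur % TOY_CONT_PTES;
		bool cont = false;

		/* Whole aligned block inside [idx, idx + nr)? */
		if (block >= idx && block + TOY_CONT_PTES <= idx + nr)
			cont = (pfn + (block - idx)) % TOY_CONT_PTES == 0;

		table[cur].present = true;
		table[cur].pfn = pfn + i;
		table[cur].cont = cont;
	}
}
```

In this model, a caller that always updates a whole aligned block in one
batch keeps the hint set, while partial or misaligned updates simply end up
without it.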

Uhm, I see, but I am a bit confused.
Although related, this seems orthogonal to this series and more like a
next thing to do, right?

Well, yes and no. The thing is that the cont-pte/cont-pmd stuff is not as easy to handle as the PMD/PUD stuff, and sorting that out sounds like some "pain". That's the ugly part of hugetlb, where it's simply ... quite different :(


It is true that this series tries to handle cont-{pmd,pte} in the
pagewalk API for hugetlb vmas, but in order to raise fewer eyebrows I
can come up with a way not to do that for now, so we do not fiddle with
cont-stuff in this series.


Or am I misunderstanding you?

I can answer once I know more details about the approach you have in mind :)

--
Cheers,

David / dhildenb