On 01.03.24 07:51, Muchun Song wrote:
On Mar 1, 2024, at 12:29, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
On Thu, Feb 29, 2024 at 05:37:23PM -0800, James Houghton wrote:
- It has HVO (which can hopefully be dropped in a memdesc world)
I've spent a bit of time thinking about this. I'll keep this x86-64
specific just to have concrete numbers.
Currently a 2MB htlb page without HVO occupies 64 * 512 = 32kB. With HVO,
it's reduced to 8kB. A 1GB htlb page occupies 64 * 256k = 16MB; with HVO,
it's still 8kB (right?)
Correct in the past. In the first version, HVO needed 2 pages (8kB) for
vmemmap; now it needs only one page (4kB), whatever the huge page
size (2MB or 1GB).
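
The arithmetic above can be reproduced with a small sketch. It assumes
the x86-64 figures quoted in the thread (4kB base pages, 64 bytes of
struct page per base page) and that HVO keeps a single vmemmap page; the
function names are illustrative only and are not kernel APIs.

```python
PAGE_SIZE = 4096        # x86-64 base page size in bytes
STRUCT_PAGE_SIZE = 64   # bytes of struct page per base page, as quoted in the thread
MB = 1 << 20
GB = 1 << 30


def vmemmap_bytes(huge_page_size, per_page=STRUCT_PAGE_SIZE):
    """Metadata bytes needed to describe every base page of one huge page."""
    return (huge_page_size // PAGE_SIZE) * per_page


def hvo_vmemmap_bytes(kept_pages=1):
    """Metadata bytes left after HVO, assuming it keeps `kept_pages` vmemmap pages."""
    return kept_pages * PAGE_SIZE


for label, size in (("2MB", 2 * MB), ("1GB", GB)):
    full = vmemmap_bytes(size)
    kept = hvo_vmemmap_bytes()
    print(f"{label}: {full // 1024} kB without HVO, {kept // 1024} kB with HVO")
```

Passing kept_pages=2 to hvo_vmemmap_bytes() gives the 8kB figure of the
first HVO version.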
In a memdesc world, a 2MB page without HVO consumes 8 * 512 = 4kB.
There's no room for savings here. But a 1GB page takes 8 * 256k = 2MB.
There's still almost 2MB of savings to be had here, so I suspect some
people will still want it.
Agreed. With 2MB pages, HVO saves nothing, but it saves a lot
for 1GB huge pages.
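
As a rough check of the memdesc numbers, the sketch below uses the
8 bytes per base page quoted in the thread. It additionally assumes that
HVO would still have to keep one 4kB page of metadata per huge page;
that carry-over and the function names are assumptions for illustration,
not a description of any existing implementation.

```python
PAGE_SIZE = 4096         # x86-64 base page size in bytes
MEMDESC_ENTRY_SIZE = 8   # bytes per base page in a memdesc world, as quoted in the thread
MB = 1 << 20
GB = 1 << 30


def memdesc_bytes(huge_page_size):
    """Per-page metadata bytes for one huge page with 8-byte entries."""
    return (huge_page_size // PAGE_SIZE) * MEMDESC_ENTRY_SIZE


def hvo_savings(huge_page_size, kept_pages=1):
    """Bytes HVO could free, assuming it must keep `kept_pages` metadata pages."""
    return max(0, memdesc_bytes(huge_page_size) - kept_pages * PAGE_SIZE)


for label, size in (("2MB", 2 * MB), ("1GB", GB)):
    print(f"{label}: {memdesc_bytes(size)} bytes of metadata, "
          f"{hvo_savings(size)} bytes saved by HVO")
```

Under these assumptions a 2MB page needs exactly one metadata page, so
nothing is left to free, while a 1GB page frees 2MB minus one 4kB page.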
Hopefully Yu Zhao's zone proposal lets us enable HVO for THP. At least
1GB ones.
I hope to see it.
What's the biggest blocker regarding HVO+THP?
I can imagine the following two:
1) PMD->PTE remapping currently always has to work. Once we have PTE
mappings we would try writing per-page subpage + PAE, which we can't.
2) THP split + freeing would require allocating memory to remap the
vmemmap. Split can fail for other reasons already, but the freeing side
is nasty. But, if everything fails, we could use memory from the THP
itself when handing it back to the buddy (suboptimal, but removes that
corner-case concern).
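
To give a sense of scale for point 2), the sketch below counts how many
vmemmap pages would have to be allocated to restore the full vmemmap of
one huge page. It assumes 64-byte struct pages, 4kB base pages, and that
HVO keeps one vmemmap page, as stated earlier in the thread; the helper
names are hypothetical and the model ignores any page-table pages the
remap itself might need.

```python
PAGE_SIZE = 4096        # x86-64 base page size in bytes
STRUCT_PAGE_SIZE = 64   # bytes of struct page per base page
MB = 1 << 20
GB = 1 << 30


def vmemmap_pages(huge_page_size):
    """Number of 4kB vmemmap pages that describe one huge page."""
    return (huge_page_size // PAGE_SIZE) * STRUCT_PAGE_SIZE // PAGE_SIZE


def pages_to_restore(huge_page_size, kept_pages=1):
    """vmemmap pages to allocate before split/free, assuming HVO kept `kept_pages`."""
    return vmemmap_pages(huge_page_size) - kept_pages


for label, size in (("2MB", 2 * MB), ("1GB", GB)):
    print(f"{label}: allocate {pages_to_restore(size)} of "
          f"{vmemmap_pages(size)} vmemmap pages")
```

In this model a 2MB THP needs 7 pages back, which is small compared with
the 512 base pages the THP itself spans.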
Likely there are other page flags (MCE) that also need care, but at
least for hugetlb we seem to have figured that out.
--
Cheers,
David / dhildenb