Re: [PATCH RFC 1/9] memremap: add ZONE_DEVICE support for compound pages

Joao Martins <joao.m.martins@xxxxxxxxxx> · Wed, 10 Mar 2021 18:12:47 +0000

On 2/22/21 8:37 PM, Dan Williams wrote:
> On Mon, Feb 22, 2021 at 3:24 AM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>> On 2/20/21 1:43 AM, Dan Williams wrote:
>>> On Tue, Dec 8, 2020 at 9:59 PM John Hubbard <jhubbard@xxxxxxxxxx> wrote:
>>>> On 12/8/20 9:28 AM, Joao Martins wrote:
>> One thing to point out about altmap is that the degradation (in pinning and
>> unpining) we observed with struct page's in device memory, is no longer observed
>> once 1) we batch ref count updates as we move to compound pages 2) reusing
>> tail pages seems to lead to these struct pages staying more likely in cache
>> which perhaps contributes to dirtying a lot less cachelines.
> 
> True, it makes it more palatable to survive 'struct page' in PMEM, 

I want to retract for now what I said above wrt to the no degradation with
struct page in device comment. I was fooled by a bug on a patch later down
this series. Particular because I accidentally cleared PGMAP_ALTMAP_VALID when
unilaterally setting PGMAP_COMPOUND, which consequently lead to always
allocating struct pages from memory. No wonder the numbers were just as fast.
I am still confident that it's going to be faster and observe less degradation
in pinning/init. Init for now is worst-case 2x faster. But to be *as fast* struct
pages in memory might still be early to say.

The broken masking of the PGMAP_ALTMAP_VALID bit did hide one flaw, where
we don't support altmap for basepages on x86/mm and it apparently depends
on architectures to implement it (and a couple other issues). The vmemmap
allocation isn't the problem, so the previous comment in this thread that
altmap doesn't change much in the vmemmap_populate_compound_pages() is
still accurate.

The problem though resides on the freeing of vmemmap pagetables with
basepages *with altmap* (e.g. at dax-device teardown) which require arch
support. Doing it properly would mean making the altmap reserve smaller
(given fewer pages are allocated), and the ability for the altmap pfn
allocator to track references per pfn. But I think it deserves its own
separate patch series (probably almost just as big).

Perhaps for this set I can stick without altmap as you suggested, and
use hugepage vmemmap population (which wouldn't
lead to device memory savings) instead of reusing base pages . I would
still leave the compound page support logic as metadata representation
for > 4K @align, as I think that's the right thing to do. And then
a separate series onto improving altmap to leverage the metadata reduction
support as done with non-device struct pages.

Thoughts?

	Joao