Re: [PATCH v1 04/11] mm/memremap: add ZONE_DEVICE support for compound pages


 




On 6/7/21 9:17 PM, Dan Williams wrote:
> On Tue, May 18, 2021 at 10:28 AM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>>
>> On 5/5/21 11:36 PM, Joao Martins wrote:
>>> On 5/5/21 11:20 PM, Dan Williams wrote:
>>>> On Wed, May 5, 2021 at 12:50 PM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>>>>> On 5/5/21 7:44 PM, Dan Williams wrote:
>>>>>> On Thu, Mar 25, 2021 at 4:10 PM Joao Martins <joao.m.martins@xxxxxxxxxx> wrote:
>>>>>>> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
>>>>>>> index b46f63dcaed3..bb28d82dda5e 100644
>>>>>>> --- a/include/linux/memremap.h
>>>>>>> +++ b/include/linux/memremap.h
>>>>>>> @@ -114,6 +114,7 @@ struct dev_pagemap {
>>>>>>>         struct completion done;
>>>>>>>         enum memory_type type;
>>>>>>>         unsigned int flags;
>>>>>>> +       unsigned long align;
>>>>>>
>>>>>> I think this wants some kernel-doc above to indicate that non-zero
>>>>>> means "use compound pages with tail-page dedup" and zero / PAGE_SIZE
>>>>>> means "use non-compound base pages".
>>
>> [...]
>>
>>>>>> The non-zero value must be PAGE_SIZE, PMD_PAGE_SIZE or PUD_PAGE_SIZE.
>>>>>> Hmm, maybe it should be an enum:
>>>>>>
>>>>>> enum devmap_geometry {
>>>>>>     DEVMAP_PTE,
>>>>>>     DEVMAP_PMD,
>>>>>>     DEVMAP_PUD,
>>>>>> }
>>>>>>
>>>>> I suppose a converter between devmap_geometry and page_size would be needed too? And maybe
>>>>> the whole dax/nvdimm align values change meanwhile (as a followup improvement)?
>>>>
>>>> I think it is ok for dax/nvdimm to continue to maintain their align
>>>> value because it should be ok to have 4MB align if the device really
>>>> wanted. However, when it goes to map that alignment with
>>>> memremap_pages() it can pick a mode. For example, it's already the
>>>> case that dax->align == 1GB is mapped with DEVMAP_PTE today, so
>>>> they're already separate concepts that can stay separate.
>>>>
>>> Gotcha.
>>
>> I am reconsidering part of the above. In general, yes, the meaning of devmap @align
>> represents a slightly different variation of the device @align i.e. how the metadata is
>> laid out **but** regardless of what kind of page table entries we use for the vmemmap.
>>
>> By using DEVMAP_PTE/PMD/PUD we might end up 1) duplicating what nvdimm/dax already
>> validates in terms of allowed device @align values (i.e. PAGE_SIZE, PMD_SIZE and PUD_SIZE)
>> 2) the geometry of metadata is very much tied to the value we pick to @align at namespace
>> provisioning -- not the "align" we might use at mmap() perhaps that's what you referred
>> above? -- and 3) the value of geometry actually derives from dax device @align because we
>> will need to create compound pages representing a page size of @align value.
>>
>> Using your example above: you're saying that dax->align == 1G is mapped with DEVMAP_PTEs,
>> in reality the vmemmap is populated with PMDs/PUDs page tables (depending on what archs
>> decide to do at vmemmap_populate()) and uses base pages as its metadata regardless of what
>> device @align. In reality what we want to convey in @geometry is not page table sizes, but
>> just the page size used for the vmemmap of the dax device.
> 
> Good point, the names "PTE, PMD, PUD" imply the hardware mapping size,
> not the software compound page size.
> 
>> Additionally, limiting its
>> value might not be desirable... if tomorrow Linux for some arch supports dax/nvdimm
>> devices with 4M align or 64K align, the value of @geometry will have to reflect the 4M to
>> create compound pages of order 10 for the said vmemmap.
>>
>> I am going to wait until you finish reviewing the remaining four patches of this series,
>> but maybe this is a simple misnomer (s/align/geometry/) with a comment but without
>> DEVMAP_{PTE,PMD,PUD} enum part? Or perhaps its own struct with a value and enum a
>> setter/getter to audit its value? Thoughts?
> 
> I do see what you mean about the confusion DEVMAP_{PTE,PMD,PUD}
> introduces, but I still think the device-dax align and the
> organization of the 'struct page' metadata are distinct concepts. So
> I'm happy with any color of the bikeshed as long as the 2 concepts are
> distinct. How about calling it  "compound_page_order"? Open to other
> ideas...
> 
I actually like the name @geometry. The only thing better would be @vmemmap_geometry,
solely because it makes it clear that it's the vmemmap we are talking about, but that
might be unnecessarily verbose. And I still agree that it is a separate concept that should
be named differently *at least*.

But naming aside, what I was trying to get at was avoiding a second validation of the
geometry value, i.e. having it validated and set with a value such as DEVMAP_PTE, DEVMAP_PMD
or DEVMAP_PUD. That sounds a little redundant to me when the geometry value depends on the
align it is derived from. Here my mention of @align refers to what's used to create
the dax device, not the mmap() align [which can be lower than the device one]. The dax
device align is the one used to decide whether to use PTEs, PMDs or PUDs in the dax fault handler.

So they are separate concepts, but the value of one still depends on the other. At least
unless we want to allow geometry values different from those set by --align, as Jane suggested.
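
To illustrate deriving the geometry from the device align with a single
validation, here is a rough userspace sketch. It is not kernel code and not part
of this series: struct pgmap_sketch, geometry_to_order(), pgmap_set_geometry()
and the 4K SKETCH_PAGE_SIZE are all hypothetical names and assumptions made up
for the example.

```c
/* Userspace sketch only. All names and the 4K base page are assumptions. */
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12UL
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the dev_pagemap field under discussion. */
struct pgmap_sketch {
	unsigned long geometry;	/* vmemmap page size in bytes; 0 means base pages */
};

static bool is_pow2(unsigned long v)
{
	return v && !(v & (v - 1));
}

/*
 * Convert a geometry in bytes to a compound page order.
 * Returns -1 for values that are not a power of two or are below the
 * base page size. Zero is treated as "non-compound base pages".
 */
static int geometry_to_order(unsigned long geometry)
{
	int order = 0;

	if (geometry == 0)
		return 0;
	if (!is_pow2(geometry) || geometry < SKETCH_PAGE_SIZE)
		return -1;
	while ((SKETCH_PAGE_SIZE << order) < geometry)
		order++;
	return order;
}

/*
 * Derive the geometry from the device align in one place, so that no
 * second, enum-based validation is needed on top of it.
 */
static int pgmap_set_geometry(struct pgmap_sketch *pgmap, unsigned long dev_align)
{
	if (geometry_to_order(dev_align) < 0)
		return -1;
	pgmap->geometry = dev_align;
	return 0;
}
```

With a 4K base page, a 4M geometry yields order 10 and a 64K one yields order 4,
neither of which a three-value PTE/PMD/PUD enum could express.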



