Re: [PATCH v11 4/8] mm/demotion/dax/kmem: Set node's abstract distance to MEMTIER_ADISTANCE_PMEM

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/1/22 7:36 AM, Huang, Ying wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> writes:
> 
>> "Huang, Ying" <ying.huang@xxxxxxxxx> writes:
>>
>>> "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> writes:
>>>
>>>> By default, all nodes are assigned to the default memory tier which
>>>> is the memory tier designated for nodes with DRAM
>>>>
>>>> Set dax kmem device node's tier to slower memory tier by assigning
>>>> abstract distance to MEMTIER_ADISTANCE_PMEM. PMEM tier
>>>> appears below the default memory tier in demotion order.
>>>>
>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
>>>> ---
>>>>  drivers/dax/kmem.c           |  9 +++++++++
>>>>  include/linux/memory-tiers.h | 19 ++++++++++++++++++-
>>>>  mm/memory-tiers.c            | 28 ++++++++++++++++------------
>>>>  3 files changed, 43 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
>>>> index a37622060fff..6b0d5de9a3e9 100644
>>>> --- a/drivers/dax/kmem.c
>>>> +++ b/drivers/dax/kmem.c
>>>> @@ -11,6 +11,7 @@
>>>>  #include <linux/fs.h>
>>>>  #include <linux/mm.h>
>>>>  #include <linux/mman.h>
>>>> +#include <linux/memory-tiers.h>
>>>>  #include "dax-private.h"
>>>>  #include "bus.h"
>>>>  
>>>> @@ -41,6 +42,12 @@ struct dax_kmem_data {
>>>>  	struct resource *res[];
>>>>  };
>>>>  
>>>> +static struct memory_dev_type default_pmem_type  = {
>>>
>>> Why is this named as default_pmem_type?  We will not change the memory
>>> type of a node usually.
>>>
>>
>> Any other suggestion? pmem_dev_type? 
> 
> Or dax_pmem_type?
> 
> DAX is used to enumerate the memory device.
> 
>>
>>>> +	.adistance = MEMTIER_ADISTANCE_PMEM,
>>>> +	.tier_sibiling = LIST_HEAD_INIT(default_pmem_type.tier_sibiling),
>>>> +	.nodes  = NODE_MASK_NONE,
>>>> +};
>>>> +
>>>>  static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
>>>>  {
>>>>  	struct device *dev = &dev_dax->dev;
>>>> @@ -62,6 +69,8 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
>>>>  		return -EINVAL;
>>>>  	}
>>>>  
>>>> +	init_node_memory_type(numa_node, &default_pmem_type);
>>>> +
>>>
>>> The memory hot-add below may fail.  So the error handling needs to be
>>> added.
>>>
>>> And, it appears that the memory type and memory tier of a node may be
>>> fully initialized here before NUMA hot-adding started.  So I suggest to
>>> set node_memory_types[] here only.  And set memory_dev_type->nodes in
>>> node hot-add callback.  I think there is the proper place to complete
>>> the initialization.
>>>
>>> And, in theory dax/kmem.c can be unloaded.  So we need to clear
>>> node_memory_types[] for nodes somewhere.
>>>
>>
>> I guess by module exit we can be sure that all the memory managed
>> by dax/kmem is hotplugged out. How about something like below?
> 
> Because we set node_memorty_types[] in dev_dax_kmem_probe(), it's
> natural to clear it in dev_dax_kmem_remove().
> 

Most of required reset/clear is done as part of memory hotunplug. So
if we did manage to successfully unplug the memory, everything except
node_memory_types[node] should be reset. That makes the clear_node_memory_type
the below. 

void clear_node_memory_type(int node, struct memory_dev_type *memtype)
{

	mutex_lock(&memory_tier_lock);
	/*
	 * memory unplug did clear the node from the memtype and
	 * dax/kem did initialize this node's memory type.
	 */
	if (!node_isset(node, memtype->nodes) && node_memory_types[node]  == memtype){
		node_memory_types[node] = NULL;
	}
	mutex_unlock(&memory_tier_lock);
}

With the module unload, it is kind of force removing the usage of the specific memtype.
Considering module unload will remove the usage of specific memtype from other parts
of the kernel and we already do all the required reset in memory hot unplug, do we
need to do the clear_node_memory_type above? 

-aneesh







[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux