Alistair Popple <apopple@xxxxxxxxxx> writes: > "Huang, Ying" <ying.huang@xxxxxxxxx> writes: > >> Alistair Popple <apopple@xxxxxxxxxx> writes: >> >>> Huang Ying <ying.huang@xxxxxxxxx> writes: >>> >>>> Previously, a fixed abstract distance MEMTIER_DEFAULT_DAX_ADISTANCE is >>>> used for slow memory type in kmem driver. This limits the usage of >>>> kmem driver, for example, it cannot be used for HBM (high bandwidth >>>> memory). >>>> >>>> So, we use the general abstract distance calculation mechanism in kmem >>>> drivers to get more accurate abstract distance on systems with proper >>>> support. The original MEMTIER_DEFAULT_DAX_ADISTANCE is used as >>>> fallback only. >>>> >>>> Now, multiple memory types may be managed by kmem. These memory types >>>> are put into the "kmem_memory_types" list and protected by >>>> kmem_memory_type_lock. >>> >>> See below but I wonder if kmem_memory_types could be a common helper >>> rather than kdax specific? >>> >>>> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx> >>>> Cc: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx> >>>> Cc: Wei Xu <weixugc@xxxxxxxxxx> >>>> Cc: Alistair Popple <apopple@xxxxxxxxxx> >>>> Cc: Dan Williams <dan.j.williams@xxxxxxxxx> >>>> Cc: Dave Hansen <dave.hansen@xxxxxxxxx> >>>> Cc: Davidlohr Bueso <dave@xxxxxxxxxxxx> >>>> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> >>>> Cc: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> >>>> Cc: Michal Hocko <mhocko@xxxxxxxxxx> >>>> Cc: Yang Shi <shy828301@xxxxxxxxx> >>>> Cc: Rafael J Wysocki <rafael.j.wysocki@xxxxxxxxx> >>>> --- >>>> drivers/dax/kmem.c | 54 +++++++++++++++++++++++++++--------- >>>> include/linux/memory-tiers.h | 2 ++ >>>> mm/memory-tiers.c | 2 +- >>>> 3 files changed, 44 insertions(+), 14 deletions(-) >>>> >>>> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c >>>> index 898ca9505754..837165037231 100644 >>>> --- a/drivers/dax/kmem.c >>>> +++ b/drivers/dax/kmem.c >>>> @@ -49,14 +49,40 @@ struct dax_kmem_data { >>>> struct resource *res[]; >>>> }; >>>> >>>> -static struct memory_dev_type *dax_slowmem_type; >>>> +static DEFINE_MUTEX(kmem_memory_type_lock); >>>> +static LIST_HEAD(kmem_memory_types); >>>> + >>>> +static struct memory_dev_type *kmem_find_alloc_memorty_type(int adist) >>>> +{ >>>> + bool found = false; >>>> + struct memory_dev_type *mtype; >>>> + >>>> + mutex_lock(&kmem_memory_type_lock); >>>> + list_for_each_entry(mtype, &kmem_memory_types, list) { >>>> + if (mtype->adistance == adist) { >>>> + found = true; >>>> + break; >>>> + } >>>> + } >>>> + if (!found) { >>>> + mtype = alloc_memory_type(adist); >>>> + if (!IS_ERR(mtype)) >>>> + list_add(&mtype->list, &kmem_memory_types); >>>> + } >>>> + mutex_unlock(&kmem_memory_type_lock); >>>> + >>>> + return mtype; >>>> +} >>>> + >>>> static int dev_dax_kmem_probe(struct dev_dax *dev_dax) >>>> { >>>> struct device *dev = &dev_dax->dev; >>>> unsigned long total_len = 0; >>>> struct dax_kmem_data *data; >>>> + struct memory_dev_type *mtype; >>>> int i, rc, mapped = 0; >>>> int numa_node; >>>> + int adist = MEMTIER_DEFAULT_DAX_ADISTANCE; >>>> >>>> /* >>>> * Ensure good NUMA information for the persistent memory. >>>> @@ -71,6 +97,11 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax) >>>> return -EINVAL; >>>> } >>>> >>>> + mt_calc_adistance(numa_node, &adist); >>>> + mtype = kmem_find_alloc_memorty_type(adist); >>>> + if (IS_ERR(mtype)) >>>> + return PTR_ERR(mtype); >>>> + >>> >>> I wrote my own quick and dirty module to test this and wrote basically >>> the same code sequence. >>> >>> I notice your using a list of memory types here though. I think it would >>> be nice to have a common helper that other users could call to do the >>> mt_calc_adistance() / kmem_find_alloc_memory_type() / >>> init_node_memory_type() sequence and cleanup as my naive approach would >>> result in a new memory_dev_type per device even though adist might be >>> the same. A common helper would make it easy to de-dup those. >> >> If it's useful, we can move kmem_find_alloc_memory_type() to >> memory-tier.c after some revision. But I tend to move it after we have >> the second user. What do you think about that? > > Usually I would agree, but this series already introduces a general > interface for calculating adist even though there's only one user and > implementation. So if we're going to add a general interface I think it > would be better to make it more usable now rather than after variations > of it have been cut and pasted into other drivers. In general, I would like to introduce complexity when necessary. So, we can discuss the necessity of the general interface firstly. We can do that in [1/4] of the series. -- Best Regards, Huang, Ying