Re: [PATCH v2 1/1] memory tier: consolidate the initialization of memory tiers

Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> · Thu, 4 Jul 2024 18:08:23 +0100

Hi,

> > > 
> > >  static int __init memory_tier_init(void)
> > >  {
> > >  - int ret, node;
> > >  - struct memory_tier *memtier;
> > >  + int ret;
> > >  
> > >  ret = subsys_virtual_register(&memory_tier_subsys, NULL);
> > >  if (ret)
> > >  @@ -887,7 +897,8 @@ static int __init memory_tier_init(void)
> > >  GFP_KERNEL);
> > >  WARN_ON(!node_demotion);
> > >  #endif
> > >  - mutex_lock(&memory_tier_lock);
> > >  +
> > >  + guard(mutex)(&memory_tier_lock);
> > >   
> > 
> > If this was safe to do without the rest of the change (I think so)
> > then better to pull that out as a trivial precursor so less noise
> > in here.
> >   
> 
> Do you mean instead of using guard(mutex)(),
> use mutex_lock() as it was? or?
> 

Keep the code as it is here, but possibly pull the guard(mutex)
conversion out as patch 1, since it's an improvement unrelated to
the rest of the set, which would then be patch 2.

Not particularly important though as you've sent a v3 in the
meantime and it's fine to have it in one patch.
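
For anyone following along, a minimal userspace sketch of why the
guard() conversion can stand alone. The kernel's guard(mutex) is built
on the compiler's cleanup attribute, so the lock is dropped on every
exit from the scope. GUARD_MUTEX, tier_lock, tier_init and lock_is_free
below are made-up names using pthreads purely for illustration, not the
kernel API, and the sketch assumes GCC or Clang.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t tier_lock = PTHREAD_MUTEX_INITIALIZER;
static int tiers_initialised;

/* Cleanup handler: called automatically when the guard variable leaves scope. */
static void unlock_cleanup(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Hypothetical userspace stand-in for the kernel's guard(mutex)(). */
#define GUARD_MUTEX(m) \
	pthread_mutex_t *guard_ __attribute__((cleanup(unlock_cleanup), unused)) = \
		(pthread_mutex_lock(m), (m))

/* Init-style function with an early error return and no explicit unlock. */
static int tier_init(int fail)
{
	GUARD_MUTEX(&tier_lock);

	if (fail)
		return -1;	/* lock released here by the cleanup handler */

	tiers_initialised = 1;
	return 0;		/* and released here as well */
}

/* Returns 1 if the lock is currently free (briefly takes it to check). */
static int lock_is_free(void)
{
	if (pthread_mutex_trylock(&tier_lock) != 0)
		return 0;
	pthread_mutex_unlock(&tier_lock);
	return 1;
}
```

Calling tier_init(1) and then lock_is_free() shows the early return did
not leak the lock, which is the property that lets the explicit
mutex_unlock() calls go away.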

> > > 
> > > /*
> > >  * For now we can have 4 faster memory tiers with smaller adistance
> > >  * than default DRAM tier.
> > >  @@ -897,29 +908,9 @@ static int __init memory_tier_init(void)
> > >  if (IS_ERR(default_dram_type))
> > >  panic("%s() failed to allocate default DRAM tier\n", __func__);
> > >  
> > >  - /*
> > >  - * Look at all the existing N_MEMORY nodes and add them to
> > >  - * default memory tier or to a tier if we already have memory
> > >  - * types assigned.
> > >  - */
> > >  - for_each_node_state(node, N_MEMORY) {
> > >  - if (!node_state(node, N_CPU))
> > >  - /*
> > >  - * Defer memory tier initialization on
> > >  - * CPUless numa nodes. These will be initialized
> > >  - * after firmware and devices are initialized.
> > >  - */
> > >  - continue;
> > >  -
> > >  - memtier = set_node_memory_tier(node);
> > >  - if (IS_ERR(memtier))
> > >  - /*
> > >  - * Continue with memtiers we are able to setup
> > >  - */
> > >  - break;
> > >  - }
> > >  - establish_demotion_targets();
> > >  - mutex_unlock(&memory_tier_lock);
> > >  + /* Record nodes with memory and CPU to set default DRAM performance. */
> > >  + nodes_and(default_dram_nodes, node_states[N_MEMORY],
> > >  + node_states[N_CPU]);
> > >   
> > 
> > There are systems where (for various esoteric reasons, such as describing an
> > association with some other memory that isn't DRAM where the granularity
> > doesn't match) the CPU nodes contain no DRAM but rather it's one node away.
> > Handling that can be a job for another day though.
> >   
> 
> Thank you for informing me of this situation.
> Sounds like handling that also requires a mapping table between
> the CPU and the corresponding DRAM.

I've not yet looked at how it interacts with this, but
from an ACPI point of view the DRAM is just 'near' the CPUs
in SLIT and HMAT.  The nearest thing to an explicit description
is the Memory Proximity Domain Attributes structure in HMAT.
That allows you to describe the location of the memory
controller, but in this type of system there may be
a many to 1 mapping, for example when memory is interleaved
across memory controllers in some CPU only nodes.
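
To make the concern concrete, here is a toy userspace sketch of the
N_MEMORY & N_CPU intersection, with one bit per node. The nodemask
typedef, NODE() and both layouts are invented for illustration (the
kernel's nodemask_t is a bitmap, not a single integer), and it only
models the simplest variant, where the CPU node has no memory at all.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per NUMA node; a toy stand-in for the kernel's nodemask_t. */
typedef uint64_t nodemask;

#define NODE(n) ((nodemask)1 << (n))

/* Userspace analogue of nodes_and(): intersection of two node sets. */
static nodemask mask_and(nodemask a, nodemask b)
{
	return a & b;
}

/* Conventional layout: nodes 0 and 1 have CPUs and DRAM, node 2 is CPU-less memory. */
static nodemask conventional_default_dram(void)
{
	nodemask n_memory = NODE(0) | NODE(1) | NODE(2);
	nodemask n_cpu = NODE(0) | NODE(1);

	return mask_and(n_memory, n_cpu);
}

/* Hypothetical split layout: CPUs only in node 0, their DRAM one node away in node 1. */
static nodemask split_default_dram(void)
{
	nodemask n_memory = NODE(1);
	nodemask n_cpu = NODE(0);

	return mask_and(n_memory, n_cpu);
}
```

In the conventional layout the intersection picks nodes 0 and 1. In the
split layout it comes out empty, so no node would be recorded as
default DRAM by this kind of test alone.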

Anyhow, guess I need to spin up some emulated machines and
see what breaks :)

Jonathan