Re: [PATCH v3 3/3] mm/mempolicy: Support memory hotplug in weighted interleave

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 20, 2025 at 01:17:48PM +0900, Rakie Kim wrote:
... snip ...
> +	mutex_lock(&sgrp->kobj_lock);
> +	if (sgrp->nattrs[nid]) {
> +		mutex_unlock(&sgrp->kobj_lock);
> +		pr_info("Node [%d] already exists\n", nid);
> +		kfree(new_attr);
> +		kfree(name);
> +		return 0;
> +	}
>  
> -	if (sysfs_create_file(&sgrp->wi_kobj, &node_attr->kobj_attr.attr)) {
> -		kfree(node_attr->kobj_attr.attr.name);
> -		kfree(node_attr);
> -		pr_err("failed to add attribute to weighted_interleave\n");
> -		return -ENOMEM;
> +	sgrp->nattrs[nid] = new_attr;
> +	mutex_unlock(&sgrp->kobj_lock);
> +
> +	sysfs_attr_init(&sgrp->nattrs[nid]->kobj_attr.attr);
> +	sgrp->nattrs[nid]->kobj_attr.attr.name = name;
> +	sgrp->nattrs[nid]->kobj_attr.attr.mode = 0644;
> +	sgrp->nattrs[nid]->kobj_attr.show = node_show;
> +	sgrp->nattrs[nid]->kobj_attr.store = node_store;
> +	sgrp->nattrs[nid]->nid = nid;

These accesses need to be inside the lock as well.  Probably we can't
get here concurrently, but I can't so so definitively that I'm
comfortable blind-accessing it outside the lock.

> +static int wi_node_notifier(struct notifier_block *nb,
> +			       unsigned long action, void *data)
> +{
... snip ...
> +	case MEM_OFFLINE:
> +		sysfs_wi_node_release(nid);

I'm still not convinced this is correct.  `offline_pages()` says this:

/*
 * {on,off}lining is constrained to full memory sections (or more
 * precisely to memory blocks from the user space POV).
 */

And that is the function calling:
	memory_notify(MEM_OFFLINE, &arg);

David pointed out that this should be called when offlining each memory
block.  This is not the same as simply doing `echo 0 > online`, you need
to remove the dax device associated with the memory.

For example:

      node1
    /       \
 dax0.0    dax1.0
   |          |
  mb1        mb2


With this code, if I `daxctl reconfigure-device devmem dax0.0` it will
remove the first memory block, causing MEM_OFFLINE event to fire and
removing the node - despite the fact that dax1.0 is still present.

This matters for systems with memory holes in CXL hotplug memory and
also for systems with Dynamic Capacity Devices surfacing capacity as
separate dax devices.

~Gregory




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux