On 10/8/21 12:51 AM, Oscar Salvador wrote:
> On Thu, Oct 07, 2021 at 11:19:14AM -0700, Mike Kravetz wrote:
>> +static ssize_t demote_store(struct kobject *kobj,
>> +		struct kobj_attribute *attr, const char *buf, size_t len)
>> +{
>> +	unsigned long nr_demote;
>> +	unsigned long nr_available;
>> +	nodemask_t nodes_allowed, *n_mask;
>> +	struct hstate *h;
>> +	int err = 0;
>> +	int nid;
>> +
>> +	err = kstrtoul(buf, 10, &nr_demote);
>> +	if (err)
>> +		return err;
>> +	h = kobj_to_hstate(kobj, &nid);
>> +
>> +	/* Synchronize with other sysfs operations modifying huge pages */
>> +	mutex_lock(&h->resize_lock);
>> +
>> +	if (nid != NUMA_NO_NODE) {
>> +		init_nodemask_of_node(&nodes_allowed, nid);
>> +		n_mask = &nodes_allowed;
>> +	} else {
>> +		n_mask = &node_states[N_MEMORY];
>> +	}
>
> Why this needs to be protected by the resize_lock? I do not understand
> what are we really protecting here and from what.

In general, the resize_lock prevents unexpected consequences when multiple
users are modifying the number of pages in a pool concurrently from the
proc/sysfs interfaces.  The mutex is acquired here because we are modifying
(decreasing) the pool size.

When the mutex was added, Michal asked about the need.  Theoretically, all
code making pool adjustments should be safe because, at a low level, the
hugetlb_lock is taken when the hstate is modified.  However, I did point
out that there is an hstate variable 'next_nid_to_alloc' which is used
outside the lock, and this could result in pages being allocated from the
wrong node.  One could argue that if two (root) sysadmin users are
modifying the pool concurrently, they should not be surprised by such
consequences.  The mutex seemed like a small price to avoid any such
potential issues.  It is taken here to be consistent with this model.

Coincidentally, I was running some stress testing with this series last
night and noticed some unexpected behavior.  As a result, we will also
need to take the resize_lock of the 'target_hstate'.  This is all in
patch 5 of the series.  I will add details there.
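
To make the next_nid_to_alloc concern a bit more concrete, here is a rough
user-space sketch of the pattern (the names in it are analogues made up for
illustration, not the real hugetlb code).  The fine-grained lock keeps the
counters consistent, but the node choice happens outside it; serializing
the whole adjustment with the coarse mutex is what keeps two concurrent
writers from interleaving node selection:

/*
 * Rough user-space sketch only; resize_lock, pool_lock, next_nid_to_alloc
 * and grow_pool are analogues made up for illustration, not the real
 * hugetlb code.
 */
#include <pthread.h>
#include <stdio.h>

#define NR_NODES	2

static pthread_mutex_t resize_lock = PTHREAD_MUTEX_INITIALIZER;	/* per-hstate analogue */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;	/* hugetlb_lock analogue */

static int next_nid_to_alloc;		/* consulted outside pool_lock */
static long nr_huge_pages[NR_NODES];

static void alloc_one_page(void)
{
	/* Node choice is made before the fine-grained lock is taken. */
	int nid = next_nid_to_alloc;

	next_nid_to_alloc = (nid + 1) % NR_NODES;

	pthread_mutex_lock(&pool_lock);
	nr_huge_pages[nid]++;		/* the counters themselves stay consistent */
	pthread_mutex_unlock(&pool_lock);
}

static void *grow_pool(void *arg)
{
	long i, nr = (long)arg;

	/* Serialize the whole adjustment, as the sysfs handlers do. */
	pthread_mutex_lock(&resize_lock);
	for (i = 0; i < nr; i++)
		alloc_one_page();
	pthread_mutex_unlock(&resize_lock);

	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, grow_pool, (void *)8L);
	pthread_create(&b, NULL, grow_pool, (void *)8L);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	/* With resize_lock held, each writer's round-robin is not interleaved. */
	printf("node0=%ld node1=%ld\n", nr_huge_pages[0], nr_huge_pages[1]);
	return 0;
}

-- 
Mike Kravetz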