Hello, On Fri, Oct 13, 2023 at 02:11:19PM -0400, Waiman Long wrote: > When the "isolcpus" boot command line option is used to add a set > of isolated CPUs, those CPUs will be excluded automatically from > wq_unbound_cpumask to avoid running work functions from unbound > workqueues. > > Recently cpuset has been extended to allow the creation of partitions > of isolated CPUs dynamically. To make it closer to the "isolcpus" > in functionality, the CPUs in those isolated cpuset partitions should be > excluded from wq_unbound_cpumask as well. This can be done currently by > explicitly writing to the workqueue's cpumask sysfs file after creating > the isolated partitions. However, this process can be error prone. > Ideally, the cpuset code should be allowed to request the workqueue code > to exclude those isolated CPUs from wq_unbound_cpumask so that this > operation can be done automatically and the isolated CPUs will be returned > back to wq_unbound_cpumask after the destructions of the isolated > cpuset partitions. > > This patch adds a new workqueue_unbound_exclude_cpumask() to enable > that. This new function will exclude the specified isolated CPUs > from wq_unbound_cpumask. To be able to restore those isolated CPUs > back after the destruction of isolated cpuset partitions, a new > wq_user_unbound_cpumask is added to store the user provided unbound > cpumask either from the boot command line options or from writing to > the cpumask sysfs file. This new cpumask provides the basis for CPU > exclusion. The behaviors around wq_unbound_cpumask is getting pretty inconsistent: 1. Housekeeping excludes isolated CPUs on boot but allows user to override it to include isolated CPUs afterwards. 2. If an unbound wq's cpumask doesn't have any intersection with wq_unbound_cpumask we ignore the per-wq cpumask and falls back to wq_unbound_cpumask. 3. You're adding a masking layer on top with exclude which fails to set if the intersection is empty. Can we do the followings for consistency? 1. User's requested_unbound_cpumask is stored separately (as in this patch). 2. The effect wq_unbound_cpumask is determined by requested_unbound_cpumask & housekeeping_cpumask & cpuset_allowed_cpumask. The operation order matters. When an & operation yields an cpumask, the cpumask from the previous step is the effective one. 3. Expose these cpumasks in sysfs so that what's happening is obvious. > +int workqueue_unbound_exclude_cpumask(cpumask_var_t exclude_cpumask) > +{ > + cpumask_var_t cpumask; > + int ret = 0; > + > + if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL)) > + return -ENOMEM; > + > + /* > + * The caller of this function may have called cpus_read_lock(), > + * use cpus_read_trylock() to avoid potential deadlock. > + */ > + if (!cpus_read_trylock()) > + return -EBUSY; This means that a completely unrelated cpus_write_lock() can fail this operation and thus cpuset config writes. Let's please not do this. Can't we just make sure that the caller holds the lock? > + mutex_lock(&wq_pool_mutex); > + > + if (!cpumask_andnot(cpumask, wq_user_unbound_cpumask, exclude_cpumask)) > + ret = -EINVAL; /* The new cpumask can't be empty */ For better or worse, the usual mode-of-failure for "no usable CPU" is just falling back to something which works rather than failing the operation. Let's follow that. Thanks. -- tejun