Sorry, didn't have much time to do a proper review. Couple of points
here at least.

On Wed 22-11-23 17:24:10, Gregory Price wrote:
> On Wed, Nov 22, 2023 at 01:33:48PM -0800, Andrew Morton wrote:
> > On Wed, 22 Nov 2023 16:11:49 -0500 Gregory Price <gourry.memverge@xxxxxxxxx> wrote:
> >
> > > The patch set changes task->mempolicy to be modifiable by tasks other
> > > than just current.
> > >
> > > The ultimate goal is to make mempolicy more flexible and extensible,
> > > such as adding interleave weights (which may need to change at runtime
> > > due to hotplug events). Making mempolicy externally modifiable allows
> > > for userland daemons to make runtime performance adjustments to running
> > > tasks without that software needing to be made numa-aware.
> >
> > Please add to this [0/N] a full description of the security aspect: who
> > can modify whose mempolicy, along with a full description of the
> > reasoning behind this decision.
> >
>
> Will do. For the sake of v0 for now:
>
> 1) the task itself (task == current)
>    for obvious reasons: it already can
>
> 2) from external interfaces: CAP_SYS_NICE

Makes sense.

[...]
> > > 3. Add external interfaces which allow for a task mempolicy to be
> > >    modified by another task. This is implemented in 4 syscalls
> > >    and a procfs interface:
> > >         sys_set_task_mempolicy
> > >         sys_get_task_mempolicy
> > >         sys_set_task_mempolicy_home_node
> > >         sys_task_mbind
> > >         /proc/[pid]/mempolicy
> >
> > Why is the procfs interface needed? Doesn't it simply duplicate the
> > syscall interface? Please update [0/N] with a description of this
> > decision.
> >
>
> Honestly I wrote the procfs interface first, and then came back around
> to just implement the syscalls. mbind is not friendly to being procfs'd
> so if the preference is to have only one, not both, then it should
> probably be the syscalls.
>
> That said, when I introduce weighted interleave on top of this, having a
> simple procfs interface to those weights would be valuable, so I
> imagined something like `proc/mempolicy` to determine if interleave was
> being used and something like `proc/mpol_interleave_weights` for a clean
> interface to update weights.
>
> However, in the same breath, I have a prior RFC with set/get_mempolicy2
> which could probably take all future mempolicy extensions and wrap them
> up into one pair of syscalls, instead of us ending up with 200 more
> sys_mempolicy_whatever as memory attached fabrics become more common.
>
> So... yeah... this is one area I think the community very much needs to
> comment: set/get_mempolicy2, many new mempolicy syscalls, procfs? All
> of the above?

I think we should actively avoid using a proc interface. The most
reasonable way would be to add a get_mempolicy2 interface that would
allow extensions and then create a pidfd counterpart to allow acting on
a remote task. The latter would require some changes to make the
mempolicy code less current-oriented.
-- 
Michal Hocko
SUSE Labs
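
For readers following the thread, here is one way an interface "that
would allow extensions" is often built in Linux: clone3() and openat2()
take a size-versioned argument struct, which the kernel copies in with
copy_struct_from_user(). The userspace sketch below only models that
copying rule. It is not taken from the patch set or from the
set/get_mempolicy2 RFC: struct mpol_args_sketch, its fields,
MPOL_ARGS_SIZE_VER0 and copy_args_sketch() are all hypothetical names
chosen for illustration, and the actual argument layout might look quite
different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical argument block; the fields are illustrative only. */
struct mpol_args_sketch {
	unsigned long long mode;      /* present since hypothetical v0 */
	unsigned long long home_node; /* present since hypothetical v0 */
	unsigned long long flags;     /* appended in a hypothetical v1 */
};

/* Size of the first published layout (mode + home_node). */
#define MPOL_ARGS_SIZE_VER0 16

static_assert(sizeof(struct mpol_args_sketch) == 24,
	      "sketch assumes three packed 64-bit fields");

/*
 * Userspace model of the size-versioned copy rule:
 *  - caller struct smaller than ours: copy it, zero-fill the new fields
 *  - caller struct larger than ours: accept only if the unknown tail
 *    is all zero, otherwise fail with -E2BIG
 *  - caller struct smaller than v0: -EINVAL
 */
static int copy_args_sketch(struct mpol_args_sketch *dst,
			    const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t ksize = sizeof(*dst);
	size_t n = usize < ksize ? usize : ksize;

	if (usize < MPOL_ARGS_SIZE_VER0)
		return -EINVAL;

	/* Reject non-zero bytes we do not know how to interpret. */
	for (size_t i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -E2BIG;

	memset(dst, 0, ksize);
	memcpy(dst, src, n);
	return 0;
}
```

With this rule an old binary passing the 16-byte layout keeps working
after a field is appended, and a newer binary works on an older kernel
as long as it leaves the fields that kernel does not know about zeroed.
A pidfd counterpart would presumably take the same struct plus a pidfd
argument, but that is a guess rather than anything proposed above.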