On Mon 27-11-23 11:14:44, Gregory Price wrote:
> On Mon, Nov 27, 2023 at 04:29:56PM +0100, Michal Hocko wrote:
> > Sorry, didn't have much time to do a proper review. Couple of points
> > here at least.
> >
> > >
> > > So... yeah... the is one area I think the community very much needs to
> > > comment: set/get_mempolicy2, many new mempolicy syscalls, procfs? All
> > > of the above?
> >
> > I think we should actively avoid using proc interface. The most
> > reasonable way would be to add get_mempolicy2 interface that would allow
> > extensions and then create a pidfd counterpart to allow acting on a
> > remote task. The latter would require some changes to make mempolicy
> > code less current oriented.
>
> Sounds good, I'll pull my get/set_mempolicy2 RFC on top of this.
>
> Just context: patches 1-6 refactor mempolicy to allow remote task
> twiddling (fixing the current-oriented issues), and patch 7 adds the pidfd
> interfaces you describe above.
>
>
> Couple Questions
>
> 1) Should we consider simply adding a pidfd arg to set/get_mempolicy2,
>    where if (pidfd == 0), then it operates on current, otherwise it
>    operates on the target task? That would mitigate the need for what
>    amounts to the exact same interface.

This wouldn't fit into existing pidfd interfaces I am aware of. We
assume pidfd to be real fd, no special cases.

> 2) Should we combine all the existing operations into set_mempolicy2 and
>    add an operation arg.
>
>    set_mempolicy2(pidfd, arg_struct, len)
>
>    struct {
>      int pidfd; /* optional */
>      int operation; /* describe which op_args to use */
>      union {
>        struct {
>        } set_mempolicy;
>        struct {
>        } set_vma_home_node;
>        struct {
>        } mbind;
>        ...
>      } op_args;
>    } args;
>
>    capturing:
>      sys_set_mempolicy
>      sys_set_mempolicy_home_node
>      sys_mbind
>
>    or should we just make a separate interface for mbind/home_node to
>    limit complexity of the single syscall?

My preference would be to go with specific syscalls. Multiplexing
syscalls have turned much more complex and less flexible over time. Just
have a look at futex.
-- 
Michal Hocko
SUSE Labs
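
To make the two shapes under discussion concrete, here is a minimal
userspace sketch, not kernel code and not anything from the patch series.
Every name in it (mp_args, mp_multiplex, do_set_policy and so on) is
hypothetical, and the argument structs are invented placeholders since the
proposal above leaves them empty. It only illustrates that a multiplexed
entry point has to validate the operation code and the struct length at
run time, while dedicated entry points get their argument types checked by
the compiler.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation codes; not a kernel ABI. */
enum mp_op { MP_OP_SET_POLICY, MP_OP_SET_HOME_NODE, MP_OP_MBIND };

/* Placeholder argument structs (fields are assumptions for illustration). */
struct mp_set_policy { int mode; unsigned long nodemask; };
struct mp_home_node  { unsigned long start, len; int node; };
struct mp_mbind      { unsigned long start, len; int mode; unsigned long nodemask; };

/* Multiplexed argument block, shaped like the struct proposed above. */
struct mp_args {
	int operation;	/* selects which op_args member is valid */
	union {
		struct mp_set_policy set_policy;
		struct mp_home_node  home_node;
		struct mp_mbind      mbind;
	} op_args;
};

/* Dedicated entry points: one typed argument each, trivial validation. */
static int do_set_policy(const struct mp_set_policy *a)
{
	return a->mode < 0 ? -EINVAL : 0;
}

static int do_set_home_node(const struct mp_home_node *a)
{
	return a->node < 0 ? -EINVAL : 0;
}

static int do_mbind(const struct mp_mbind *a)
{
	return (a->mode < 0 || a->len == 0) ? -EINVAL : 0;
}

/*
 * Multiplexed entry point: must check the pointer, the length and the
 * operation code before it can even reach the per-operation logic.
 */
static int mp_multiplex(const struct mp_args *args, size_t len)
{
	if (!args || len < sizeof(*args))
		return -EINVAL;

	switch (args->operation) {
	case MP_OP_SET_POLICY:
		return do_set_policy(&args->op_args.set_policy);
	case MP_OP_SET_HOME_NODE:
		return do_set_home_node(&args->op_args.home_node);
	case MP_OP_MBIND:
		return do_mbind(&args->op_args.mbind);
	default:
		return -EINVAL;	/* unknown operation */
	}
}
```

In this sketch every new operation grows the union and the switch, and a
caller who sets the wrong operation code for the member it filled in is
only caught at run time, if at all. With the dedicated functions that
mismatch cannot be expressed.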