Re: [External] Re: [PATCH v2] mm: add new syscall pidfd_set_mempolicy().

Zhongkun He <hezhongkun.hzk@xxxxxxxxxxxxx> · Wed, 16 Nov 2022 17:38:09 +0800

Hi Ying, thanks for your reply and suggestions.

I suggest moving the flags in "mode" parameter (MPOL_F_STATIC_NODES,
MPOL_F_RELATIVE_NODES, MPOL_F_NUMA_BALANCING, etc.) to "flags"
parameter, otherwise, why add it?

The "flags" is used for future extension if any, just like
process_madvise() and set_mempolicy_home_node().
Maybe it should be removed.
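
For reference, a minimal userspace sketch of what separating the mode
flags from "mode" could look like. The MPOL_* values mirror
include/uapi/linux/mempolicy.h and are redefined here only so the
snippet builds on its own; struct mpol_args, split_mode() and
mode_flags_valid() are hypothetical names used for illustration and are
not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Values mirror include/uapi/linux/mempolicy.h; redefined locally so the
 * sketch builds without kernel or libnuma headers. */
#define MPOL_BIND              2
#define MPOL_F_NUMA_BALANCING  (1 << 13)
#define MPOL_F_RELATIVE_NODES  (1 << 14)
#define MPOL_F_STATIC_NODES    (1 << 15)
#define MPOL_MODE_FLAGS \
	(MPOL_F_STATIC_NODES | MPOL_F_RELATIVE_NODES | MPOL_F_NUMA_BALANCING)

/* Hypothetical container: the bare mode and the mode flags kept apart. */
struct mpol_args {
	int mode;
	int mode_flags;
};

/* Hypothetical helper: split a set_mempolicy()-style combined "mode"
 * argument into the bare mode and the optional mode flags. */
static struct mpol_args split_mode(int combined)
{
	struct mpol_args a = {
		.mode = combined & ~MPOL_MODE_FLAGS,
		.mode_flags = combined & MPOL_MODE_FLAGS,
	};

	/* After the split, no flag bit may remain in the bare mode. */
	assert((a.mode & MPOL_MODE_FLAGS) == 0);
	return a;
}

/* Hypothetical helper: static and relative nodemasks are mutually
 * exclusive, so reject a flags word that sets both. */
static bool mode_flags_valid(int mode_flags)
{
	return !((mode_flags & MPOL_F_STATIC_NODES) &&
		 (mode_flags & MPOL_F_RELATIVE_NODES));
}
```

With such a split, a caller passing MPOL_BIND | MPOL_F_STATIC_NODES
today would pass MPOL_BIND as "mode" and MPOL_F_STATIC_NODES as "flags".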

And, how about adding a "home_node" parameter?  I don't think that it's a
good idea to add another new syscall for pidfd_set_mempolicy_home_node()
in the future.

Good idea, but "home_node" is used for vma policy, not task policy.
It is possible to use it in pidfd_mbind() in the future.

IMHO, "The first four APIs" and "The last one" aren't easy to
understand.  How about

"sys_pidfd_set_mempolicy sets the mempolicy of task specified in the
pidfd, the others affect only the calling task, ...".

Got it.

Why add "sys_"?  I found that there's no "sys_" before set_mempolicy()/mbind() etc.

Got it.

+void mpol_put_async(struct task_struct *task, struct mempolicy *p)

How about changing __mpol_put() directly?

Why can we fall back to freeing directly if task_work_add() failed?
Should we check the return code and fall back only if -ESRCH and WARN
for other cases?

A task_work based solution has not been accepted yet; it will be
considered later if needed.

+	}

Why do we need to write lock mmap_sem?  IIUC, we don't touch vma.

Yes, it should be removed.

  /*

Because we will change task_struct->mempolicy in another task, we need
to use some kind of "load acquire" / "store release" memory ordering.  For
example, rcu_dereference() / rcu_assign_pointer(), etc.
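
To illustrate the ordering requirement, here is a minimal userspace
sketch using C11 atomics in place of rcu_assign_pointer() /
rcu_dereference(). struct policy and struct task are stand-ins for
struct mempolicy and task_struct, and publish_policy() / read_mode() are
hypothetical names. The sketch covers only the publish/consume ordering;
it does not model the lifetime side (reference counting or waiting for
readers before freeing the old policy).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for struct mempolicy. */
struct policy {
	int mode;
};

/* Stand-in for task_struct: the policy pointer may be replaced by
 * another thread, so it is accessed only through atomics. */
struct task {
	_Atomic(struct policy *) pol;
};

/* Writer side ("store release"): every store that initialised *p is
 * ordered before the pointer becomes visible to readers. */
static void publish_policy(struct task *t, struct policy *p)
{
	assert(t != NULL);
	atomic_store_explicit(&t->pol, p, memory_order_release);
}

/* Reader side ("load acquire"): if the new pointer is observed, the
 * fields it points to are observed fully initialised as well. */
static int read_mode(struct task *t, int fallback)
{
	struct policy *p;

	assert(t != NULL);
	p = atomic_load_explicit(&t->pol, memory_order_acquire);
	return p ? p->mode : fallback;
}
```

In the kernel the same pairing would normally be expressed with
rcu_assign_pointer() on the writer side and rcu_dereference() under
rcu_read_lock() on the reader side, as suggested above.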

Thanks again for your suggestion.

Best Regards,
Zhongkun