On Thu, Nov 21, 2024 at 02:40:44PM +0100, David Hildenbrand wrote: > On 21.11.24 13:24, Zhen Ni wrote: > > 1.Removing redundant checks for current->mempolicy, with a more concise > > check order. > > > > 2.Using READ_ONCE(current->mempolicy) for safe, single access to > > current->mempolicy to prevent potential race conditions. > > > > 3.Optimizing the scope of task_lock(current). The lock now only protects > > the critical section where mempolicy is accessed, reducing the duration > > the lock is held. This enhances performance by limiting the scope of the > > lock to only what is necessary. > > While optimizing task lock scopes is admirable, we're talking about a scale of maybe a microsecond length access on a lock that's typically held quite a bit longer - if it's ever really held given the context of this function. What you'd actually want to do is the opposite: task_lock(current); mempolicy = get_mempolicy(); task_unlock(current); /* do work */ put_mempolicy(); but even this is only optimizing a small amount of work. And if you look at what init_nodemask_of_mempolicy is actually used for, there is no need for this optimization - the task is almost certainly short lived and intended to just adjust a system control value. > > Signed-off-by: Zhen Ni <zhen.ni@xxxxxxxxxxxx> > > --- > > mm/mempolicy.c | 7 +++++-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/mm/mempolicy.c b/mm/mempolicy.c > > index b646fab3e45e..8bff8830b7e6 100644 > > --- a/mm/mempolicy.c > > +++ b/mm/mempolicy.c > > @@ -2132,11 +2132,14 @@ bool init_nodemask_of_mempolicy(nodemask_t *mask) > > { > > struct mempolicy *mempolicy; > > - if (!(mask && current->mempolicy)) > > + if (!mask)> + return false; > > + > > + mempolicy = READ_ONCE(current->mempolicy); > > + if (!mempolicy) > > return false; > > task_lock(current); > > - mempolicy = current->mempolicy; > > switch (mempolicy->mode) { > > case MPOL_PREFERRED: > > case MPOL_PREFERRED_MANY: > > (a) current->mempolicy can only change under the task_lock(), see > do_set_mempolicy(). > > (b) current->mempolicy cannot usually NULL once it was none-NULL. There are > two exceptions: > > copy_process() might set it to NULL when creating the process, before the > task gets scheduled. > > do_exit() might set it to NULL via mpol_put_task_policy(). > > I don't think init_nodemask_of_mempolicy() can be called while our task is > exiting ... > Unless someone put an access to nr_hugepages_mempolicy in the exit path, I doubt it (also that would probably deadlock :]) > > Now, if you read current->mempolicy outside of task_lock() to the > *dereference* it when it might already have changed+was freed, that's a BUG > you're introducing. > 100% a bug, although since the task mempolicy can't currently change outside the context of the task - I can't think of a a way poke this particular bug. ~Gregory