On Thu, Mar 19, 2020 at 01:22:38PM -0400, Steven Rostedt wrote:
> On Wed, 18 Mar 2020 17:10:39 -0700
> paulmck@xxxxxxxxxx wrote:
>
> > From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> >
> > A running task's state can be sampled in a consistent manner (for example,
> > for diagnostic purposes) simply by invoking smp_call_function_single()
> > on its CPU, which may be obtained using task_cpu(), then having the
> > IPI handler verify that the desired task is in fact still running.
> > However, if the task is not running, this sampling can in theory be done
> > immediately and directly.  In practice, the task might start running at
> > any time, including during the sampling period.  Gaining a consistent
> > sample of a not-running task therefore requires that something be done
> > to lock down the target task's state.
> >
> > This commit therefore adds a try_invoke_on_locked_down_task() function
> > that invokes a specified function if the specified task can be locked
> > down, returning true if successful and if the specified function returns
> > true.  Otherwise this function simply returns false.  Given that the
> > function passed to try_invoke_on_nonrunning_task() might be invoked with
> > a runqueue lock held, that function had better be quite lightweight.
> >
> > The function is passed the target task's task_struct pointer and the
> > argument passed to try_invoke_on_locked_down_task(), allowing easy access
> > to task state and to a location for further variables to be passed in
> > and out.
> >
> > Note that the specified function will be called even if the specified
> > task is currently running.  The function can use ->on_rq and task_curr()
> > to quickly and easily determine the task's state, and can return false
> > if this state is not to the function's liking.  The caller of teh
>
> s/teh/the/

Good eyes, will fix!
> > try_invoke_on_locked_down_task() would then see the false return value,
> > and could take appropriate action, for example, trying again later or
> > sending an IPI if matters are more urgent.
> >
> > It is expected that use cases such as the RCU CPU stall warning code will
> > simply return false if the task is currently running.  However, there are
> > use cases involving nohz_full CPUs where the specified function might
> > instead fall back to an alternative sampling scheme that relies on heavier
> > synchronization (such as memory barriers) in the target task.
> >
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > [ paulmck: Apply feedback from Peter Zijlstra and Steven Rostedt. ]
> > [ paulmck: Invoke if running to handle feedback from Mathieu Desnoyers. ]
> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
> > Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> > Cc: Ben Segall <bsegall@xxxxxxxxxx>
> > Cc: Mel Gorman <mgorman@xxxxxxx>
> > ---
> >  include/linux/wait.h |  2 ++
> >  kernel/sched/core.c  | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 50 insertions(+)
> >
> > diff --git a/include/linux/wait.h b/include/linux/wait.h
> > index 3283c8d..e2bb8ed 100644
> > --- a/include/linux/wait.h
> > +++ b/include/linux/wait.h
> > @@ -1148,4 +1148,6 @@ int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, i
> >  	(wait)->flags = 0;						\
> >  } while (0)
> >
> > +bool try_invoke_on_locked_down_task(struct task_struct *p, bool (*func)(struct task_struct *t, void *arg), void *arg);
> > +
> >  #endif /* _LINUX_WAIT_H */
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index fc1dfc0..195eba0 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2580,6 +2580,8 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
> >  	 *
> >  	 * Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
> >  	 * __schedule().  See the comment for smp_mb__after_spinlock().
> > +	 *
> > +	 * A similar smp_rmb() lives in try_invoke_on_locked_down_task().
> >  	 */
> >  	smp_rmb();
> >  	if (p->on_rq && ttwu_remote(p, wake_flags))
> > @@ -2654,6 +2656,52 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
> >  }
> >
> >  /**
> > + * try_invoke_on_locked_down_task - Invoke a function on task in fixed state
> > + * @p: Process for which the function is to be invoked.
> > + * @func: Function to invoke.
> > + * @arg: Argument to function.
> > + *
> > + * If the specified task can be quickly locked into a definite state
> > + * (either sleeping or on a given runqueue), arrange to keep it in that
> > + * state while invoking @func(@arg).  This function can use ->on_rq and
> > + * task_curr() to work out what the state is, if required.  Given that
> > + * @func can be invoked with a runqueue lock held, it had better be quite
> > + * lightweight.
> > + *
> > + * Returns:
> > + *	@false if the task slipped out from under the locks.
> > + *	@true if the task was locked onto a runqueue or is sleeping.
> > + *		However, @func can override this by returning @false.
>
> Should probably state that it will return false if the state could not be
> locked, otherwise it returns the return code of the function.

So like this?

 * Returns:
 *	@false if the task state could not be locked.
 *	Otherwise, the return value from @func(arg).

> I'm wondering if we shouldn't have the function return code be something
> passed in by the parameter, and have this return either true (locked and
> function called), or false (not locked and function wasn't called).

I was thinking of this as one of the possible uses of whatever arg
points to, which allows the caller of try_invoke_on_locked_down_task()
and the specified function to communicate whatever they wish.
Then the specified function could (for example) unconditionally return
true so that the return value from try_invoke_on_locked_down_task()
indicated whether or not the specified function was called.

The current setup is very convenient for the use cases thus far.  It
allows the function to say "Yeah, I was called, but I couldn't do
anything", thus allowing the caller to make exactly one check to know
that corrective action is required.

> > + */
> > +bool try_invoke_on_locked_down_task(struct task_struct *p, bool (*func)(struct task_struct *t, void *arg), void *arg)
> > +{
> > +	bool ret = false;
> > +	struct rq_flags rf;
> > +	struct rq *rq;
> > +
> > +	lockdep_assert_irqs_enabled();
> > +	raw_spin_lock_irq(&p->pi_lock);
> > +	if (p->on_rq) {
> > +		rq = __task_rq_lock(p, &rf);
> > +		if (task_rq(p) == rq)
> > +			ret = func(p, arg);
> > +		rq_unlock(rq, &rf);
> > +	} else {
> > +		switch (p->state) {
> > +		case TASK_RUNNING:
> > +		case TASK_WAKING:
> > +			break;
> > +		default:
>
> Don't we need a comment here about why we have a rmb() and where the
> matching wmb() is?

Indeed we do!  I will fix that as well.

							Thanx, Paul

> -- Steve
>
> > +			smp_rmb();
> > +			if (!p->on_rq)
> > +				ret = func(p, arg);
> > +		}
> > +	}
> > +	raw_spin_unlock_irq(&p->pi_lock);
> > +	return ret;
> > +}
> > +
> > +/**
> >  * wake_up_process - Wake up a specific process
> >  * @p: The process to be woken up.
> >  *
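For readers following the return-value discussion above, here is a rough
userspace sketch of the calling convention being proposed.  This is not
kernel code: fake_task, sample, sample_state, and try_invoke are invented
stand-ins with no real locking, meant only to show how the arg pointer
lets the caller distinguish "state could not be locked" from "callback
ran but returned false" with a single call.

```c
#include <stdbool.h>

/* Toy stand-in for struct task_struct: just enough state for the sketch. */
struct fake_task {
	bool on_rq;	/* analogous to p->on_rq */
	int state;	/* 0 plays the role of TASK_RUNNING here */
};

/* What the caller and the callback use *arg to communicate. */
struct sample {
	bool invoked;	/* set by the callback: did it run at all? */
	int state;	/* sampled state, valid only if ->invoked */
};

/* Callback in the style of the @func parameter: record the state, and
 * return false if that state is "not to its liking" (here: if the task
 * is queued on a runqueue). */
static bool sample_state(struct fake_task *t, void *arg)
{
	struct sample *s = arg;

	s->invoked = true;
	s->state = t->state;
	return !t->on_rq;
}

/* Control-flow-only analogue of try_invoke_on_locked_down_task():
 * no locking, just the return-value convention under discussion. */
static bool try_invoke(struct fake_task *t,
		       bool (*func)(struct fake_task *t, void *arg),
		       void *arg)
{
	if (t->state == 0)	/* "running": state cannot be locked down */
		return false;	/* func was never called */
	return func(t, arg);	/* state "locked": func's verdict wins */
}
```

With this convention, try_invoke() returning false while sample.invoked
is true means "called, but couldn't do anything", which is the one-check
property described above; arg carries whatever else the two sides want
to pass in and out.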