On 2022-04-11 09:28, Greg KH wrote:
On Mon, Apr 11, 2022 at 09:18:19AM +0200, Holger Hoffstätte wrote:
On 2022-04-11 01:22, Holger Hoffstätte wrote:
On 2022-04-11 00:06, Qais Yousef wrote:
On 04/10/22 00:38, Qais Yousef wrote:
On 03/08/22 18:51, Qais Yousef wrote:
On 03/08/22 19:10, Greg KH wrote:
On Tue, Mar 08, 2022 at 06:02:40PM +0000, Qais Yousef wrote:
+CC stable
On 03/01/22 15:24, tip-bot2 for Valentin Schneider wrote:
The following commit has been merged into the sched/core branch of tip:
Commit-ID: fa2c3254d7cfff5f7a916ab928a562d1165f17bb
Gitweb: https://git.kernel.org/tip/fa2c3254d7cfff5f7a916ab928a562d1165f17bb
Author: Valentin Schneider <valentin.schneider@xxxxxxx>
AuthorDate: Thu, 20 Jan 2022 16:25:19
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Tue, 01 Mar 2022 16:18:39 +01:00
sched/tracing: Don't re-read p->state when emitting sched_switch event
As of commit
c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
the following sequence becomes possible:
  p->__state = TASK_INTERRUPTIBLE;
  __schedule()
    deactivate_task(p);
                                  ttwu()
                                    READ !p->on_rq
                                    p->__state=TASK_WAKING
    trace_sched_switch()
      __trace_sched_switch_state()
        task_state_index()
          return 0;
TASK_WAKING isn't in TASK_REPORT, so the task appears as TASK_RUNNING in
the trace event.
Prevent this by pushing the value read from __schedule() down the trace
event.
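For readers following along, the effect described in the commit message can be replayed in a small userspace sketch. This is not kernel code: the state values and the TASK_REPORT mask are written from memory of include/linux/sched.h and should be treated as assumptions, and fls_u(), state_index(), trace_index_reread(), trace_index_passed() and replay() are hypothetical names made up for illustration. The sketch also leaves out parts of the real task_state_index(), such as exit_state handling.

```c
#include <assert.h>
#include <stddef.h>

/* Task state bits; values recalled from include/linux/sched.h (assumption). */
#define TASK_RUNNING         0x0000u
#define TASK_INTERRUPTIBLE   0x0001u
#define TASK_UNINTERRUPTIBLE 0x0002u
#define TASK_WAKING          0x0200u
/* Mask of the states that are reported to userspace (assumption). */
#define TASK_REPORT          0x007fu

/* Minimal stand-in for struct task_struct (hypothetical). */
struct task {
	unsigned int state;
};

/* Userspace stand-in for fls(): 1-based index of the highest set bit, 0 if none. */
static unsigned int fls_u(unsigned int x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Simplified model of task_state_index(): mask to reportable bits, then fls. */
static unsigned int state_index(unsigned int state)
{
	return fls_u(state & TASK_REPORT);
}

/* Before the fix: the trace path re-reads p->state, which a waker may have changed. */
static unsigned int trace_index_reread(const struct task *p)
{
	return state_index(p->state);
}

/* After the fix: the trace path uses the value __schedule() already sampled. */
static unsigned int trace_index_passed(unsigned int prev_state)
{
	return state_index(prev_state);
}

/* Replays the interleaving from the commit message on a single thread. */
void replay(unsigned int *reread, unsigned int *passed)
{
	struct task p = { .state = TASK_INTERRUPTIBLE };
	unsigned int prev_state;

	assert(reread != NULL && passed != NULL);

	prev_state = p.state;     /* __schedule() samples the state */
	p.state = TASK_WAKING;    /* ttwu() on another CPU rewrites it */

	*reread = trace_index_reread(&p);
	*passed = trace_index_passed(prev_state);
}
```

Calling replay() from a small main() yields index 0 (which reads as TASK_RUNNING) for the re-read variant and index 1 (TASK_INTERRUPTIBLE) for the variant that is handed the sampled value, which matches the misreport described above.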
Reported-by: Abhijeet Dharmapurikar <adharmap@xxxxxxxxxxx>
Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
Link: https://lore.kernel.org/r/20220120162520.570782-2-valentin.schneider@xxxxxxx
Any objection to picking this for stable? I'm interested in this one for some
Android users, but would prefer that it be taken by stable rather than
backporting it individually.
I think it makes sense to pick the next one in the series too.
What commit does this fix in Linus's tree?
It should be this one: c6e7bd7afaeb ("sched/core: Optimize ttwu() spinning on p->on_cpu")
Should this be okay to be picked up by stable now? I can see AUTOSEL has picked
it up for v5.15+, but it impacts v5.10 too.
commit: fa2c3254d7cfff5f7a916ab928a562d1165f17bb
subject: sched/tracing: Don't re-read p->state when emitting sched_switch event
This patch has an impact on Android 5.10 users who experience tooling breakage.
Is it possible to include it in 5.10 LTS, please?
It was already picked up for 5.15+ by AUTOSEL and only 5.10 is missing.
https://lore.kernel.org/stable/Yk2PQzynOVOzJdPo@xxxxxxxxx/
However, since then further investigation (still in progress) has shown that this
may have been the fault of the tool in question, so if you can verify that tracing
sched still works for you with this patch in 5.15.x then by all means
let's merge it.
So it turns out the lockup is indeed the fault of the tool, which contains multiple
kernel-version dependent tracepoint definitions and now fails with this
patch.
What tool is this?
sysdig - which uses a helper kernel module which accesses tracepoints, but of course
(as I just found) with copypasta'd TP definitions, which broke with this patch due to
the additional parameter in the function signature. It's been prone to breakage forever
because of a lack of a stable kernel ABI.
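To illustrate the kind of breakage described here, the following is a rough userspace sketch of a probe whose prototype must track the tracepoint's. It is not sysdig's code. struct task_struct is a stand-in, SCHED_SWITCH_HAS_PREV_STATE is a hypothetical switch (how a real out-of-tree module would detect the changed prototype, especially across stable backports, is left open), and the position of prev_state in the argument list is an assumption that should be checked against the target kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct task_struct (hypothetical, userspace only). */
struct task_struct {
	int pid;
};

/*
 * Hypothetical feature switch: 1 when the sched_switch tracepoint of the
 * kernel being built against carries the extra prev_state argument.
 */
#ifndef SCHED_SWITCH_HAS_PREV_STATE
#define SCHED_SWITCH_HAS_PREV_STATE 1
#endif

/* Last switch seen by the probe, kept only so the sketch has visible output. */
int last_prev_pid = -1;
int last_next_pid = -1;

/* Shared body that does not depend on the tracepoint prototype. */
static void handle_switch(const struct task_struct *prev,
			  const struct task_struct *next)
{
	assert(prev != NULL && next != NULL);
	last_prev_pid = prev->pid;
	last_next_pid = next->pid;
}

#if SCHED_SWITCH_HAS_PREV_STATE
/* Prototype with the additional parameter (argument position is an assumption). */
void probe_sched_switch(void *data, bool preempt, unsigned int prev_state,
			struct task_struct *prev, struct task_struct *next)
{
	(void)data;
	(void)preempt;
	(void)prev_state;
	handle_switch(prev, next);
}
#else
/* Older prototype without prev_state. */
void probe_sched_switch(void *data, bool preempt,
			struct task_struct *prev, struct task_struct *next)
{
	(void)data;
	(void)preempt;
	handle_switch(prev, next);
}
#endif
```

A probe compiled against the wrong variant would receive its arguments shifted by one slot, which is one plausible way for a copied tracepoint definition to misbehave once the extra parameter appears.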
Took me a while to find/figure out, but IMHO better safe than sorry. We've had
autoselected scheduler patches before that looked fine but really were not.
Greg, please re-enqueue this patch where necessary (5.10, 5.15+)
If I queue it up again, will the tools keep breaking?
Yes, but that's their problem with an out-of-tree module; a few more #ifdefs
are not going to make a big difference.
thanks
Holger