On Tue, Aug 1, 2023 at 7:34 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: > > On Tue, Aug 01, 2023 at 05:01:21PM +0800, Ze Gao wrote: > > Internal representations of task state are likely to be changed > > or ordered, and reporting them to userspace without exporting > > them as part of API is basically wrong, which can easily break > > a userspace observability tool as kernel evolves. For example, > > perf suffers from this and still reports wrong states as of this > > writing. > > > > OTOH, some masqueraded states like TASK_REPORT_IDLE and > > TASK_REPORT_MAX are also reported inadvertently, which confuses > > things even more and most userspace tools do not even take them > > into consideration. > > > > So add a new variable in company with the old raw value to > > report task state in symbolic chars, which are self-explaining > > and no further translation is needed. Of course this does not > > break any userspace tool. > > > > Note for PREEMPT_ACTIVE, we introduce 'p' to report it and use > > the old conventions for the rest. > > *sigh*... just because userspace if daft, we need to change the kernel? Hi Peter, Sorry that I don't quite agree with you on this one. It's just the design that exporting internal details is fundamentally wrong. And even worse, I did not see any userspace tool is aware of masqueraded states like TASK_REPORT_IDLE and TASK_REPORT_MAX and let alone parse it correctly. This confused me a lot when I decided to write my own bpf version of sched-latency analysis tool and only after I figured out everything underneath, I started to make things right here. Again, I mean it's not me that deliberately "breaks" ABI here and I am never meant to upset anyone. My confusion is why did people forget to update in-tree perf the very last time they decide to rearrange the task state mapping since we all agree this is important "ABI" here. I don't think it's the tool's fault. And that's my initiative to request this RFC. > Why do we need this character anyway, why not just print the state in > hex and leave it at that? These single character state things are a > relic, please just let them die. I believe hex is ok only after having the reported task state mapping appear in the uapi headers, otherwise it's still useless to userspace especially for value like TASK_REPORT_IDLE and TASK_REPORT_MAX, which need to dig into the kernel to see what the hell is going on here. Thoughts? Regards, Ze