On Mon, Jul 31, 2023 at 5:37 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Jul 26, 2023 at 08:16:16PM +0800, Ze Gao wrote:
> > Internal representations of task state are likely to be changed or
> > ordered, and reporting them to userspace without exporting them as
> > part of API is not a good choice, which can easily break a userspace
> > observability tool as kernel evolves. For example, perf suffers from
> > this and still reports wrong states by this patch.
> >
> > OTOH, some masqueraded state like TASK_REPORT_IDLE and TASK_REPORT_MAX
> > are also reported inadvertently, which confuses things even more.
> >
> > So add a new variable in company with the old raw value to report task
> > state in symbolic char, which is self-explaining and no further
> > translation is needed, and also report priorities in 'short' to save
> > some buffer space. Of course this does not break any userspace tool.
> >
> > Note for PREEMPT_ACTIVE, we introduce 'p' to report it and use the old
> > conventions for the rest.
>
> So I really don't much like this. This loses the ability to see the
> actual wait state flags, there could be multiple. Eg, things like
> TASK_FREEZABLE get lost completely.

Also, IIRC, TASK_FREEZABLE, which is defined as 0x2000, is already lost
in the current implementation of __trace_sched_switch_state, which only
reports states below TASK_REPORT_IDLE (plus PREEMPT_ACTIVE). So I do not
believe you can achieve this by just leaving things alone.

Regards,
Ze
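
To illustrate the TASK_FREEZABLE point, here is a minimal userspace sketch of
the masking described above. It is not the kernel code: report_state() and
fls_u32() are hypothetical names, the handling of exit state and
TASK_RTLOCK_WAIT is left out, and every constant except TASK_FREEZABLE
(0x2000, as stated above) is an assumption about the current layout in
include/linux/sched.h.

```c
/* Assumed task state bits; only TASK_FREEZABLE = 0x2000 comes from the mail. */
enum {
	TASK_INTERRUPTIBLE   = 0x0001,
	TASK_UNINTERRUPTIBLE = 0x0002,
	TASK_NOLOAD          = 0x0400,
	TASK_FREEZABLE       = 0x2000,
	TASK_IDLE            = TASK_UNINTERRUPTIBLE | TASK_NOLOAD,
	TASK_REPORT          = 0x007f,            /* assumed reportable bits */
	TASK_REPORT_IDLE     = TASK_REPORT + 1,   /* 0x80 */
	TASK_REPORT_MAX      = TASK_REPORT_IDLE << 1 /* 0x100, used for preemption */
};

/* Position of the highest set bit, 1-based; 0 when no bit is set. */
unsigned int fls_u32(unsigned int x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Hypothetical model of what the tracepoint ends up reporting. */
unsigned int report_state(unsigned int prev_state, int preempt)
{
	unsigned int state, idx;

	if (preempt)
		return TASK_REPORT_MAX;

	/* Everything above TASK_REPORT, TASK_FREEZABLE included, is dropped here. */
	state = prev_state & TASK_REPORT;

	/* TASK_IDLE is folded into the single masqueraded TASK_REPORT_IDLE bit. */
	if ((prev_state & TASK_IDLE) == TASK_IDLE)
		state = TASK_REPORT_IDLE;

	/* Only the highest remaining bit is reported. */
	idx = fls_u32(state);
	return idx ? 1u << (idx - 1) : 0;
}
```

Under these assumed values, report_state(TASK_INTERRUPTIBLE | TASK_FREEZABLE, 0)
returns 0x1, the same as for plain TASK_INTERRUPTIBLE, so the freezable flag
is not visible in the raw value either.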