Re: [PATCH v3] ps: Add support to "ps -l|-m" to properly display process

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello, Lianbo

2022년 3월 16일 (수) 오전 11:07, lijiang <lijiang@xxxxxxxxxx>님이 작성:
>
> On Tue, Mar 15, 2022 at 7:17 PM Austin Kim <austindh.kim@xxxxxxxxx> wrote:
>>
>> Hello, Lijiang
>>
>> 2022년 3월 15일 (화) 오후 2:29, lijiang <lijiang@xxxxxxxxxx>님이 작성:
>> >
>> > On Tue, Mar 15, 2022 at 1:08 PM lijiang <lijiang@xxxxxxxxxx> wrote:
>> >>
>> >> On Thu, Mar 3, 2022 at 10:43 AM <crash-utility-request@xxxxxxxxxx> wrote:
>> >>>
>> >>> Date: Wed, 2 Mar 2022 21:37:45 +0000
>> >>> From: Austin Kim <austindh.kim@xxxxxxxxx>
>> >>> To: k-hagio-ab@xxxxxxx, lijiang@xxxxxxxxxx, crash-utility@xxxxxxxxxx
>> >>> Cc: kernel-team@xxxxxxx, mikeseohyungjin@xxxxxxxxx
>> >>> Subject:  [PATCH v3] ps: Add support to "ps -l|-m" to
>> >>>         properly display process
>> >>> Message-ID: <20220302213745.GA868@raspberrypi>
>> >>> Content-Type: text/plain; charset=us-ascii
>> >>>
>> >>> Sometimes kernel image is generated without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO.
>> >>> Where relevant commit id is f6db83479932 ("sched/stat: Simplify the sched_info accounting")
>> >>>
>> >>>   - CONFIG_SCHED_INFO: KERNEL_VERSION >= LINUX(4,2,0)
>> >>>   - CONFIG_SCHEDSTATS: KERNEL_VERSION < LINUX(4,2,0)
>> >>>
>> >>> Running crash-utility with above kernel image,
>> >>> "ps -l" option cannot display all processes sorted with most recently-run process.
>> >>> Also "ps -m" option cannot display all processes with timestamp.
>> >>>
>> >>> crash> ps -l or crash> ps -m
>> >>> ps: last-run timestamps do not exist in this kernel
>> >>> Usage: ps [-k|-u|-G] [-s]
>> >>>   [-p|-c|-t|-[l|m][-C cpu]|-a|-g|-r|-S]
>> >>>      [pid | task | command] ...
>> >>> Enter "help ps" for details.
>> >>>
>> >>> This is because output of "ps -l|-m" depends on task_struct.sched_info.last_arrival.
>> >>>
>> >>> Without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO, 'sched_info' field is not included
>> >>> in task_struct.  In this case we make "ps -l|-m" option to access 'exec_start'
>> >>> field of sched_entity where 'exec_start' is task_struct.se.exec_start.
>> >>>
>> >>> The 'task_struct.se.exec_start' contains the most recently-executed timestamp
>> >>> when process is running in the below cases;
>> >>>
>> >>>  - enqueued to runqueue
>> >>>  - dequeued from runqueue
>> >>>  - scheduler tick is invoked
>> >>>  - etc
>> >>>
>> >>> 'task_struct.se.exec_start' could be one of statistics which indicates the most
>> >>> recently-run timestamp of process activity.
>> >>
>> >>
>> >> Thank you for the update,  Austin.
>> >>
>> >> If the task is migraged,
>> >
>> >                          ^^^^^^^^
>> >
>> > Sorry, it's a typo, replace it with the "migrated". Thanks.
>> >
>> >>
>> >> the value of the 'task_struct.se.exec_start ' will be set to zero and may be updated in the next scheduling with the new time. In this case, does the patch still work well as expected?
>>
>> Thanks for feedback. Theoretically you are right.
>>
>> https://elixir.bootlin.com/linux/v5.15.28/source/kernel/sched/fair.c
>> static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>> {
>> ..
>> /* We have migrated, no longer consider this task hot */
>> p->se.exec_start = 0;
>>
>> update_scan_period(p, new_cpu);
>> }
>>
>> If we set break point at 'update_scan_period(p, new_cpu)' and then
>> system stops at this point using debugger like GDB,
>> we can have misleading information as you said.
>>
>> Without this condition, it is unlikely to see p->se.exec_start is 0
>> since 'p->se.exec_start' is quite updated very often(ex: scheduler
>> tick, etc) on real target device.
>
>
> Thank you for the explanation, Austin.
>
> In multi-core systems, the task migration may occur frequently due to the load balancing, which
> is more common than breakpoints. Unless the task is bound to the cpu or running on a single cpu system.
>

Thanks for pointing out load balancing. I should have checked kernel
routines more.

> The sched entity is associated with a specific cpu(cpu rq), that is to say, the "-C" option is used implicitly in this patch.  I would suggest pointing out this behavior in the help or outputting a warning explicitly for users. That is my concern.
>

Let me make sure that the updated command will print debug information
together with "-C" option, later.
Thanks for valuable feedback.

BR,
Austin Kim

> Thanks.
> Lianbo
>
>>
>> As below link indicated, Kazu already mentioned that there are no
>> useful way to display all processes
>> with recently-run order using se.exec_start from crash-utility point
>> of view. Also I agreed the feedback.
>>
>> https://listman.redhat.com/archives/crash-utility/2022-March/009615.html
>>
>> Anyway thanks for feedback over patch in details. Hope to discuss
>> another patch I will propose.
>>
>> BR,
>> Austin Kim
>>
>> >>
>> >> Thanks.
>> >> Lianbo
>> >>
>> >>> With this patch, "ps -l|-m" option works well without CONFIG_SCHEDSTATS or
>> >>> CONFIG_SCHED_INFO.
>> >>>
>> >>> Signed-off-by: Austin Kim <austindh.kim@xxxxxxxxx>
>> >>> ---
>> >>>  defs.h    |  1 +
>> >>>  help.c    |  5 +++--
>> >>>  symbols.c |  2 ++
>> >>>  task.c    | 20 ++++++++++++++++----
>> >>>  4 files changed, 22 insertions(+), 6 deletions(-)
>> >>>
>> >>> diff --git a/defs.h b/defs.h
>> >>> index bf2c59b..841bd0b 100644
>> >>> --- a/defs.h
>> >>> +++ b/defs.h
>> >>> @@ -2168,6 +2168,7 @@ struct offset_table {                    /* stash of commonly-used offsets */
>> >>>         long sbitmap_queue_min_shallow_depth;
>> >>>         long sbq_wait_state_wait_cnt;
>> >>>         long sbq_wait_state_wait;
>> >>> +       long sched_entity_exec_start;
>> >>>  };
>> >>>
>> >>>  struct size_table {         /* stash of commonly-used sizes */
>> >>> diff --git a/help.c b/help.c
>> >>> index 8347668..6ca7c92 100644
>> >>> --- a/help.c
>> >>> +++ b/help.c
>> >>> @@ -1442,7 +1442,8 @@ char *help_ps[] = {
>> >>>  "           and system times.",
>> >>>  "       -l  display the task's last-run timestamp value, using either the",
>> >>>  "           task_struct's last_run value, the task_struct's timestamp value",
>> >>> -"           or the task_struct's sched_entity last_arrival value, whichever",
>> >>> +"           the task_struct's sched_info last_arrival value",
>> >>> +"           or the task_struct's sched_entity exec_start value, whichever",
>> >>>  "           applies, of selected, or all, tasks; the list is sorted with the",
>> >>>  "           most recently-run task (with the largest timestamp) shown first,",
>> >>>  "           followed by the task's current state.",
>> >>> @@ -1621,7 +1622,7 @@ char *help_ps[] = {
>> >>>  "     >  9497      1   0  ffff880549ec2ab0  RU   0.0 42314692 138664  oracle",
>> >>>  " ",
>> >>>  "  Show all tasks sorted by their task_struct's last_run, timestamp, or",
>> >>> -"  sched_entity last_arrival timestamp value, whichever applies:\n",
>> >>> +"  sched_info last_arrival or sched_entity exec_start timestamp value, whichever applies:\n",
>> >>>  "    %s> ps -l",
>> >>>  "    [20811245123] [IN] PID: 37    TASK: f7153030  CPU: 2  COMMAND: \"events/2\"",
>> >>>  "    [20811229959] [IN] PID: 1756  TASK: f2a5a570  CPU: 2  COMMAND: \"ntpd\"",
>> >>> diff --git a/symbols.c b/symbols.c
>> >>> index ba5e274..1c40586 100644
>> >>> --- a/symbols.c
>> >>> +++ b/symbols.c
>> >>> @@ -10290,6 +10290,8 @@ dump_offset_table(char *spec, ulong makestruct)
>> >>>                 OFFSET(sched_entity_my_q));
>> >>>         fprintf(fp, "            sched_entity_on_rq: %ld\n",
>> >>>                 OFFSET(sched_entity_on_rq));
>> >>> +       fprintf(fp, "            sched_entity_exec_start: %ld\n",
>> >>> +               OFFSET(sched_entity_exec_start));
>> >>>         fprintf(fp, "             cfs_rq_nr_running: %ld\n",
>> >>>                 OFFSET(cfs_rq_nr_running));
>> >>>         fprintf(fp, "            cfs_rq_rb_leftmost: %ld\n",
>> >>> diff --git a/task.c b/task.c
>> >>> index 864c838..2c12196 100644
>> >>> --- a/task.c
>> >>> +++ b/task.c
>> >>> @@ -334,9 +334,15 @@ task_init(void)
>> >>>         if (VALID_MEMBER(task_struct_sched_info))
>> >>>                 MEMBER_OFFSET_INIT(sched_info_last_arrival,
>> >>>                         "sched_info", "last_arrival");
>> >>> +       MEMBER_OFFSET_INIT(task_struct_se, "task_struct", "se");
>> >>> +       if (VALID_MEMBER(task_struct_se)) {
>> >>> +               STRUCT_SIZE_INIT(sched_entity, "sched_entity");
>> >>> +               MEMBER_OFFSET_INIT(sched_entity_exec_start, "sched_entity", "exec_start");
>> >>> +       }
>> >>>         if (VALID_MEMBER(task_struct_last_run) ||
>> >>>             VALID_MEMBER(task_struct_timestamp) ||
>> >>> -           VALID_MEMBER(sched_info_last_arrival)) {
>> >>> +           VALID_MEMBER(sched_info_last_arrival) ||
>> >>> +           VALID_MEMBER(sched_entity_exec_start)) {
>> >>>                 char buf[BUFSIZE];
>> >>>                 strcpy(buf, "alias last ps -l");
>> >>>                 alias_init(buf);
>> >>> @@ -3559,7 +3565,8 @@ cmd_ps(void)
>> >>>                 case 'm':
>> >>>                         if (INVALID_MEMBER(task_struct_last_run) &&
>> >>>                             INVALID_MEMBER(task_struct_timestamp) &&
>> >>> -                           INVALID_MEMBER(sched_info_last_arrival)) {
>> >>> +                           INVALID_MEMBER(sched_info_last_arrival) &&
>> >>> +                           INVALID_MEMBER(sched_entity_exec_start)) {
>> >>>                                 error(INFO,
>> >>>                              "last-run timestamps do not exist in this kernel\n");
>> >>>                                 argerrs++;
>> >>> @@ -3574,7 +3581,8 @@ cmd_ps(void)
>> >>>                 case 'l':
>> >>>                         if (INVALID_MEMBER(task_struct_last_run) &&
>> >>>                             INVALID_MEMBER(task_struct_timestamp) &&
>> >>> -                           INVALID_MEMBER(sched_info_last_arrival)) {
>> >>> +                           INVALID_MEMBER(sched_info_last_arrival) &&
>> >>> +                           INVALID_MEMBER(sched_entity_exec_start)) {
>> >>>                                 error(INFO,
>> >>>                              "last-run timestamps do not exist in this kernel\n");
>> >>>                                 argerrs++;
>> >>> @@ -6020,7 +6028,11 @@ task_last_run(ulong task)
>> >>>                 timestamp = tt->last_task_read ?  ULONGLONG(tt->task_struct +
>> >>>                         OFFSET(task_struct_sched_info) +
>> >>>                         OFFSET(sched_info_last_arrival)) : 0;
>> >>> -
>> >>> +       else if (VALID_MEMBER(sched_entity_exec_start))
>> >>> +                       timestamp = tt->last_task_read ?  ULONGLONG(tt->task_struct +
>> >>> +                       OFFSET(task_struct_se) +
>> >>> +                       OFFSET(sched_entity_exec_start)) : 0;
>> >>> +
>> >>>          return timestamp;
>> >>>  }
>> >>>
>> >>> --
>> >>> 2.20.1
>>

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki




[Index of Archives]     [Fedora Development]     [Fedora Desktop]     [Fedora SELinux]     [Yosemite News]     [KDE Users]     [Fedora Tools]

 

Powered by Linux