Hello, Kazu 2022년 3월 3일 (목) 오후 4:50, HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@xxxxxxx>님이 작성: > > Hi Austin, > > -----Original Message----- > > > > This is because output of "ps -l|-m" depends on task_struct.sched_info.last_arrival. > > > > > > > > Without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO, 'sched_info' field is not included > > > > in task_struct. > > > > > > > > So we make "ps -l|-m" option to access 'exec_start' field of sched_entity > > > > where 'exec_start' is task_struct.se.exec_start. > > > > > > Could you describe what the exec_start means? When is it updated? > > > > > > > The 'task_struct.se.exec_start' contains the most recently-executed > > timestamp when > > process is running in the below cases; > > > > - enqueued to runqueue > > - dequeued from ruqueue > > - scheduler tick is invoked > > - etc > > > > So I guess 'task_struct.se.exec_start' could be one of statistics > > which indicates > > the most recently run timestamp of process activity. > > > > From CFS scheduler's point of view, 'task_struct.se.exec_start' is > > updated within update_curr() > > where its call path is various as below. > > > > - enqueue_task_fair, -dequeue_task_fair, task_tick_fair, > > check_preempt_wakeup, ... > > Thank you for looking into this. > > As for when the se.exec_start is updated, I think you are right. > (with my understanding, probably it's the last time or tick when a task > is in a runqueue regardless of getting a CPU.) > > But I found a problem, is that the se.exec_start is from rq->clock_task, > not from rq->clock like last_arrival. The rq->clock_task may not contain > irq/steal time, please see update_rq_clock_task(). > I am understood after looking at below code. static void sched_info_arrive(struct rq *rq, struct task_struct *t) { ... now = rq_clock(rq); //<<-- ... t->sched_info.last_arrival = now; } static void update_curr(struct cfs_rq *cfs_rq) { u64 now = rq_clock_task(rq_of(cfs_rq)); //<<-- ... curr->exec_start = now; ... } > This causes the following issues with a vmcore generated on a machine, > which had run for a while (273 days): > Let me ask a question at this point. The vmcore was generated from machine which has been running 273 days. I wonder the kernel image of machine(running 273 days) is without CONFIG_SCHEDSTATS/CONFIG_SCHED_INFO. If output is based on vmcore without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO, you can ignore above question. I would like to remind that as commit description said, the patch is **only** valid without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO. When kernel image is with CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO and then "ps -l|-m" command is used, task_last_run() is called from its subroutines. ulonglong task_last_run(ulong task) { ulong last_run; ... else if (VALID_MEMBER(sched_info_last_arrival)) // With CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct + OFFSET(task_struct_sched_info) + OFFSET(sched_info_last_arrival)) : 0; + else if (VALID_MEMBER(sched_entity_exec_start)) // Without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO + timestamp = tt->last_task_read ? ULONGLONG(tt->task_struct + + OFFSET(task_struct_se) + + OFFSET(sched_entity_exec_start)) : 0; return timestamp; } With CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO enabled, the statement under 'VALID_MEMBER(sched_entity_exec_start)' is not called. > crash> ps -m | head > [ 0 00:59:36.582] [RU] PID: 4023608 TASK: ffff916f7c6b1840 CPU: 15 COMMAND: "makedumpfile" > ^^^^^^^^^(1) > [ 0 00:59:37.831] [ID] PID: 413 TASK: ffff916f772d3080 CPU: 15 COMMAND: "kworker/15:1H" > [ 0 00:59:39.765] [IN] PID: 3929504 TASK: ffff916f5f0748c0 CPU: 15 COMMAND: "respawn_actlog" > [ 0 00:59:40.650] [IN] PID: 1974 TASK: ffff91647dc53080 CPU: 15 COMMAND: "CPU 2/KVM" > [ 0 00:59:41.925] [IN] PID: 1297 TASK: ffff916f63c46100 CPU: 15 COMMAND: "NetworkManager" > [ 0 00:59:42.944] [ID] PID: 3763057 TASK: ffff9160c4519840 CPU: 15 COMMAND: "kworker/15:0" > [ 0 00:59:42.944] [IN] PID: 101 TASK: ffff916040c91840 CPU: 15 COMMAND: "migration/15" > [ 0 00:59:43.078] [IN] PID: 100 TASK: ffff916040c5b080 CPU: 15 COMMAND: "watchdog/15" > [ 0 00:59:47.533] [IN] PID: 1292 TASK: ffff916f63c43080 CPU: 15 COMMAND: "lsmd" > [ 0 00:59:49.089] [IN] PID: 113105 TASK: ffff9160412248c0 CPU: 15 COMMAND: "kvm-nx-lpage-re" > ^^^^^^^(2) > (1) large difference from zero > (2) large differences among CPUs (probably due to the differences of irq time) > > (1) might be solved with rq->clock_task, but (2) will be misleading and confusing. > So currently I'm thinking that the "ps -l|-m" options should not use the se.exec_start. > What do you think? > Agreed. (As you indicated) Since rq->clock cannot hold accurate timestamp we can rely on, "ps -l|-m" options should not access the se.exec_start in case of kernel image without CONFIG_SCHEDSTATS or CONFIG_SCHED_INFO. Thanks for your code review. If there is something to improve, let me propose a patch again. BR, Austin Kim > Thanks, > Kazu > -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/crash-utility Contribution Guidelines: https://github.com/crash-utility/crash/wiki