On 2023/08/24 17:29, lijiang wrote: > On Thu, Aug 24, 2023 at 10:01 AM HAGIO KAZUHITO(萩尾 一仁) <k-hagio-ab@xxxxxxx> > wrote: > >> On 2023/08/23 14:44, Lianbo Jiang wrote: >>> When a task is exiting, usually kernel marks its flags as 'PF_EXITING', >>> but even so, sometimes the mm_struct has not been freed, it might still >>> be valid. For such tasks, the "ps/vm" commands won't display the memory >>> usage. For example: >>> >>> crash> ps 47070 >>> PID PPID CPU TASK ST %MEM VSZ RSS >> COMM >>> 47070 1 0 ffff9ba7c4910000 UN 0.0 0 0 >> ra_ris.parse >>> crash> vm 47070 >>> PID: 47070 TASK: ffff9ba7c4910000 CPU: 0 COMMAND: >> "ra_ris.parse" >>> MM PGD RSS TOTAL_VM >>> 0 0 0k 0k >>> >>> To be honest, this is a corner case, but it has already occurred in >>> actual production environments. Given that, let's allow the "ps/vm" >>> commands to try to display the memory usage for this case, but it does >>> not guarantee that it can work well at any time, which still depends on >>> how far the mm_struct deconstruction has proceeded. >> >> Agree to display it, and looks like the deconstruction is done after >> task->mm is set to NULL, so it looks fine to me. >> >> > Thank you for the comments, Kazu. > > >> void __noreturn do_exit(long code) >> { >> ... >> exit_signals(tsk); /* sets PF_EXITING */ >> ... >> exit_mm(); >> >> static void exit_mm(void) >> { >> struct mm_struct *mm = current->mm; >> ... >> current->mm = NULL; ## task->mm is set to NULL here >> ... >> mmput(mm); ## release the resources actually >> >> >> On the other hand, the mm->mm_count is decremented in mmput(), is there >> need to check it? >> >> > Good question. I had the same thoughts, but finally I chose to double check > with the mm_count. Not sure how to ensure the memory synchronization can be > done, when the kernel is panicking. Ok, it's fine to double check just in case, then ... > > If the address of mm pointer is valid and the mm_struct members are always > legitimate, we won't need to double check. > > But anyway, this is just my thoughts, maybe it's not correct completely. If > you do not want to have it, I can post v2 and simply remove > the IS_EXITING(task) from get_task_mem_usage(). > > Thanks. > Lianbo > > >> void mmput(struct mm_struct *mm) >> { >> might_sleep(); >> >> if (atomic_dec_and_test(&mm->mm_users)) >> __mmput(mm); >> } >> >> static inline void __mmput(struct mm_struct *mm) >> { >> ... >> exit_mmap(mm); >> ... >> mmdrop(mm); >> } >> >> static inline void mmdrop(struct mm_struct *mm) >> { >> ... >> if (unlikely(atomic_dec_and_test(&mm->mm_count))) >> __mmdrop(mm); >> } >> >> Thanks, >> Kazu >> >>> >>> With the patch: >>> crash> ps 47070 >>> PID PPID CPU TASK ST %MEM VSZ RSS >> COMM >>> 47070 1 0 ffff9ba7c4910000 UN 90.8 38461228 31426444 >> ra_ris.parse >>> crash> vm 47070 >>> PID: 47070 TASK: ffff9ba7c4910000 CPU: 0 COMMAND: >> "ra_ris.parse" >>> MM PGD RSS TOTAL_VM >>> ffff9bad6e873840 ffff9baee0544000 31426444k 38461228k >>> VMA START END FLAGS FILE >>> ffff9bafdbe1d6c8 400000 8c5000 8000875 >> /data1/rishome/ra_cu_cn_412/sbin/ra_ris.parse >>> ... >>> >>> Reported-by: Buland Kumar Singh <bsingh@xxxxxxxxxx> >>> Signed-off-by: Lianbo Jiang <lijiang@xxxxxxxxxx> >>> --- >>> memory.c | 13 +++++++++++-- >>> 1 file changed, 11 insertions(+), 2 deletions(-) >>> >>> diff --git a/memory.c b/memory.c >>> index 5d76c5d7fe6f..7d59c0555a0e 100644 >>> --- a/memory.c >>> +++ b/memory.c >>> @@ -4792,10 +4792,12 @@ get_task_mem_usage(ulong task, struct >> task_mem_usage *tm) >>> { >>> struct task_context *tc; >>> long rss = 0, rss_cache = 0; >>> + int mm_count = 0; >>> + ulong addr; >>> >>> BZERO(tm, sizeof(struct task_mem_usage)); >>> >>> - if (IS_ZOMBIE(task) || IS_EXITING(task)) >>> + if (IS_ZOMBIE(task)) >>> return; >>> >>> tc = task_to_context(task); >>> @@ -4805,7 +4807,14 @@ get_task_mem_usage(ulong task, struct >> task_mem_usage *tm) >>> >>> tm->mm_struct_addr = tc->mm_struct; >>> >>> - if (!task_mm(task, TRUE)) >>> + if (!(addr = task_mm(task, TRUE))) >>> + return; >>> + >>> + if (!readmem(addr + OFFSET(mm_struct_mm_count), KVADDR, &mm_count, >>> + sizeof(int), "mm_struct mm_count", RETURN_ON_ERROR)) >>> + return; tt->mm_struct is filled in task_mm(), I think we can use this: mm_count = INT(tt->mm_struct + OFFSET(mm_struct_mm_count)); The following code in get_task_mem_usage() also use this. Thanks, Kazu -- Crash-utility mailing list Crash-utility@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/crash-utility Contribution Guidelines: https://github.com/crash-utility/crash/wiki