[Crash-utility] Re: [PATCH] "kmem -i" extremely slow on dumps from large memory systems

Hi Aureau,

Thanks for your improvements.

Please check your patch; there seem to be formatting issues, so I
cannot apply it directly with a command, and had to copy & paste the
code hunks manually...

On Mon, Dec 9, 2024 at 7:50 PM Aureau, Georges (Kernel Tools ERT)
<georges.aureau@xxxxxxx> wrote:
>
> The “kmem -i” command is extremely slow (appears to hang) on dumps from large memory systems.
>
> For example, on a 120GB crash dump from a high-end server with 32TB of RAM (ie. 8 Giga pages),
> the “kmem -i” command takes over 50 minutes to execute on a DL380 Gen10. To report basic
> general memory usage figures, we should only be reading global counters, without having to walk
> the full flat/sparse mem map page table. Hence, dump_kmeminfo() should first read the globals,
> and only call dump_mem_map() if important information (ie. slabs or total ram) is missing.
>

After applying your patch, I see the "SHARED  3942872        15 GB
23% of TOTAL MEM" line is missing from my "kmem -i" output, even
though the vmcore I tested with doesn't exhibit the performance issue
you mentioned. So my question is: is the output change necessary for
vmcores that are not affected by the performance problem?

In addition, I see that your approach to improving the performance is
to bypass the dump_mem_map() call. Can we optimize dump_mem_map()
instead, so that we can still call it and leave the output unchanged?
E.g. if the bottleneck is reading too many memory pages, can we skip
the unnecessary page reads, or increase the memory cache to reduce
page decompression, or something similar? Frankly, your patch doesn't
point out where exactly the performance bottleneck is.

Thanks,
Tao Liu

>
> Signed-off-by: Georges Aureau <georges.aureau@xxxxxxx>
>
> --
>
> diff --git a/memory.c b/memory.c
> index 8c01ed0..8e2b2e9 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -8546,18 +8546,19 @@ dump_kmeminfo(void)
>          ulong get_buffers;
>          ulong get_slabs;
>          char buf[BUFSIZE];
> +        ulong flags;
> -
> -        BZERO(&meminfo, sizeof(struct meminfo));
> -        meminfo.flags = GET_ALL;
> -        dump_mem_map(&meminfo);
> -        get_totalram = meminfo.get_totalram;
> -        shared_pages = meminfo.get_shared;
> -        get_buffers = meminfo.get_buffers;
> -        get_slabs = meminfo.get_slabs;
> +        /*
> +         * By default, we will no longer call dump_mem_map() as this is too
> +         * slow for large memory systems. If we have to call it (eg. missing
> +         * important information such as slabs or total ram), we will also
> +         * collect shared pages. Otherwise, we won't print shared pages.
> +         */
> +        flags = GET_SHARED_PAGES;
> +        shared_pages = 0;
>          /*
> -         *  If vm_stat array exists, override page search info.
> +         *  If vm_stat array does not exists, then set mem map flag.
>          */
>          if (vm_stat_init()) {
>                  if (dump_vm_stat("NR_SLAB", &nr_slab, 0))
> @@ -8571,10 +8572,11 @@ dump_kmeminfo(void)
>                          get_slabs = nr_slab;
>                          if (dump_vm_stat("NR_SLAB_UNRECLAIMABLE_B", &nr_slab, 0))
>                                  get_slabs += nr_slab;
> +                } else {
> +                        flags |= GET_SLAB_PAGES;
>                  }
>          }
> -        fprintf(fp, "%s", kmeminfo_hdr);
>          /*
>           *  Get total RAM based upon how the various versions of si_meminfo()
>           *  have done it, latest to earliest:
> @@ -8586,9 +8588,32 @@ dump_kmeminfo(void)
>              symbol_exists("_totalram_pages")) {
>                  totalram_pages = vt->totalram_pages ?
>                          vt->totalram_pages : get_totalram;
> -        } else
> -                totalram_pages = get_totalram;
> +        } else {
> +                flags |= GET_TOTALRAM_PAGES;
> +        }
> +        /*
> +         * If we want more than just shared pages (missing slabs or total ram)
> +         * then call dump_mem_map() and also collect buffers.
> +         */
> +        if (flags != GET_SHARED_PAGES) {
> +                BZERO(&meminfo, sizeof(struct meminfo));
> +                flags |= GET_BUFFERS_PAGES;
> +                meminfo.flags = flags;
> +                dump_mem_map(&meminfo);
> +                /* Update the missing information */
> +                if (flags & GET_SLAB_PAGES) {
> +                        get_slabs = meminfo.get_slabs;
> +                }
> +                if (flags & GET_TOTALRAM_PAGES) {
> +                        get_totalram = meminfo.get_totalram;
> +                        totalram_pages = get_totalram;
> +                }
> +                shared_pages = meminfo.get_shared;
> +                get_buffers = meminfo.get_buffers;
> +        }
> +
> +        fprintf(fp, "%s", kmeminfo_hdr);
>          fprintf(fp, "%13s  %7ld  %11s         ----\n", "TOTAL MEM",
>                  totalram_pages, pages_to_size(totalram_pages, buf));
> @@ -8613,9 +8638,11 @@ dump_kmeminfo(void)
>           *  differently than the kernel -- it just tallies the non-reserved
>           *  pages that have a count of greater than 1.
>           */
> -        pct = (shared_pages * 100)/totalram_pages;
> -        fprintf(fp, "%13s  %7ld  %11s  %3ld%% of TOTAL MEM\n",
> -                "SHARED", shared_pages, pages_to_size(shared_pages, buf), pct);
> +        if (shared_pages) {
> +                pct = (shared_pages * 100)/totalram_pages;
> +                fprintf(fp, "%13s  %7ld  %11s  %3ld%% of TOTAL MEM\n",
> +                        "SHARED", shared_pages, pages_to_size(shared_pages, buf), pct);
> +        }
>          subtract_buffer_pages = 0;
>          if (symbol_exists("buffermem_pages")) {
> @@ -8634,7 +8661,7 @@ dump_kmeminfo(void)
>          fprintf(fp, "%13s  %7ld  %11s  %3ld%% of TOTAL MEM\n",
>                  "BUFFERS", buffer_pages, pages_to_size(buffer_pages, buf), pct);
> -        if (CRASHDEBUG(1))
> +        if (CRASHDEBUG(1)&&(flags&GET_BUFFERS_PAGES))
>                  error(NOTE, "pages with buffers: %ld\n", get_buffers);
>          /*
>
> --
> Crash-utility mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxxxxxx
> https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
> Contribution Guidelines: https://github.com/crash-utility/crash/wiki



