Hi Kazu,

On Wed, Mar 30, 2022 at 09:27:18AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> -----Original Message-----
> > Hi Kazu,
> >
> > On Wed, Mar 30, 2022 at 08:28:19AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> > > -----Original Message-----
> > > > 1.) The vmcore file may be very big.
> > > >
> > > > For example, I have a vmcore file which is over 23G,
> > > > and the panic kernel had 767.6G memory;
> > > > its max_sect_len is 4468736.
> > > >
> > > > The current code costs too much time in the following loop:
> > > > ..............................................
> > > > for (i = 1; i < max_sect_len + 1; i++) {
> > > >         dd->valid_pages[i] = dd->valid_pages[i - 1];
> > > >         for (j = 0; j < BITMAP_SECT_LEN; j++, pfn++)
> > > >                 if (page_is_dumpable(pfn))
> > > >                         dd->valid_pages[i]++;
> > > > ..............................................
> > > >
> > > > For my case, it costs about 56 seconds to finish the
> > > > big loop.
> > > >
> > > > This patch moves the hweightXX macros to defs.h,
> > > > and uses hweight64 to optimize the loop.
> > > >
> > > > For my vmcore, the loop now costs only about one second.
> > > >
> > > > 2.) Test results:
> > > > # cat ./commands.txt
> > > > quit
> > > >
> > > > Before:
> > > >
> > > > # echo 3 > /proc/sys/vm/drop_caches
> > > > # time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
> > > > ............................
> > > > real    1m54.259s
> > > > user    1m12.494s
> > > > sys     0m3.857s
> > > > ............................
> > > >
> > > > After this patch:
> > > >
> > > > # echo 3 > /proc/sys/vm/drop_caches
> > > > # time ./crash -i ./commands.txt /root/t/vmlinux /root/t/vmcore > /dev/null 2>&1
> > > > ............................
> > > > real    0m55.217s
> > > > user    0m15.114s
> > > > sys     0m3.560s
> > > > ............................
> > >
> > > Thank you for the improvement!
> > >
> > > As far as I tested on x86_64, it did not give such a big gain, but looking at
> > > the user time, it will do on arm64.
> > > Lianbo, can you reproduce on arm64?
> > >
> > > With a 192GB x86_64 dumpfile, slightly improved:
> > >
> > > $ time echo quit | ./crash vmlinux dump >/dev/null
> > >
> > > real    0m5.632s
> >
> > Thanks for the testing.
> >
> > I am curious why it costs only 5.632s for a 192G dumpfile.
> > How much memory did the panic kernel in the dumpfile have?
> >
> > My vmcore has 767.6G memory, and the max_sect_len is 4468736.
>
> I got it with makedumpfile -d 0 and tested it without dropping caches
> to measure the change of the loop cost. As for memory, which size
> are you asking about? That machine has 192GB memory.
>
> $ ls -lhs dump
> 193G -rw-------. 1 root root 193G Mar 30 17:07 dump
> $ file dump
> dump: Kdump compressed dump v6, system Linux, ...
>
> $ ./crash vmlinux dump
>
>       MEMORY: 191.7 GB
>
> crash> help -D
> ...
>           block_size: 4096
>         sub_hdr_size: 10
>        bitmap_blocks: 3088
>            max_mapnr: 50593791
> ...
>    total_valid_pages: 50178690
>         max_sect_len: 12352    // added

Ok, it seems your max_sect_len is too small.

> The max_sect_len looks too small compared with yours.. but
> 12352 * 4096 = 50593792

My max_sect_len is 4468736, so 4468736 / 12352 = 361.78.

The (4468736 * 4096) loop costs 56s on my machine.
Assuming our CPUs run at the same speed, your machine
would only cost (56 / 361.78 = 0.1547)s.

So you cannot get a big gain. :)

Thanks
Huang Shijie

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/crash-utility
Contribution Guidelines: https://github.com/crash-utility/crash/wiki