On Thu 18-08-16 13:31:28, Michal Hocko wrote: > From: Michal Hocko <mhocko@xxxxxxxx> > > seq_printf (used by show_smap) can be pretty expensive when dumping a > lot of numbers. Say we would like to get Rss and Pss from a particular > process. In order to measure a pathological case let's generate as many > mappings as possible: > > $ cat max_mmap.c > int main() > { > while (mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_ANON|MAP_SHARED|MAP_POPULATE, -1, 0) != MAP_FAILED) > ; > > printf("pid:%d\n", getpid()); > pause(); > return 0; > } > > $ awk '/^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss}' /proc/$pid/smaps > > would do a trick. The whole runtime is in the kernel space which is not > that that unexpected because smaps is not the cheapest one (we have to > do rmap walk etc.). > > Command being timed: "awk /^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss} /proc/3050/smaps" > User time (seconds): 0.01 > System time (seconds): 0.44 > Percent of CPU this job got: 99% > Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.47 > > But the perf says: > 22.55% awk [kernel.kallsyms] [k] format_decode > 14.65% awk [kernel.kallsyms] [k] vsnprintf > 6.40% awk [kernel.kallsyms] [k] number > 2.53% awk [kernel.kallsyms] [k] shmem_mapping > 2.53% awk [kernel.kallsyms] [k] show_smap > 1.81% awk [kernel.kallsyms] [k] lock_acquire > > we are spending most of the time actually generating the output which is > quite lame. Let's replace seq_printf by seq_puts and seq_put_decimal_ull. > This will give us: > Command being timed: "awk /^Rss/{rss+=$2} /^Pss/{pss+=$2} END {printf "rss:%d pss:%d\n", rss, pss} /proc/3067/smaps" > User time (seconds): 0.00 > System time (seconds): 0.41 > Percent of CPU this job got: 99% > Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.42 > > which will give us ~7% improvement. Perf says: > 28.87% awk [kernel.kallsyms] [k] seq_puts > 5.30% awk [kernel.kallsyms] [k] vsnprintf > 4.54% awk [kernel.kallsyms] [k] format_decode > 3.73% awk [kernel.kallsyms] [k] show_smap > 2.56% awk [kernel.kallsyms] [k] shmem_mapping > 1.92% awk [kernel.kallsyms] [k] number > 1.80% awk [kernel.kallsyms] [k] lock_acquire > 1.75% awk [kernel.kallsyms] [k] print_name_value_kb OK, so it turned out that I was fooled by VIRT_CPU_ACCOUNTING_GEN accounting [1]. So I have replaced it by TICK_CPU_ACCOUNTING and the numbers the seq_printf -> seq_write doesn't seem to be all that much of a win. Before User time (seconds): 0.14 System time (seconds): 0.30 Percent of CPU this job got: 98% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45 19.66% awk [kernel.kallsyms] [k] format_decode 14.25% awk [kernel.kallsyms] [k] vsnprintf 6.42% awk [kernel.kallsyms] [k] number 2.88% awk mawk [.] 0x0000000000006910 2.58% awk [kernel.kallsyms] [k] shmem_mapping 2.12% awk mawk [.] 0x0000000000006918 2.02% awk [kernel.kallsyms] [k] show_smap after: User time (seconds): 0.13 System time (seconds): 0.31 Percent of CPU this job got: 99% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.45 23.89% awk [kernel.kallsyms] [k] seq_write 5.84% awk [kernel.kallsyms] [k] vsnprintf 5.08% awk [kernel.kallsyms] [k] format_decode 4.00% awk [kernel.kallsyms] [k] show_val_kb 3.84% awk [kernel.kallsyms] [k] show_smap 2.16% awk [kernel.kallsyms] [k] number 2.05% awk [kernel.kallsyms] [k] shmem_mapping so it is basically in noise. [1] http://lkml.kernel.org/r/20160823143330.GL23577@xxxxxxxxxxxxxx -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>