Re: [PATCH v4] pgo: add clang's Profile Guided Optimization infrastructure

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jan 13, 2021 at 8:07 PM Nick Desaulniers
<ndesaulniers@xxxxxxxxxx> wrote:
>
> On Wed, Jan 13, 2021 at 12:55 PM Nathan Chancellor
> <natechancellor@xxxxxxxxx> wrote:
> >
> > However, I see an issue with actually using the data:
> >
> > $ sudo -s
> > # mount -t debugfs none /sys/kernel/debug
> > # cp -a /sys/kernel/debug/pgo/profraw vmlinux.profraw
> > # chown nathan:nathan vmlinux.profraw
> > # exit
> > $ tc-build/build/llvm/stage1/bin/llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> > warning: vmlinux.profraw: Invalid instrumentation profile data (bad magic)
> > error: No profiles could be merged.
> >
> > Am I holding it wrong? :) Note, this is virtualized, I do not have any
> > "real" x86 hardware that I can afford to test on right now.
>
> Same.
>
> I think the magic calculation in this patch may differ from upstream
> llvm: https://github.com/llvm/llvm-project/blob/49142991a685bd427d7e877c29c77371dfb7634c/llvm/include/llvm/ProfileData/SampleProf.h#L96-L101

Err...it looks like it was the padding calculation.  With that fixed
up, we can query the profile data to get insights on the most heavily
called functions.  Here's what my top 20 are (reset, then watch 10
minutes worth of cat videos on youtube while running `find /` and
sleeping at my desk).  Anything curious stand out to anyone?

$ llvm-profdata show -topn=20 /tmp/vmlinux.profraw
Instrumentation level: IR  entry_first = 0
Total functions: 48970
Maximum function count: 62070879
Maximum internal block count: 83221158
Top 20 functions with the largest internal block counts:
  drivers/tty/n_tty.c:n_tty_write, max count = 83221158
  rcu_read_unlock_strict, max count = 62070879
  _cond_resched, max count = 25486882
  rcu_all_qs, max count = 25451477
  drivers/cpuidle/poll_state.c:poll_idle, max count = 23618576
  _raw_spin_unlock_irqrestore, max count = 18874121
  drivers/cpuidle/governors/menu.c:menu_select, max count = 18721624
  _raw_spin_lock_irqsave, max count = 18509161
  memchr, max count = 15525452
  _raw_spin_lock, max count = 15484254
  __mod_memcg_state, max count = 14604619
  __mod_memcg_lruvec_state, max count = 14602783
  fs/ext4/hash.c:str2hashbuf_signed, max count = 14098424
  __mod_lruvec_state, max count = 12527154
  __mod_node_page_state, max count = 12525172
  native_sched_clock, max count = 8904692
  sched_clock_cpu, max count = 8895832
  sched_clock, max count = 8894627
  kernel/entry/common.c:exit_to_user_mode_prepare, max count = 8289031
  fpregs_assert_state_consistent, max count = 8287198

-- 
Thanks,
~Nick Desaulniers



[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux