Re: [RFC/PATCHSET 0/9] perf record: Implement BPF sample filter (v4)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ravi,

On Fri, Mar 10, 2023 at 12:10:28PM +0530, Ravi Bangoria wrote:
> Hi Namhyung,
> 
> Sorry, I should have tried earlier prototypes but missed it.

No worries and thanks for your review!

> 
> > Maybe more useful example is when it deals with precise memory events.
> > On AMD processors with IBS, you can filter only memory load with L1
> > dTLB is missed like below.
> > 
> >   $ sudo ./perf record -ad -e ibs_op//p \
> >   > --filter 'mem_op == load, mem_dtlb > l1_hit' sleep 1
> >   [ perf record: Woken up 1 times to write data ]
> >   [ perf record: Captured and wrote 1.338 MB perf.data (15 samples) ]
> 
> On my zen4 machine:
> 
>   $ sudo ./perf record -d -e ibs_op//p --filter 'mem_op == load' -c 100000 ~/test
>   [ perf record: Woken up 6 times to write data ]
>   [ perf record: Captured and wrote 1.436 MB perf.data (30966 samples) ]
> 
>   $ sudo ./perf mem report -F sample,mem --stdio
>   #      Samples  Memory access
>   # ............  ........................
>            30325  L1 hit
>              477  Local RAM hit
>               89  L2 hit
>               75  L3 hit
> 
> This looks good because IBS hw can't filter specific type of instruction
> and thus unfiltered data will contain "NA" types of memory accesses, which
> is absent here. So mem_op == load filter seems to be working.

Good!

> 
> However, if I add "mem_lvl == l1" (or l2 / ram) in the filter, I see mostly
> all samples are getting lost:
> 
>   $ sudo ./perf record -d -e ibs_op//p --filter 'mem_op == load, mem_lvl == l1' -c 100000 ~/test
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.019 MB perf.data ]
> 
>   $ sudo ./perf report --stat | grep SAMPLE
>     LOST_SAMPLES events:          1  ( 0.8%)
>     LOST_SAMPLES events:     136332
> 
> What am I missing?

It seems IBS PMU doesn't set the mem_lvlnum field in the data source.
As I said in the patch 7, 'mem_lvl' actually uses mem_lvlnum fields
instead of mem_lvl because it's preferred according to the comment in
the UAPI header.

/*
 * PERF_MEM_LVL_* namespace being depricated to some extent in the
 * favour of newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_} fields.
 * Supporting this namespace inorder to not break defined ABIs.
 *
 * memory hierarchy (memory level, hit or miss)
 */

I'll post a patch to set it separately.

> 
> 2nd observation, invalid expressions like 'mem_op == load, mem_dtlb == l1'
> are not failing, instead recording misleading data:
> 
>   $ sudo ./perf record -d -e ibs_op//p --filter 'mem_op == load, mem_dtlb == l1' -c 100000 ~/test
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.047 MB perf.data (614 samples) ]
> 
>   $ sudo ./perf script -F data_src | grep "TLB N/A" | wc -l
>   614
 
Good point, that's the limitation in the current implementation.
I think it needs to keep the target sample field along with the
constant so that it can detect unintended uses.  Let's me think
about it more.

Thanks,
Namhyung



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux