On Wed, Aug 04, 2021 at 03:29:54PM +0800, liuqi (BA) wrote:
>
> Hi Will,
>
> > Hmm, I was hoping that you would expose all the events as proper perf_events
> > and get rid of the subevents entirely.
> >
> > Then userspace could do things like:
> >
> >   // Count number of RX memory reads
> >   $ perf stat -e hisi_pcie0_0/rx_memory_read/
> >
> >   // Count delay cycles
> >   $ perf stat -e hisi_pcie0_0/latency/
> >
> >   // Count both of the above (events must be in the same group)
> >   $ perf stat -g -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
> >
> > Note that in all three of these cases the hardware will be programmed in
> > the same way and both HISI_PCIE_CNT and HISI_PCIE_EXT_CNT are allocated!
> >
> > So for example, doing this (i.e. without the '-g'):
> >
> >   $ perf stat -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
> >
> > would fail because the first event would allocate both of the counters.
>
> I'm confused by this situation when getting rid of the subevents:
>
>   $ perf stat -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
>
> In this case, the driver checks the relationship of "latency" and
> "rx_memory_read" in the pmu->add() function and returns -EINVAL, but this
> seems to lead to time division multiplexing.
>
> 	if (event->pmu->add(event, PERF_EF_START)) {
> 		perf_event_set_state(event, PERF_EVENT_STATE_INACTIVE);
> 		event->oncpu = -1;
> 		ret = -EAGAIN;
> 		goto out;
> 	}
> 	...
> out:
> 	perf_pmu_enable(event->pmu);
>
> This result doesn't meet our expectation. Am I missing something here?

This is how perf works. If you don't want multiplexing, put the events in a
group. What's the problem with that?

Will
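
For readers who want to see what "put the events in a group" means at the
syscall level, here is a rough userspace sketch using perf_event_open(2). It
is only an illustration and not code from this thread. The helper names
(read_pmu_type, init_attr, perf_open, open_pair_as_group) are made up for the
example. The config values for "latency" and "rx_memory_read" are not given
in this thread, so the caller has to supply them; the sketch assumes they can
be looked up under /sys/bus/event_source/devices/<pmu>/events/. It also
assumes that the caller passes a CPU the PMU accepts.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Read the dynamic PMU type id exported in sysfs; returns -1 on failure. */
static int read_pmu_type(const char *pmu_name)
{
	char path[256];
	int type = -1;
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/bus/event_source/devices/%s/type", pmu_name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}

/*
 * Fill in a plain counting event. The group leader is created disabled so
 * that the whole group can be enabled together once every sibling exists.
 */
static void init_attr(struct perf_event_attr *attr, int type,
		      uint64_t config, int is_leader)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = type;
	attr->config = config;	/* hypothetical: caller-supplied event code */
	attr->disabled = is_leader ? 1 : 0;
}

/* Thin wrapper: glibc does not provide a perf_event_open() function. */
static int perf_open(struct perf_event_attr *attr, int cpu, int group_fd)
{
	/* pid == -1 with cpu >= 0 counts system-wide on that CPU. */
	return (int)syscall(SYS_perf_event_open, attr, -1, cpu, group_fd, 0UL);
}

/*
 * Open two events on the same PMU as one group. Passing the leader's fd as
 * group_fd for the second event is what makes them a group. Returns 0 on
 * success and -1 on any failure.
 */
static int open_pair_as_group(const char *pmu_name, uint64_t cfg_a,
			      uint64_t cfg_b, int cpu, int fds[2])
{
	struct perf_event_attr attr;
	int type = read_pmu_type(pmu_name);

	if (type < 0)
		return -1;

	init_attr(&attr, type, cfg_a, 1);
	fds[0] = perf_open(&attr, cpu, -1);	/* group leader */
	if (fds[0] < 0)
		return -1;

	init_attr(&attr, type, cfg_b, 0);
	fds[1] = perf_open(&attr, cpu, fds[0]);	/* sibling in leader's group */
	if (fds[1] < 0) {
		close(fds[0]);
		return -1;
	}
	return 0;
}
```

A caller would invoke open_pair_as_group("hisi_pcie0_0", cfg_latency,
cfg_rx_memory_read, cpu, fds), then start the group with
ioctl(fds[0], PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP) and read() each fd.
Events opened separately with group_fd == -1 are independent, so the core is
free to multiplex them. With the perf tool, a group can also be written using
the brace syntax, e.g. -e '{hisi_pcie0_0/latency/,hisi_pcie0_0/rx_memory_read/}'.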