Hi Will,
Hmm, I was hoping that you would expose all the events as proper perf_events
and get rid of the subevents entirely.
Then userspace could do things like:
// Count number of RX memory reads
$ perf stat -e hisi_pcie0_0/rx_memory_read/
// Count delay cycles
$ perf stat -e hisi_pcie0_0/latency/
// Count both of the above (events must be in the same group)
$ perf stat -g -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
Note that in all three of these cases the hardware will be programmed in
the same way and both HISI_PCIE_CNT and HISI_PCIE_EXT_CNT are allocated!
So for example, doing this (i.e. without the '-g'):
$ perf stat -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
would fail because the first event would allocate both of the counters.
I'm confused about this case once subevents are removed:
$ perf stat -e hisi_pcie0_0/latency/ -e hisi_pcie0_0/rx_memory_read/
In this case, the driver checks the relationship between "latency" and
"rx_memory_read" in the pmu->add() function and returns -EINVAL, but this
seems to lead to time-division multiplexing.
	if (event->pmu->add(event, PERF_EF_START)) {
		perf_event_set_state(event, PERF_EVENT_STATE_INACTIVE);
		event->oncpu = -1;
		ret = -EAGAIN;
		goto out;
	}
	...
out:
	perf_pmu_enable(event->pmu);
This result doesn't meet our expectation; am I missing something here?
How about adding an array to record events and checking the relationship in
the event_init() function? It seems that perf stat can only fail when the
driver returns an error from the pmu->event_init() function.
Thanks,
Qi
All you need to do is check the counter scheduling constraints when
accepting an event group in the driver. No need for subevents at all.
Does that make sense?
Will
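
A minimal userspace sketch of the kind of group check suggested above, not
the actual driver code. It assumes a hypothetical model in which the PMU has
a single CNT/EXT_CNT counter pair and each event carries a made-up
"pair_cfg" value describing how that pair would have to be programmed.
Events whose pair_cfg matches can share the pair (as "latency" and
"rx_memory_read" are said to do above); a group that needs more distinct
configurations than there are pairs is rejected. The names sketch_event,
pair_cfg, NR_COUNTER_PAIRS and group_fits are illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: number of CNT/EXT_CNT counter pairs the PMU offers. */
#define NR_COUNTER_PAIRS 1

/*
 * Hypothetical stand-in for a perf event. pair_cfg encodes how a counter
 * pair must be programmed to serve this event; it is not a real field.
 */
struct sketch_event {
	unsigned int pair_cfg;
};

/*
 * Pretend to schedule every event of a group at once. Return true if the
 * whole group fits on the available counter pairs, false otherwise.
 */
static bool group_fits(const struct sketch_event *events, size_t n)
{
	unsigned int claimed[NR_COUNTER_PAIRS];
	size_t used = 0;

	for (size_t i = 0; i < n; i++) {
		size_t j;

		/* An event can share a pair that is already programmed its way. */
		for (j = 0; j < used; j++)
			if (claimed[j] == events[i].pair_cfg)
				break;
		if (j < used)
			continue;

		/* Otherwise it needs a fresh pair; fail if none is left. */
		if (used == NR_COUNTER_PAIRS)
			return false;
		claimed[used++] = events[i].pair_cfg;
	}
	return true;
}
```

In a real driver, a check along these lines would presumably run from
pmu->event_init(), walking event->group_leader and its siblings (for
example with for_each_sibling_event()) and returning -EINVAL when the
group cannot fit, so that the failure is reported when the event is
created rather than surfacing later from pmu->add().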