Hi again, Ganapat, Thanks for the quick reply. On Fri, Jun 28, 2019 at 11:09:33AM +0530, Ganapatrao Kulkarni wrote: > On Thu, Jun 27, 2019 at 3:27 PM Will Deacon <will@xxxxxxxxxx> wrote: > > On Fri, Jun 14, 2019 at 05:42:46PM +0000, Ganapatrao Kulkarni wrote: > > > CCPI2 is a low-latency high-bandwidth serial interface for connecting > > > ThunderX2 processors. This patch adds support to capture CCPI2 perf events. > > > > > > Signed-off-by: Ganapatrao Kulkarni <gkulkarni@xxxxxxxxxxx> > > > --- > > > drivers/perf/thunderx2_pmu.c | 179 ++++++++++++++++++++++++++++++----- > > > 1 file changed, 157 insertions(+), 22 deletions(-) > > > > > > diff --git a/drivers/perf/thunderx2_pmu.c b/drivers/perf/thunderx2_pmu.c > > > index 43d76c85da56..3791ac9b897d 100644 > > > --- a/drivers/perf/thunderx2_pmu.c > > > +++ b/drivers/perf/thunderx2_pmu.c > > > @@ -16,7 +16,9 @@ > > > * they need to be sampled before overflow(i.e, at every 2 seconds). > > > */ > > > > > > -#define TX2_PMU_MAX_COUNTERS 4 > > > +#define TX2_PMU_DMC_L3C_MAX_COUNTERS 4 > > > > I find it odd that you rename this... > > i am not sure, how to avoid this since dmc/l3c have 4 counters and ccpi2 has 8. > i will try to make this better in v2. > > > > > +#define TX2_PMU_CCPI2_MAX_COUNTERS 8 > > > + > > > #define TX2_PMU_DMC_CHANNELS 8 > > > #define TX2_PMU_L3_TILES 16 > > > > > > @@ -28,11 +30,22 @@ > > > */ > > > #define DMC_EVENT_CFG(idx, val) ((val) << (((idx) * 8) + 1)) > > > > > > +#define GET_EVENTID_CCPI2(ev) ((ev->hw.config) & 0x1ff) > > > +/* bits[3:0] to select counters, starts from 8, bit[3] set always. */ > > > +#define GET_COUNTERID_CCPI2(ev) ((ev->hw.idx) & 0x7) > > > +#define CCPI2_COUNTER_OFFSET 8 > > > > > > ... but leave GET_EVENTID alone, even though it only applies to DMC/L3C > > events. Saying that, if you have the event you can figure out its type, > > so could you avoid the need for additional macros entirely and just use > > the correct masks based on the corresponding PMU type? That might also > > avoid some of the conditional control flow you're introducing elsewhere. > > sure, i will add mask as argument to macro. > > > > > #define L3C_COUNTER_CTL 0xA8 > > > #define L3C_COUNTER_DATA 0xAC > > > #define DMC_COUNTER_CTL 0x234 > > > #define DMC_COUNTER_DATA 0x240 > > > > > > +#define CCPI2_PERF_CTL 0x108 > > > +#define CCPI2_COUNTER_CTL 0x10C > > > +#define CCPI2_COUNTER_SEL 0x12c > > > +#define CCPI2_COUNTER_DATA_L 0x130 > > > +#define CCPI2_COUNTER_DATA_H 0x134 > > > + > > > /* L3C event IDs */ > > > #define L3_EVENT_READ_REQ 0xD > > > #define L3_EVENT_WRITEBACK_REQ 0xE > > > @@ -51,9 +64,16 @@ > > > #define DMC_EVENT_READ_TXNS 0xF > > > #define DMC_EVENT_MAX 0x10 > > > > > > +#define CCPI2_EVENT_REQ_PKT_SENT 0x3D > > > +#define CCPI2_EVENT_SNOOP_PKT_SENT 0x65 > > > +#define CCPI2_EVENT_DATA_PKT_SENT 0x105 > > > +#define CCPI2_EVENT_GIC_PKT_SENT 0x12D > > > +#define CCPI2_EVENT_MAX 0x200 > > > + > > > enum tx2_uncore_type { > > > PMU_TYPE_L3C, > > > PMU_TYPE_DMC, > > > + PMU_TYPE_CCPI2, > > > PMU_TYPE_INVALID, > > > }; > > > > > > @@ -73,8 +93,8 @@ struct tx2_uncore_pmu { > > > u32 max_events; > > > u64 hrtimer_interval; > > > void __iomem *base; > > > - DECLARE_BITMAP(active_counters, TX2_PMU_MAX_COUNTERS); > > > - struct perf_event *events[TX2_PMU_MAX_COUNTERS]; > > > + DECLARE_BITMAP(active_counters, TX2_PMU_CCPI2_MAX_COUNTERS); > > > + struct perf_event *events[TX2_PMU_DMC_L3C_MAX_COUNTERS]; > > > > Hmm, that looks very odd. Why are they now different sizes? > > events[ ] is used to hold reference to active events to use in timer > callback, which is not applicable to ccpi2, hence 4. > active_counters is set to max of both. i.e, 8. i will try to make it > better readable in v2. Thanks. I suspect renaming the field would help a lot, or perhaps reworking your data structures so that you have a union of ccpi2 and dmc/l2c structures where necessary. > > > struct device *dev; > > > struct hrtimer hrtimer; > > > const struct attribute_group **attr_groups; > > > @@ -92,7 +112,21 @@ static inline struct tx2_uncore_pmu *pmu_to_tx2_pmu(struct pmu *pmu) > > > return container_of(pmu, struct tx2_uncore_pmu, pmu); > > > } > > > > > > -PMU_FORMAT_ATTR(event, "config:0-4"); > > > +#define TX2_PMU_FORMAT_ATTR(_var, _name, _format) \ > > > +static ssize_t \ > > > +__tx2_pmu_##_var##_show(struct device *dev, \ > > > + struct device_attribute *attr, \ > > > + char *page) \ > > > +{ \ > > > + BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE); \ > > > + return sprintf(page, _format "\n"); \ > > > +} \ > > > + \ > > > +static struct device_attribute format_attr_##_var = \ > > > + __ATTR(_name, 0444, __tx2_pmu_##_var##_show, NULL) > > > + > > > +TX2_PMU_FORMAT_ATTR(event, event, "config:0-4"); > > > +TX2_PMU_FORMAT_ATTR(event_ccpi2, event, "config:0-9"); > > > > This doesn't look right. Can perf deal with overlapping fields? It seems > > wrong that we'd allow the user to specify both, for example. > > I am not sure what is the issue here? both are different PMUs > root@SBR-26> cat /sys/bus/event_source/devices/uncore_dmc_0/format/event > config:0-4 > root@SBR-26> cat /sys/bus/event_source/devices/uncore_ccpi2_0/format/event > config:0-9 Ah, sorry about that. I got _var and _name the wrong way around and thought you were introducing a file called event_ccpi2! What you have looks fine. Will