Re: [PATCH 1/2] hwmon: Add amd_energy driver to report energy counters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Guenter

On Sat, 11 Apr 2020 at 22:37, Naveen Krishna Ch
<naveenkrishna.ch@xxxxxxxxx> wrote:
>
> Hi Guenter
>
> On Sat, 11 Apr 2020 at 21:18, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >
> > On 4/11/20 5:26 AM, Naveen Krishna Ch wrote:
> > > Hi Guenter
> > >
> > > I would be glad, If you could help me with the following query.
> > > On Fri, 10 Apr 2020 at 21:27, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> > >>
> > >> On 4/9/20 8:39 AM, Naveen Krishna Chatradhi wrote:
> > >>> This patch adds hwmon based amd_energy driver support for
> > >>> family 17h processors from AMD.
> > >>>
> > >>> The driver provides following interface to the userspace
> > >>> 1. Reports the per core and per socket energy consumption
> > >>>    via sysfs entries created per core and per socket respectively.
> > >>>     * per core energy consumed via "core_energy%d_input"
> > >>>     * package/socket energy consumed via "sock_energy%d_input".
> > >>> 2. Uses topology_max_packages() to get number of sockets.
> > >>> 3. Uses num_present_cpus() to get the number of CPUs.
> > >>> 4. Reports the energy units via energy_unit sysfs entry
> > >>>
> > >>> Cc: Guenter Roeck <linux@xxxxxxxxxxxx>
> > >>> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@xxxxxxx>
> > >>> ---
> > >>>  drivers/hwmon/Kconfig      |  10 ++
> > >>>  drivers/hwmon/Makefile     |   1 +
> > >>>  drivers/hwmon/amd_energy.c | 273 +++++++++++++++++++++++++++++++++++++++++++++
> > >>>  3 files changed, 284 insertions(+)
> > >>>  create mode 100644 drivers/hwmon/amd_energy.c
> > >>>
> > >>> diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
> > >>> index 05a30832c6ba..d83f1403b429 100644
> > >>> --- a/drivers/hwmon/Kconfig
> > >>> +++ b/drivers/hwmon/Kconfig
> > >>> @@ -324,6 +324,16 @@ config SENSORS_FAM15H_POWER
> > >>>         This driver can also be built as a module. If so, the module
> > >>>         will be called fam15h_power.
> > >>>
> > >>> +config SENSORS_AMD_ENERGY
> > >>> +     tristate "AMD RAPL MSR based Energy driver"
> > >>> +     depends on X86
> > >>> +     help
> > >>> +       If you say yes here you get support for core and package energy
> > >>> +       sensors, based on RAPL MSR for AMD family 17h and above CPUs.
> > >>> +
> > >>> +       This driver can also be built as a module. If so, the module
> > >>> +       will be called as amd_energy.
> > >>> +
> > >>>  config SENSORS_APPLESMC
> > >>>       tristate "Apple SMC (Motion sensor, light sensor, keyboard backlight)"
> > >>>       depends on INPUT && X86
> > >>> diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
> > >>> index b0b9c8e57176..318f89dc7133 100644
> > >>> --- a/drivers/hwmon/Makefile
> > >>> +++ b/drivers/hwmon/Makefile
> > >>> @@ -45,6 +45,7 @@ obj-$(CONFIG_SENSORS_ADT7411)       += adt7411.o
> > >>>  obj-$(CONFIG_SENSORS_ADT7462)        += adt7462.o
> > >>>  obj-$(CONFIG_SENSORS_ADT7470)        += adt7470.o
> > >>>  obj-$(CONFIG_SENSORS_ADT7475)        += adt7475.o
> > >>> +obj-$(CONFIG_SENSORS_AMD_ENERGY) += amd_energy.o
> > >>>  obj-$(CONFIG_SENSORS_APPLESMC)       += applesmc.o
> > >>>  obj-$(CONFIG_SENSORS_ARM_SCMI)       += scmi-hwmon.o
> > >>>  obj-$(CONFIG_SENSORS_ARM_SCPI)       += scpi-hwmon.o
> > >>> diff --git a/drivers/hwmon/amd_energy.c b/drivers/hwmon/amd_energy.c
> > >>> new file mode 100644
> > >>> index 000000000000..ddb7073ae39b
> > >>> --- /dev/null
> > >>> +++ b/drivers/hwmon/amd_energy.c
> > >>> @@ -0,0 +1,273 @@
> > >>> +// SPDX-License-Identifier: GPL-2.0-only
> > >>> +
> > >>> +/*
> > >>> + * Copyright (C) 2019 Advanced Micro Devices, Inc.
> > >>> + */
> > >>> +
> > >>> +#include <linux/kernel.h>
> > >>> +#include <linux/module.h>
> > >>> +#include <linux/slab.h>
> > >>> +#include <linux/types.h>
> > >>> +#include <linux/device.h>
> > >>> +#include <linux/sysfs.h>
> > >>> +#include <linux/cpu.h>
> > >>> +#include <linux/list.h>
> > >>> +#include <linux/hwmon.h>
> > >>> +#include <linux/hwmon-sysfs.h>
> > >>> +#include <linux/processor.h>
> > >>> +#include <linux/platform_device.h>
> > >>> +#include <linux/cpumask.h>
> > >>
> > >> Alphabetic include file order, please.
> > >>
> > >>> +
> > >>> +#include <asm/cpu_device_id.h>
> > >>> +
> > >>> +#define DRVNAME      "amd_energy"
> > >>> +
> > >>> +#define ENERGY_PWR_UNIT_MSR  0xC0010299
> > >>> +#define ENERGY_CORE_MSR      0xC001029A
> > >>> +#define ENERGY_PCK_MSR               0xC001029B
> > >>> +
> > >>> +#define AMD_TIME_UNIT_MASK   0xF0000
> > >>> +#define AMD_ENERGY_UNIT_MASK 0x01F00
> > >>> +#define AMD_POWER_UNIT_MASK  0x0000F
> > >>> +
> > >>> +#define ENERGY_STATUS_MASK   0xffffffff
> > >>> +
> > >>> +#define AMD_FAM_17           0x17 /* ZP, SSP */
> > >>> +
> > >>> +/* Useful macros */
> > >>> +#define AMD_CPU_FAM_ANY(_family, _model)     \
> > >>> +{                                            \
> > >>> +     .vendor         = X86_VENDOR_AMD,       \
> > >>> +     .family         = _family,              \
> > >>> +     .model          = _model,               \
> > >>> +     .feature        = X86_FEATURE_ANY,      \
> > >>> +}
> > >>> +
> > >>> +#define AMD_CPU_FAM(_model)                  \
> > >>> +     AMD_CPU_FAM_ANY(AMD_FAM_##_model, X86_MODEL_ANY)
> > >>> +
> > >>> +static struct device_attribute units_attr;
> > >>> +
> > >>> +struct sensor_data {
> > >>> +     unsigned int scale;
> > >>> +     union {
> > >>> +             unsigned int cpu_nr;
> > >>> +             unsigned int sock_nr;
> > >>> +     };
> > >>> +     struct device_attribute dev_attr_input;
> > >>> +     char input[32];
> > >>> +};
> > >>> +
> > >>> +struct amd_energy_sensors {
> > >>> +     struct sensor_data *data;
> > >>> +     struct attribute **attrs;
> > >>> +     struct attribute_group group;
> > >>> +     const struct attribute_group *groups[1];
> > >>
> > >> Even if acceptable, this would be wrong. The list of groups
> > >> ends with an empty entry, meaning this array would have to have
> > >> at least two entries (one for the terminator).
> > >>
> > >>> +};
> > >>> +
> > >>> +static ssize_t amd_units_u64_show(struct device *dev,
> > >>> +     struct device_attribute *dev_attr, char *buffer)
> > >>> +{
> > >>> +     uint64_t rapl_units;
> > >>> +     uint64_t energy_unit;
> > >>> +     int ret = 0;
> > >>> +
> > >>> +     ret = rdmsrl_safe(ENERGY_PWR_UNIT_MSR, &rapl_units);
> > >>> +     if (ret)
> > >>> +             return -EAGAIN;
> > >>> +
> > >>> +     energy_unit = (rapl_units & AMD_ENERGY_UNIT_MASK) >> 8;
> > >>> +
> > >>> +     return snprintf(buffer, 24, "%llu\n", energy_unit);
> > >>> +}
> > >>> +
> > >>> +static ssize_t amd_core_u64_show(struct device *dev,
> > >>> +             struct device_attribute *dev_attr, char *buffer)
> > >>> +{
> > >>> +     unsigned long long value;
> > >>> +     struct sensor_data *sensor;
> > >>> +     int ret = 0;
> > >>> +
> > >>> +     sensor = container_of(dev_attr, struct sensor_data, dev_attr_input);
> > >>> +     ret = rdmsrl_safe_on_cpu(sensor->cpu_nr, ENERGY_CORE_MSR, &value);
> > >>> +     if (ret)
> > >>> +             return -EAGAIN;
> > >>> +
> > >>> +     return snprintf(buffer, 24, "%llu\n", value);
> > >>
> > >> It seems to me this reports raw values. This is unacceptable. Values need
> > >> to be scaled to match the ABI. Energy is reported in microJoule.
> > >>
> > >>> +}
> > >>> +
> > >>> +static ssize_t amd_sock_u64_show(struct device *dev,
> > >>> +             struct device_attribute *dev_attr, char *buffer)
> > >>> +{
> > >>> +     unsigned long long value;
> > >>> +     struct sensor_data *sensor;
> > >>> +     int ret = 0;
> > >>> +     int cpu, cpu_nr;
> > >>> +
> > >>> +     sensor = container_of(dev_attr, struct sensor_data, dev_attr_input);
> > >>> +
> > >>> +     for_each_possible_cpu(cpu) {
> > >>> +             struct cpuinfo_x86 *c = &cpu_data(cpu);
> > >>> +
> > >>> +             if (c->initialized && c->logical_die_id == sensor->sock_nr) {
> > >>> +                     cpu_nr = cpu;
> > >>> +                     break;
> > >>> +             }
> > >>> +             cpu_nr = 0;
> > >>> +     }
> > >>> +
> > >>> +     ret = rdmsrl_safe_on_cpu(cpu_nr, ENERGY_PCK_MSR, &value);
> > >>> +     if (ret)
> > >>> +             return -EAGAIN;
> > >>> +
> > >>> +     return snprintf(buffer, 24, "%llu\n", value);
> > >>> +}
> > >>> +
> > >>> +static int amd_energy_probe(struct platform_device *pdev)
> > >>> +{
> > >>> +     struct amd_energy_sensors *amd_sensors;
> > >>> +     struct device *hwdev, *dev = &pdev->dev;
> > >>> +     int nr_cpus, nr_socks, idx = 0;
> > >>> +
> > >>> +     nr_cpus = num_present_cpus();
> > >>> +     nr_socks = topology_max_packages();
> > >>> +
> > >>> +     amd_sensors = devm_kzalloc(dev, sizeof(*amd_sensors), GFP_KERNEL);
> > >>> +     if (!amd_sensors)
> > >>> +             return -ENOMEM;
> > >>> +
> > >>> +     amd_sensors->data = devm_kcalloc(dev, nr_cpus + nr_socks,
> > >>> +                             sizeof(*amd_sensors->data), GFP_KERNEL);
> > >>> +     if (!amd_sensors->data)
> > >>> +             return -ENOMEM;
> > >>> +
> > >>> +     amd_sensors->attrs = devm_kcalloc(dev, nr_cpus + nr_socks,
> > >>> +                             sizeof(*amd_sensors->attrs), GFP_KERNEL);
> > >>> +     if (!amd_sensors->attrs)
> > >>> +             return -ENOMEM;
> > >>> +
> > >>> +     /* populate attributes for number of cpus. */
> > >>> +     for (idx = 0; idx < ; idx++) {
> > >>> +             struct sensor_data *sensor = &amd_sensors->data[idx];
> > >>> +
> > >>> +             snprintf(sensor->input, sizeof(sensor->input),
> > >>> +                             "core_energy%d_input", idx);
> > >>> +
> > >>
> > >> This is unacceptable. Please use standard attributes (energyX_input).
> > >> If you want to report what the sensor is for, use labels.
> > >>
> > >>> +             sensor->dev_attr_input.attr.mode = 0444;
> > >>> +             sensor->dev_attr_input.attr.name = sensor->input;
> > >>> +             sensor->dev_attr_input.show = amd_core_u64_show;
> > >>> +
> > >>> +             sensor->cpu_nr = idx;
> > >>> +             amd_sensors->attrs[idx] = &sensor->dev_attr_input.attr;
> > >>> +
> > >>> +             sysfs_attr_init(amd_sensors->attrs[idx]);
> > >>> +     }
> > >>> +
> > >>> +     /* populate attributes for number of sockets. */
> > >>> +     for (idx = 0; idx < nr_socks; idx++) {
> > >>> +             struct sensor_data *sensor =
> > >>> +                     &amd_sensors->data[nr_cpus + idx];
> > >>> +
> > >>> +             snprintf(sensor->input,
> > >>> +                     sizeof(sensor->input), "sock_energy%d_input", idx);
> > >>> +
> > >>> +             sensor->dev_attr_input.attr.mode = 0444;
> > >>> +             sensor->dev_attr_input.attr.name = sensor->input;
> > >>> +             sensor->dev_attr_input.show = amd_sock_u64_show;
> > >>> +
> > >>> +             sensor->sock_nr = idx;
> > >>> +             amd_sensors->attrs[nr_cpus + idx] =
> > >>> +                     &sensor->dev_attr_input.attr;
> > >>> +
> > >>> +             sysfs_attr_init(amd_sensors->attrs[nr_cpus + idx]);
> > >>> +     }
> > >>
> > >> This all makes me wonder what is reported for inactive/disabled CPUs.
> > >>
> > >>> +
> > >>> +     amd_sensors->group.attrs = amd_sensors->attrs;
> > >>> +     amd_sensors->groups[0] = &amd_sensors->group;
> > >>> +
> > >>> +     platform_set_drvdata(pdev, amd_sensors);
> > >>> +
> > >>> +     hwdev = devm_hwmon_device_register_with_groups(dev,
> > >>> +                     "amd_energy", amd_sensors, amd_sensors->groups);
> > >>
> > >> Please rework the driver to use the devm_hwmon_device_register_with_info() API.
> > > Our current, platform has 64 cores per socket and 2 such sockets on a
> > > board. There is a core energy counter register for each core and
> > > Package counter for each socket.  The topology varies from platform to
> > > platform.
> > >
> > >
> > > To keep this driver reusable for all platforms. I think, we need to
> > > define the hwmon_chip_info and channel_info structures dynamically
> > > (hwmon_chip_info has const variables). Is there a way to define a
> > > pre-processor statement which can create an array for a given
> > > platform.
> > >
> > > Could you suggest me a way to defining these hwmon_chip_info and
> > > channel array dynamically ?  This reason a reason for me to use groups
> > > instead of the chip_info.
> >
> > The approach used by the tmp421 driver should work. 'const'
> > doesn't apply to the pointers, but to the data they contain.
> > But that doesn't mean the data itself has to be const. You
> > can still assign the pointers dynamically. The  habanalabs
> > driver does it fully dynamically (I did not review that code,
> > so it may be a bit messy).
> Thank you very much for the pointer. I will go through the code.
I've the few more questions

Constraint:
The platform has 2 sets of energy sensors, one at core level, one at
socket level.
If i populate the chip_info type as "hwmon_energy" for both sensors.
The naming of the sysfs entries are going to be continuous and the
user application
should read the label to identify which entry belongs to which sensor set.

Possible solutions :
I could think of following ways to avoid this
1. Create 2 different hwmon devices
2. Use "hwmon_in" as type for one of the sensor sets and
"hwmon_energy" for other.
3. Use "groups" for one of the sensor sets and populate the other via chip.
What do you suggest ?

Also, I noticed that the sysfs filename index for the hwmon_energy
type is starting with 1,
while hwmon_in starts with 0. Was this a design choice ?
Ex: energy1_input

Your suggestions would be helpful.


> >
> > Guenter
>
>
>
> --
> Shine bright,
> (: Nav :)



--
Shine bright,
(: Nav :)



[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux