While porting OpenBMC to a new platform with a Xeon Gold 6314U CPU
(Ice Lake, 32 cores), I discovered that the core numbering used by the
PECI interface appears to correspond to the cores that are present in
the physical silicon, rather than those that are actually enabled and
usable by the host OS (i.e. it includes cores that the chip was
manufactured with but later had fused off).

Thus far the cputemp driver has transparently exposed that numbering
to userspace in its 'tempX_label' sysfs files, making the core numbers
it reported not align with the core numbering used by the host system,
which seems like an unfortunate source of confusion.

We can instead use a separate counter to label the cores in a
contiguous fashion (0 through numcores-1) so that the core numbering
reported by the PECI cputemp driver matches the numbering seen by the
host.

Signed-off-by: Zev Weiss <zev@xxxxxxxxxxxxxxxxx>
---

Offhand I can't think of any other examples of side effects of that
manufacturing detail (fused-off cores) leaking out in
externally-visible ways, so I'd think it's probably not something we
really want to propagate further.

I've verified that at least on the system I'm working on the numbering
provided by this patch aligns with the host's CPU numbering (loaded
each core individually one by one and saw a corresponding temperature
increase visible via PECI), but I'm not sure if that relationship is
guaranteed to hold on all parts -- Iwona, do you know if that's
something we can rely on?

This patch also leaves the driver's internal core tracking with the
"physical" numbering the PECI interface uses, and hence it's still
sort of visible to userspace in the form of the hwmon channel numbers
used in the names of the sysfs attribute files.  If desired we could
also change that to keep the tempX_* file numbers contiguous as well,
though it would necessitate a bit of additional remapping in the
driver to translate between the two.
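
As a rough illustration of the relabeling described above, here is a
standalone userspace sketch (not driver code).  The helper name
label_cores(), the LABEL_LEN constant and the plain 64-bit mask are
assumptions made for the example; the driver itself uses a kernel
bitmap and devm_kasprintf().

```c
#include <stdint.h>
#include <stdio.h>

#define LABEL_LEN 16	/* hypothetical label buffer size for this sketch */

/*
 * Hypothetical helper, not part of the driver: for each set bit in
 * core_mask (the PECI "physical" core index), write a label that uses a
 * separate contiguous counter.  labels must have room for 64 entries;
 * entries for clear bits are left as empty strings.
 * Returns the number of enabled cores.
 */
static int label_cores(uint64_t core_mask, char labels[][LABEL_LEN])
{
	int i, corenum = 0;

	for (i = 0; i < 64; i++) {
		labels[i][0] = '\0';
		if (core_mask & (UINT64_C(1) << i))
			snprintf(labels[i], LABEL_LEN, "Core %d", corenum++);
	}

	return corenum;
}
```

With a mask of 0x2d (bits 0, 2, 3 and 5 set), the entries at indices 0,
2, 3 and 5 are labeled "Core 0" through "Core 3", while the array index
still follows the physical numbering, which mirrors how the patch leaves
the hwmon channel numbers untouched.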
 drivers/hwmon/peci/cputemp.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/hwmon/peci/cputemp.c b/drivers/hwmon/peci/cputemp.c
index 30850a479f61..6b4010cbbfdf 100644
--- a/drivers/hwmon/peci/cputemp.c
+++ b/drivers/hwmon/peci/cputemp.c
@@ -400,14 +400,15 @@ static int init_core_mask(struct peci_cputemp *priv)
 static int create_temp_label(struct peci_cputemp *priv)
 {
 	unsigned long core_max = find_last_bit(priv->core_mask, CORE_NUMS_MAX);
-	int i;
+	int i, corenum = 0;
 
 	priv->coretemp_label = devm_kzalloc(priv->dev, (core_max + 1) * sizeof(char *), GFP_KERNEL);
 	if (!priv->coretemp_label)
 		return -ENOMEM;
 
 	for_each_set_bit(i, priv->core_mask, CORE_NUMS_MAX) {
-		priv->coretemp_label[i] = devm_kasprintf(priv->dev, GFP_KERNEL, "Core %d", i);
+		priv->coretemp_label[i] = devm_kasprintf(priv->dev, GFP_KERNEL,
+							 "Core %d", corenum++);
 		if (!priv->coretemp_label[i])
 			return -ENOMEM;
 	}
-- 
2.39.1.236.ga8a28b9eace8