Hi Matt and Badal, [...] > > +}; > > + > > +enum hwm_reg_operation { > > + reg_read, > > + reg_write, > > + reg_rmw, > > Upper case? > > > +}; > > + > > s/hwm_reg/xe_hwm_reg I agree with Matt here... throughout this series of patches the naming is very confusing, sometimes starting with xe_*, sometimes with hwm_*, sometimes with hwmon_*... there is no consistency. Please, Badal, choos a consistent prefix and stick with it for every function and global definition. Matt suggested xe_hwmon_*, I'm fine with it. [...] > > struct hwm_drvdata { > > struct xe_hwmon *hwmon; > > struct device *hwmon_dev; > > + struct xe_gt *gt; > > char name[12]; > > + bool reset_in_progress; > > + wait_queue_head_t waitq; > > }; > > > > struct xe_hwmon { > > struct hwm_drvdata ddat; > > struct mutex hwmon_lock; > > + int scl_shift_power; > > }; > > > > Same as previous patch, 1 struct seems like a better idea to me. I made the same comment in the previous patch. [...] > > + switch (reg_name) { > > + case pkg_rapl_limit: > > + if (xe->info.platform == XE_DG2) > > + return PCU_CR_PACKAGE_RAPL_LIMIT; > > + else if (xe->info.platform == XE_PVC) > > + return PVC_GT0_PACKAGE_RAPL_LIMIT; > > + break; > > + case pkg_power_sku: > > + if (xe->info.platform == XE_DG2) > > + return PCU_CR_PACKAGE_POWER_SKU; > > + else if (xe->info.platform == XE_PVC) > > + return PVC_GT0_PACKAGE_POWER_SKU; > > + break; > > + case pkg_power_sku_unit: > > + if (xe->info.platform == XE_DG2) > > + return PCU_CR_PACKAGE_POWER_SKU_UNIT; > > + else if (xe->info.platform == XE_PVC) > > + return PVC_GT0_PACKAGE_POWER_SKU_UNIT; > > + break; > > + default: > > BUG_ON or WARN_ON saying not possible? MISSING_CASE() is in i915_utils.h, perhaps we can move it to a more generic place... it would be at handy here. > > +static int hwm_power_max_read(struct hwm_drvdata *ddat, long *value) > > The return value is always 0, why not return value? > > s/hwm_power_max_read/xe_hwmon_power_max_read > > > +{ > > + struct xe_hwmon *hwmon = ddat->hwmon; > > + u32 reg_val; > > + u64 r, min, max; > > + > > + xe_device_mem_access_get(gt_to_xe(ddat->gt)); > > + > > Same as above, use xe_device_assert_mem_access. > > > + process_hwmon_reg(ddat, pkg_rapl_limit, reg_read, ®_val, 0, 0); > > + /* Check if PL1 limit is disabled */ > > + if (!(reg_val & PKG_PWR_LIM_1_EN)) { > > + *value = PL1_DISABLE; > > + xe_device_mem_access_put(gt_to_xe(ddat->gt)); > > + return 0; > > + } > > + > > + reg_val = REG_FIELD_GET(PKG_PWR_LIM_1, reg_val); > > + *value = mul_u64_u32_shr(reg_val, SF_POWER, hwmon->scl_shift_power); > > + > > + process_hwmon_reg_read64(ddat, pkg_power_sku, &r); > > + min = REG_FIELD_GET(PKG_MIN_PWR, r); > > + min = mul_u64_u32_shr(min, SF_POWER, hwmon->scl_shift_power); > > + max = REG_FIELD_GET(PKG_MAX_PWR, r); > > + max = mul_u64_u32_shr(max, SF_POWER, hwmon->scl_shift_power); > > + > > + if (min && max) > > + *value = clamp_t(u64, *value, min, max); > > + > > + xe_device_mem_access_put(gt_to_xe(ddat->gt)); > > + return 0; > > +} > > + > > +static int hwm_power_max_write(struct hwm_drvdata *ddat, long value) > > +{ > > + struct xe_hwmon *hwmon = ddat->hwmon; > > + DEFINE_WAIT(wait); > > + int ret = 0; > > + u32 nval; > > + > > + /* Block waiting for GuC reset to complete when needed */ > > + for (;;) { with a do...while() you shouldn't need a for(;;)... your choice, not going to beat on that. > > + mutex_lock(&hwmon->hwmon_lock); > > + > > + prepare_to_wait(&ddat->waitq, &wait, TASK_INTERRUPTIBLE); > > + > > + if (!hwmon->ddat.reset_in_progress) > > + break; > > + > > + if (signal_pending(current)) { > > + ret = -EINTR; > > + break; cough! cough! unlock! cough! cough! > > + } > > + > > + mutex_unlock(&hwmon->hwmon_lock); > > + > > + schedule(); > > + } > > + finish_wait(&ddat->waitq, &wait); > > + if (ret) > > + goto unlock; > > Anyway to not open code this? We similar in with > xe_guc_submit_reset_wait, could we expose a global reset in progress in > function which we can expose at the gt level? > > > + > > + xe_device_mem_access_get(gt_to_xe(ddat->gt)); > > + > > This certainly is an outer most thing, e.g. doing this under > hwmon->hwmon_lock seems dangerous. Again the upper levels of the stack > should do the xe_device_mem_access_get, which it does. Just pointing out > doing xe_device_mem_access_get/put under a lock isn't a good idea. > > Also the the loop which acquires hwmon->hwmon_lock is confusing too. > > > + /* Disable PL1 limit and verify, as limit cannot be disabled on all platforms */ > > + if (value == PL1_DISABLE) { > > + process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval, > > + PKG_PWR_LIM_1_EN, 0); > > + process_hwmon_reg(ddat, pkg_rapl_limit, reg_read, &nval, > > + PKG_PWR_LIM_1_EN, 0); > > + > > + if (nval & PKG_PWR_LIM_1_EN) > > + ret = -ENODEV; > > + goto exit; cough! cough! lock! cough! cough! > > + } > > + > > + /* Computation in 64-bits to avoid overflow. Round to nearest. */ > > + nval = DIV_ROUND_CLOSEST_ULL((u64)value << hwmon->scl_shift_power, SF_POWER); > > + nval = PKG_PWR_LIM_1_EN | REG_FIELD_PREP(PKG_PWR_LIM_1, nval); > > + > > + process_hwmon_reg(ddat, pkg_rapl_limit, reg_rmw, &nval, > > + PKG_PWR_LIM_1_EN | PKG_PWR_LIM_1, nval); > > +exit: > > + xe_device_mem_access_put(gt_to_xe(ddat->gt)); > > +unlock: > > + mutex_unlock(&hwmon->hwmon_lock); > > + mmhhh???... jokes apart this is so error prone that it will deadlock as soon as someone will think of editing this file :) It worried me already at the first part. Please, as Matt said, have a more linear locking here. [...] Andi