Hi Rui, On Wed, 2023-09-06 at 02:20 +0000, Zhang, Rui wrote: > Hi, Srinivas, > > Thanks for the patch. > > On Tue, 2023-09-05 at 17:06 -0700, Srinivas Pandruvada wrote: > > System runs at minimum performance, once powercap RAPL package > > domain > > "enabled" flag is toggled. > > > > Setting RAPL package domain enabled flag to 0, results in setting > > of > > power limit 4 (PL4) MSR 0x601 to 0. > > This is the bug. Setting enabled flag should only affect the Enable > bit > but not the other bits. > > > This implies disabling PL4 limit. > > The PL4 limit controls the peak power. This can significantly > > change > > the performance. Even worse, when the enabled flag is set to 1 > > again. > > This will set PL4 MSR value to 0x01, which means reduce peak power > > to > > 0.125W. This will force the system to run at the lowest possible > > performance. > > > > This is caused by a change which assumes that there is an enable > > bit > > in the PL4 MSR like other power limits. > > MSR RAPL doesn't have PL4 enable bit, but TPMI RAPL does have per > power > limit enable bit. That is correct, but not sure that in practice used or not. > > > > > In functions set_floor_freq_default() and rapl_remove_package(), > > call > > rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit > > 1 > > and 2. Similarly don't read PL_ENABLE for PL4 to check the presence > > of > > power limit 4. > > IMO, the rootcause is that we expose an non-exits PL4_ENABLE > primitive > for MSR Interface, and even worse we expose it wrongly that the > primitive uses the power limit bits. > > Commit 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits > support") pokes the MSR interface PL4_ENABLE primitive and exposes > this > bug. > > Sumeet is testing the below fix right now, and I suppose he will give > some update soon. > > thanks, > rui > > diff --git a/drivers/powercap/intel_rapl_common.c > b/drivers/powercap/intel_rapl_common.c > index 8fac57b28f8a..d407f918876f 100644 > --- a/drivers/powercap/intel_rapl_common.c > +++ b/drivers/powercap/intel_rapl_common.c > @@ -658,8 +658,6 @@ static struct rapl_primitive_info > rpi_msr[NR_RAPL_PRIMITIVES] = { > RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, > 0), > [PL2_CLAMP] = PRIMITIVE_INFO_INIT(PL2_CLAMP, > POWER_LIMIT2_CLAMP, 48, > RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, > 0), > - [PL4_ENABLE] = PRIMITIVE_INFO_INIT(PL4_ENABLE, > POWER_LIMIT4_MASK, 0, > - RAPL_DOMAIN_REG_PL4, ARBITRARY_UNIT, > 0), > [TIME_WINDOW1] = PRIMITIVE_INFO_INIT(TIME_WINDOW1, > TIME_WINDOW1_MASK, 17, > RAPL_DOMAIN_REG_LIMIT, TIME_UNIT, 0), > [TIME_WINDOW2] = PRIMITIVE_INFO_INIT(TIME_WINDOW2, > TIME_WINDOW2_MASK, 49, > > Let me try this, but this is not enough. The enable/disable is also gets checked for presence of PL4. Thanks, Srinivas >