> -----Original Message----- > From: srinivas pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx> > Sent: Wednesday, September 6, 2023 8:06 PM > To: Zhang, Rui <rui.zhang@xxxxxxxxx>; rafael@xxxxxxxxxx > Cc: linux-pm@xxxxxxxxxxxxxxx; Pawnikar, Sumeet R > <sumeet.r.pawnikar@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; > stable@xxxxxxxxxxxxxxx > Subject: Re: [PATCH] powercap: intel_rapl: Fix setting of Power Limit 4 to 0 > > Hi Rui, > > On Wed, 2023-09-06 at 02:20 +0000, Zhang, Rui wrote: > > Hi, Srinivas, > > > > Thanks for the patch. > > > > On Tue, 2023-09-05 at 17:06 -0700, Srinivas Pandruvada wrote: > > > System runs at minimum performance, once powercap RAPL package > > > domain "enabled" flag is toggled. > > > > > > Setting RAPL package domain enabled flag to 0, results in setting of > > > power limit 4 (PL4) MSR 0x601 to 0. > > > > This is the bug. Setting enabled flag should only affect the Enable > > bit but not the other bits. > > > > > This implies disabling PL4 limit. > > > The PL4 limit controls the peak power. This can significantly change > > > the performance. Even worse, when the enabled flag is set to 1 > > > again. > > > This will set PL4 MSR value to 0x01, which means reduce peak power > > > to 0.125W. This will force the system to run at the lowest possible > > > performance. > > > > > > This is caused by a change which assumes that there is an enable bit > > > in the PL4 MSR like other power limits. > > > > MSR RAPL doesn't have PL4 enable bit, but TPMI RAPL does have per > > power limit enable bit. > That is correct, but not sure that in practice used or not. > > > > > > > > > In functions set_floor_freq_default() and rapl_remove_package(), > > > call > > > rapl_write_pl_data with PL_ENABLE and PL_CLAMP for only power limit > > > 1 > > > and 2. Similarly don't read PL_ENABLE for PL4 to check the presence > > > of > > > power limit 4. > > > > IMO, the rootcause is that we expose an non-exits PL4_ENABLE > > primitive > > for MSR Interface, and even worse we expose it wrongly that the > > primitive uses the power limit bits. > > > > Commit 9050a9cd5e4c ("powercap: intel_rapl: Cleanup Power Limits > > support") pokes the MSR interface PL4_ENABLE primitive and exposes > > this > > bug. > > > > Sumeet is testing the below fix right now, and I suppose he will give > > some update soon. > > Testing still going on. > > thanks, > > rui > > > > diff --git a/drivers/powercap/intel_rapl_common.c > > b/drivers/powercap/intel_rapl_common.c > > index 8fac57b28f8a..d407f918876f 100644 > > --- a/drivers/powercap/intel_rapl_common.c > > +++ b/drivers/powercap/intel_rapl_common.c > > @@ -658,8 +658,6 @@ static struct rapl_primitive_info > > rpi_msr[NR_RAPL_PRIMITIVES] = { > > RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, > > 0), > > [PL2_CLAMP] = PRIMITIVE_INFO_INIT(PL2_CLAMP, > > POWER_LIMIT2_CLAMP, 48, > > RAPL_DOMAIN_REG_LIMIT, ARBITRARY_UNIT, > > 0), > > - [PL4_ENABLE] = PRIMITIVE_INFO_INIT(PL4_ENABLE, > > POWER_LIMIT4_MASK, 0, > > - RAPL_DOMAIN_REG_PL4, ARBITRARY_UNIT, > > 0), > > [TIME_WINDOW1] = PRIMITIVE_INFO_INIT(TIME_WINDOW1, > > TIME_WINDOW1_MASK, 17, > > RAPL_DOMAIN_REG_LIMIT, TIME_UNIT, 0), > > [TIME_WINDOW2] = PRIMITIVE_INFO_INIT(TIME_WINDOW2, > > TIME_WINDOW2_MASK, 49, > > > > > Let me try this, but this is not enough. The enable/disable is also > gets checked for presence of PL4. > Yes, we need the check as well. Thanks, Sumeet. > Thanks, > Srinivas > > >