On 10/31/2022 15:47, Rajat Jain wrote:
Hello,
On Mon, Oct 31, 2022 at 12:39 PM Limonciello, Mario
<Mario.Limonciello@xxxxxxx <mailto:Mario.Limonciello@xxxxxxx>> wrote:
>
> [Public]
>
> > -----Original Message-----
> > From: Sven van Ashbrook <svenva@xxxxxxxxxxxx
<mailto:svenva@xxxxxxxxxxxx>>
> > Sent: Friday, October 28, 2022 23:12
> > To: Rajneesh Bhardwaj <irenic.rajneesh@xxxxxxxxx
<mailto:irenic.rajneesh@xxxxxxxxx>>; Hans de Goede
> > <hdegoede@xxxxxxxxxx <mailto:hdegoede@xxxxxxxxxx>>
> > Cc: Limonciello, Mario <Mario.Limonciello@xxxxxxx
<mailto:Mario.Limonciello@xxxxxxx>>; LKML <linux-
> > kernel@xxxxxxxxxxxxxxx <mailto:kernel@xxxxxxxxxxxxxxx>>; S-k,
Shyam-sundar <Shyam-sundar.S-
> > k@xxxxxxx <mailto:k@xxxxxxx>>; rrangel@xxxxxxxxxxxx
<mailto:rrangel@xxxxxxxxxxxx>; platform-driver-
> > x86@xxxxxxxxxxxxxxx <mailto:x86@xxxxxxxxxxxxxxx>; Rajneesh Bhardwaj
<rajneesh.bhardwaj@xxxxxxxxx <mailto:rajneesh.bhardwaj@xxxxxxxxx>>;
> > Rafael J Wysocki <rjw@xxxxxxxxxxxxx <mailto:rjw@xxxxxxxxxxxxx>>;
Rajat Jain <rajatja@xxxxxxxxxx <mailto:rajatja@xxxxxxxxxx>>;
> > David E Box <david.e.box@xxxxxxxxx <mailto:david.e.box@xxxxxxxxx>>;
Mark Gross <markgross@xxxxxxxxxx <mailto:markgross@xxxxxxxxxx>>
> > Subject: Re: [PATCH v1] platform/x86: intel_pmc_core: promote S0ix
failure
> > warn() to WARN()
> >
> > On Thu, Oct 27, 2022 at 12:02 PM Rajneesh Bhardwaj
> > <irenic.rajneesh@xxxxxxxxx <mailto:irenic.rajneesh@xxxxxxxxx>> wrote:
> > > I'd advise against this promotion based on my experience with
S0ix entry
> > failures.
> >
> > On Thu, Oct 27, 2022 at 11:40 AM Hans de Goede <hdegoede@xxxxxxxxxx
<mailto:hdegoede@xxxxxxxxxx>>
> > wrote:
> > > I'm not a fan of the change you are suggesting here.
> >
> > Thanks everyone for the feedback. Looks like there is consensus
that it's
> > not advisable to promote the warning. We will move forward with
changes to
> > our monitoring infrastructure instead.
>
> Did you see the idea proposed by David Box to introduce some
infrastructure in
> the kernel for this?
>
> Just thinking about it a little bit more, it could be a lot nicer to
have something like:
>
> /sys/power/suspend_stats/last_hw_deepest_state
>
> During the resume process drivers such as amd_pmc and intel_pmc_core
could
> read the appropriate values for the hardware and call a function that
would
> populate it with either a "0" or "1" or maybe even the amount of time
spent in
> that state.
>
> We could then retire the debugging messages from both drivers
I do not think we should retire the debug messages. The sysfs
attribute could help us *trigger* a failure detection, but we would
still need these debug logs to actually determine why exactly we did
not go into the S0ix / deepest power state (And the debug messages
print out the debug register bit fields that let us know that).
Thanks,
I just spun together an RFC series for this idea and while doing it I
had the same realization. So I left the warning messages in place for
both drivers.
You can take a look at the series here:
https://lore.kernel.org/platform-driver-x86/20221031204320.22464-1-mario.limonciello@xxxxxxx/T/#m6c7db55c98b8a3ce8c48d451fc01c1d9b0df37fb
Thanks,