On Mon, Apr 3, 2023 at 8:07 PM Limonciello, Mario <mario.limonciello@xxxxxxx> wrote:
>
> On 4/3/2023 13:00, Box, David E wrote:
> > On Fri, 2023-03-31 at 20:05 +0200, Rafael J. Wysocki wrote:
> >> On Thu, Mar 30, 2023 at 9:45 PM Mario Limonciello
> >> <mario.limonciello@xxxxxxx> wrote:
> >>>
> >>> intel_pmc_core displays a warning when the module parameter
> >>> `warn_on_s0ix_failures` is set and a suspend didn't get to a HW sleep
> >>> state.
> >>>
> >>> Report this to the standard kernel reporting infrastructure so that
> >>> userspace software can query after the suspend cycle is done.
> >>>
> >>> Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
> >>> ---
> >>> v4->v5:
> >>>  * Reword commit message
> >>> ---
> >>>  drivers/platform/x86/intel/pmc/core.c | 2 ++
> >>>  1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c
> >>> index e2f171fac094..980af32dd48a 100644
> >>> --- a/drivers/platform/x86/intel/pmc/core.c
> >>> +++ b/drivers/platform/x86/intel/pmc/core.c
> >>> @@ -1203,6 +1203,8 @@ static inline bool pmc_core_is_s0ix_failed(struct pmc_dev *pmcdev)
> >>>         if (pmc_core_dev_state_get(pmcdev, &s0ix_counter))
> >>>                 return false;
> >>>
> >>> +       pm_set_hw_sleep_time(s0ix_counter - pmcdev->s0ix_counter);
> >>> +
> >>
> >> Maybe check if this is really accumulating? In case of a counter
> >> overflow, for instance?
> >
> > Overflow is likely on some systems. The counter is only 32-bit and at our
> > smallest granularity of 30.5us per tick it could overflow after a day and a half
> > of s0ix time, though most of our systems have a higher granularity that puts
> > them around 6 days.
> >
> > This brings up an issue that the attribute cannot be trusted if the system is
> > suspended for longer than the maximum hardware counter time. Should be noted in
> > the Documentation.
>
> I think it would be rather confusing for userspace to have to account for
> this; it's better to abstract it in the kernel.
>
> How can you discover the granularity a system can support?
> How would you know overflow actually happened? Is there a bit somewhere
> else that could tell you?

I'm not really sure if there is a generally usable overflow detection for this.

> In terms of ABI, how about when we know overflow occurred and userspace
> reads the sysfs file, we return -EOVERFLOW instead of a potentially bad
> value?

Even if the new value is greater than the old one, you don't really know
whether or not an overflow has taken place in the meantime.

So I would just document the fact that the underlying HW/firmware counter
overflows, as suggested by Dave.
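
For reference, a minimal standalone sketch of the wraparound arithmetic under
discussion, assuming a free-running 32-bit counter at 30.5us per tick as Dave
describes (2^32 ticks at 30.5us is roughly 131,000 seconds, i.e. about a day
and a half). The helper name s0ix_ticks_elapsed and the sample values below
are hypothetical illustrations, not part of the patch:

#include <stdint.h>
#include <stdio.h>

/* Unsigned subtraction absorbs at most one wraparound of the counter. */
static uint64_t s0ix_ticks_elapsed(uint32_t before, uint32_t after)
{
        return (uint32_t)(after - before);
}

int main(void)
{
        /* Example: the counter wrapped once between the two reads. */
        uint32_t before = 0xFFFFFFF0u;
        uint32_t after  = 0x00000010u;
        uint64_t ticks  = s0ix_ticks_elapsed(before, after);

        /* At 30.5us per tick this is 32 ticks, i.e. 976us of residency. */
        printf("elapsed: %llu ticks (~%.1f us)\n",
               (unsigned long long)ticks, (double)ticks * 30.5);

        return 0;
}

Unsigned subtraction handles exactly one wrap, but a suspend longer than one
full counter period wraps the counter more than once and nothing in the value
itself reveals that, which is why documenting the limitation rather than
trying to detect it is the conclusion reached above.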