Re: [PATCH V2] apei/ghes: fix ghes_poll_func by registering in non-deferrable mode

On Thu, Jan 9, 2020 at 10:50 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Wed, Jan 08, 2020 at 09:17:38AM -0800, Bhaskar Upadhaya wrote:
> > Currently Linux registers ghes_poll_func with the TIMER_DEFERRABLE flag,
> > because of which it is serviced when the CPU eventually wakes up with a
> > subsequent non-deferrable timer and not at the configured polling interval.
> >
> > For polling mode, the polling interval configured by firmware should not
> > be exceeded as per the ACPI 6.3 spec (refer to Table 18-394), so the timer
> > needs to be configured in non-deferrable mode by removing the
> > TIMER_DEFERRABLE flag. With NO_HZ enabled and the timer callback configured
> > in non-deferrable mode, the timer callback will get called exactly after
> > the polling interval.
> >
> > Definition of poll interval as per spec (referred ACPI 6.3):
> > "Indicates the poll interval in milliseconds OSPM should use to
> > periodically check the error source for the presence of an error
> > condition"
> >
> > We are observing an issue on our ThunderX2 platforms wherein
> > ghes_poll_func is not called within the poll interval when the timer is
> > configured with the TIMER_DEFERRABLE flag (for NO_HZ kernels), and hence
> > we are losing error records.
> >
> > Impact of removing the TIMER_DEFERRABLE flag:
> > - With NO_HZ enabled, additional timer ticks and unnecessary wakeups of
> >  the CPU happen exactly after the polling interval.
> >
> > - If the polling interval is too small, then the polling function will be
> >  called too frequently, which may stall the CPU.
>
> If that becomes a problem, the polling interval setting should be fixed
> to filter too small values.
>
> Anyway, I went and streamlined your commit message:
>
>     apei/ghes: Do not delay GHES polling
>
>     Currently, the ghes_poll_func() timer callback is registered with the
>     TIMER_DEFERRABLE flag. Thus, it is run when the CPU eventually wakes
>     up together with a subsequent non-deferrable timer and not at the precisely
>     configured polling interval.
>
>     For polling mode, the polling interval configured by firmware should not
>     be exceeded according to the ACPI spec 6.3, Table 18-394. The definition
>     of the polling interval is:
>
>     "Indicates the poll interval in milliseconds OSPM should use to
>     periodically check the error source for the presence of an error
>     condition."
>
>     If this interval is extended due to the timer callback deferring, error
>     records can get lost, which we are observing on our ThunderX2 platforms.
>
>     Therefore, remove the TIMER_DEFERRABLE flag so that the timer callback
>     executes at the precise interval.
>
> and made it more readable, hopefully.
>
> Rafael, pls fixup when applying.

Done.

> With that:
>
> Acked-by: Borislav Petkov <bp@xxxxxxx>

Thanks!
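
For readers of the archive, the behaviour described in the commit message
can be illustrated with a small user-space model. This is only a sketch
under stated assumptions, not kernel code: the function names and the
wakeup list are hypothetical, times are in milliseconds, and the model
assumes the CPU is idle with NO_HZ, so a deferrable timer runs only at the
next wakeup caused by something else.

```c
/*
 * Hypothetical user-space model of timer expiry on an idle NO_HZ CPU.
 * None of these names exist in the kernel; they only illustrate the
 * difference the TIMER_DEFERRABLE flag makes to the poll timing.
 */

/* A non-deferrable timer wakes the CPU itself, so it runs at its expiry. */
static long fire_time_non_deferrable(long expiry_ms)
{
    return expiry_ms;
}

/*
 * A deferrable timer does not wake an idle CPU. It runs at the first
 * wakeup (sorted ascending in wakeups_ms) at or after its expiry.
 * Returns -1 if no such wakeup is listed.
 */
static long fire_time_deferrable(long expiry_ms, const long *wakeups_ms, int n)
{
    for (int i = 0; i < n; i++) {
        if (wakeups_ms[i] >= expiry_ms)
            return wakeups_ms[i];
    }
    return -1;
}
```

With assumed wakeups at 40, 350 and 900 ms and a 100 ms poll interval, the
deferrable model runs the callback at 350 ms, which is 250 ms past the
configured interval, while the non-deferrable model runs it at 100 ms.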


