Re: [PATCH v2 2/3] mm: memory-failure: Bump memory failure stats to pglist_data

On Wed, Feb 1, 2023 at 10:56 PM Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> wrote:
>
>
>
> On 2023/1/20 11:46, Jiaqi Yan wrote:
> > Right before memory_failure finishes its handling, accumulate poisoned
> > page's resolution counters to pglist_data's memory_failure_stats, so as
> > to update the corresponding sysfs entries.
> >
> > Tested:
> > 1) Start an application to allocate memory buffer chunks
> > 2) Convert random memory buffer addresses to physical addresses
> > 3) Inject memory errors using EINJ at chosen physical addresses
> > 4) Access poisoned memory buffer and recover from SIGBUS
> > 5) Check counter values under
> >     /sys/devices/system/node/node*/memory_failure/*
> >
> > Acked-by: David Rientjes <rientjes@xxxxxxxxxx>
> > Signed-off-by: Jiaqi Yan <jiaqiyan@xxxxxxxxxx>
> > ---
> >   mm/memory-failure.c | 36 ++++++++++++++++++++++++++++++++++++
> >   1 file changed, 36 insertions(+)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index c628f1db3a4d..f4990839ea66 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1227,6 +1227,39 @@ static struct page_state error_states[] = {
> >   #undef slab
> >   #undef reserved
> >
> > +static void update_per_node_mf_stats(unsigned long pfn,
> > +                                  enum mf_result result)
> > +{
> > +     int nid = MAX_NUMNODES;
> > +     struct memory_failure_stats *mf_stats = NULL;
> > +
> > +     nid = pfn_to_nid(pfn);
> > +     if (unlikely(nid < 0 || nid >= MAX_NUMNODES)) {
> > +             WARN_ONCE(1, "Memory failure: pfn=%#lx, invalid nid=%d", pfn, nid);
> > +             return;
> > +     }
> > +
> ...
> > +     default:
> > +             WARN_ONCE(1, "Memory failure: mf_result=%d is not properly handled", result);
> > +             break;
> > +     }
>
> We already define pr_fmt, the "Memory failure:" prefix should be dropped.

"Should be dropped" because it would print a duplicated prefix? Does
WARN_ONCE automatically apply pr_fmt as well? From my reading of
__warn_printk, I don't think it does.

This is what I see in dmesg after adding
`WARN_ONCE(1, "Memory failure: pfn=%#lx\n", pfn);`
at the beginning of `update_per_node_mf_stats` (above
`nid = pfn_to_nid(pfn)`):

[  523.942688] ------------[ cut here ]------------
[  523.972026] Memory failure: pfn=0x309f8f3
[  523.972038] WARNING: CPU: 4 PID: 21119 at mm/memory-failure.c:1236 action_result+0xec/0x150
[  523.972044] Modules linked in: einj vfat fat i2c_mux_pca954x i2c_mux spidev cdc_acm xhci_pci xhci_hcd sha3_generic gq(O)
[  523.972054] CPU: 4 PID: 21119 Comm: usemem Tainted: G S M       O     6.2.0-smp-DEV #1
[  523.972059] RIP: 0010:action_result+0xec/0x150

The "Memory failure:" prefix is not duplicated.

I do realize, though, that I should probably add a trailing "\n" to the
format strings passed to WARN_ONCE.
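
To illustrate the pr_fmt point, here is a minimal userspace sketch (not
kernel code). It assumes, per my reading above, that pr_*-style macros
wrap the format in pr_fmt() at the call site while the WARN path hands
the format over unchanged. mock_pr_err, mock_warn, demo_pr_err and
demo_warn are hypothetical names made up for this example, and the
pr_fmt definition here is only a stand-in for the one in
mm/memory-failure.c.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for the file-local prefix definition (assumed shape). */
#define pr_fmt(fmt) "Memory failure: " fmt

/* pr_*-style mock: the format string goes through pr_fmt(). */
#define mock_pr_err(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), ##__VA_ARGS__)

/* WARN-style mock: the format string is passed through untouched. */
#define mock_warn(buf, len, fmt, ...) \
	snprintf(buf, len, fmt, ##__VA_ARGS__)

/* Format a message the way a pr_*-style macro would. */
const char *demo_pr_err(unsigned long pfn)
{
	static char buf[64];

	mock_pr_err(buf, sizeof(buf), "pfn=%#lx\n", pfn);
	return buf;
}

/* Format a message the way a WARN-style macro would. */
const char *demo_warn(unsigned long pfn)
{
	static char buf[64];

	mock_warn(buf, sizeof(buf), "pfn=%#lx\n", pfn);
	return buf;
}
```

With this sketch, demo_pr_err() yields a string that starts with
"Memory failure: " while demo_warn() does not, which is why the explicit
prefix in the WARN_ONCE format does not show up twice in the dmesg
output above.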
