Re: [PATCH v1 0/3] Introduce per NUMA node memory error statistics

On Tue, Jan 17, 2023 at 10:34 AM Luck, Tony <tony.luck@xxxxxxxxx> wrote:
>
> > For SRAO signaled via **machine check exception**, my reading of the
> > current x86 MCE code is this:
> ...
> > 3) therefore, do_machine_check just skips kill_me_now or
> > kill_me_maybe, and directly goto out:
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/cpu/mce/core.c#n1539
>
> That does appear to be what we do. But it looks like a regression from older
> behavior. An SRAO machine check *ought* to call memory_failure() without
> the MF_ACTION_REQUIRED bit set in flags.
>
> -Tony
>

Oh, maybe an SRAO signaled via MCE still reaches memory_failure(), just
asynchronously, through this code path?

1. __mc_scan_banks => mce_log => mce_gen_pool_add + irq_work_queue(mce_irq_work)

2. mce_irq_work_cb => mce_schedule_work => schedule_work(&mce_work)

3. mce_work => mce_gen_pool_process
  => blocking_notifier_call_chain(&x86_mce_decoder_chain, 0, mce)
  => mce_uc_nb => uc_decode_notifier => memory_failure
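
To make the ordering in the three steps above concrete, here is a rough
user-space sketch of the hand-off. It is not kernel code: the names only
echo the functions listed above, and struct mce_sketch, the fixed-size
pool, the *_sketch functions, memory_failure_stub() and the 4 KiB
PAGE_SHIFT are all hypothetical stand-ins. It also assumes the notifier
only acts on action-optional records and passes flags == 0 (that is,
without MF_ACTION_REQUIRED).

```c
/* User-space sketch of the deferred path; NOT kernel code.
 * Every body below is a simplified, hypothetical stand-in. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE  8
#define PAGE_SHIFT 12          /* assumption: 4 KiB pages */

struct mce_sketch {            /* hypothetical, far smaller than struct mce */
    unsigned long long addr;   /* physical address reported by the bank */
    bool action_optional;      /* true for an SRAO-severity record */
};

struct mce_sketch pool[POOL_SIZE];
size_t pool_len;
bool irq_work_pending;
bool work_pending;
int memory_failure_calls;
int last_flags = -1;
unsigned long long last_pfn;

/* Stand-in for memory_failure(pfn, flags): only records the call. */
int memory_failure_stub(unsigned long long pfn, int flags)
{
    memory_failure_calls++;
    last_pfn = pfn;
    last_flags = flags;
    return 0;
}

/* Step 1: machine check context only queues the record and raises irq_work. */
bool mce_log_sketch(const struct mce_sketch *m)
{
    if (pool_len == POOL_SIZE)
        return false;          /* pool full: record dropped */
    pool[pool_len++] = *m;
    irq_work_pending = true;
    return true;
}

/* Step 2: the irq_work callback hands off to process context. */
void mce_irq_work_cb_sketch(void)
{
    if (!irq_work_pending)
        return;
    irq_work_pending = false;
    work_pending = true;
}

/* Step 3: the work item drains the pool and runs the "notifier",
 * which is the only place memory_failure is reached. */
void mce_work_sketch(void)
{
    if (!work_pending)
        return;
    work_pending = false;
    assert(pool_len <= POOL_SIZE);
    for (size_t i = 0; i < pool_len; i++) {
        /* Assumption: only action-optional records are handled here,
         * with flags == 0 (no MF_ACTION_REQUIRED). */
        if (pool[i].action_optional)
            memory_failure_stub(pool[i].addr >> PAGE_SHIFT, 0);
    }
    pool_len = 0;
}
```

The point of the sketch is only that nothing reaches memory_failure at
the time the record is logged; it happens later, once both deferred
stages have run.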



