Re: Re: [PATCH v5 1/1] mm/memory-failure: disable unpoison once hw error happens

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 6/15/22 12:23, Oscar Salvador wrote:
On Wed, Jun 15, 2022 at 10:00:05AM +0800, zhenwei pi wrote:
Currently unpoison_memory(unsigned long pfn) is designed for soft
poison(hwpoison-inject) only. Since 17fae1294ad9d, the KPTE gets
cleared on a x86 platform once hardware memory corrupts.

Unpoisoning a hardware corrupted page puts page back buddy only,
the kernel has a chance to access the page with *NOT PRESENT* KPTE.
This leads BUG during accessing on the corrupted KPTE.

Suggested by David&Naoya, disable unpoison mechanism when a real HW error
happens to avoid BUG like this:

  Unpoison: Software-unpoisoned page 0x61234
  BUG: unable to handle page fault for address: ffff888061234000
  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 2c01067 P4D 2c01067 PUD 107267063 PMD 10382b063 PTE 800fffff9edcb062
  Oops: 0002 [#1] PREEMPT SMP NOPTI
  CPU: 4 PID: 26551 Comm: stress Kdump: loaded Tainted: G   M       OE     5.18.0.bm.1-amd64 #7
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ...
  RIP: 0010:clear_page_erms+0x7/0x10
  Code: ...
  RSP: 0000:ffffc90001107bc8 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000000901 RCX: 0000000000001000
  RDX: ffffea0001848d00 RSI: ffffea0001848d40 RDI: ffff888061234000
  RBP: ffffea0001848d00 R08: 0000000000000901 R09: 0000000000001276
  R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001
  R13: 0000000000000000 R14: 0000000000140dca R15: 0000000000000001
  FS:  00007fd8b2333740(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff888061234000 CR3: 00000001023d2005 CR4: 0000000000770ee0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  PKRU: 55555554
  Call Trace:
   <TASK>
   prep_new_page+0x151/0x170
   get_page_from_freelist+0xca0/0xe20
   ? sysvec_apic_timer_interrupt+0xab/0xc0
   ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
   __alloc_pages+0x17e/0x340
   __folio_alloc+0x17/0x40
   vma_alloc_folio+0x84/0x280
   __handle_mm_fault+0x8d4/0xeb0
   handle_mm_fault+0xd5/0x2a0
   do_user_addr_fault+0x1d0/0x680
   ? kvm_read_and_reset_apf_flags+0x3b/0x50
   exc_page_fault+0x78/0x170
   asm_exc_page_fault+0x27/0x30

Fixes: 847ce401df392 ("HWPOISON: Add unpoisoning support")
Fixes: 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned")
Cc: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
Cc: David Hildenbrand <david@xxxxxxxxxx>
Signed-off-by: zhenwei pi <pizhenwei@xxxxxxxxxxxxx>
---
...
...
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index b85661cbdc4a..385b5e99bfc1 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -69,6 +69,8 @@ int sysctl_memory_failure_recovery __read_mostly = 1;
atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0); +static bool hw_memory_failure;
+
  static bool __page_handle_poison(struct page *page)
  {
  	int ret;
@@ -1768,6 +1770,9 @@ int memory_failure(unsigned long pfn, int flags)
mutex_lock(&mf_mutex); + if (!(flags & MF_SW_SIMULATED))
+		hw_memory_failure = true;
+
  	p = pfn_to_online_page(pfn);
  	if (!p) {
  		res = arch_memory_failure(pfn, flags);
@@ -2103,6 +2108,13 @@ int unpoison_memory(unsigned long pfn)
mutex_lock(&mf_mutex); + if (hw_memory_failure) {
+		unpoison_pr_info("Unpoison: Disabled after HW memory failure %#lx\n",
+				 pfn, &unpoison_rs);
+		ret = -EOPNOTSUPP;
+		goto unlock_mutex;
+	}

If we disable this, I would move this at the beginning of the function.
We do not really care whether the pfn is valid or getting the head of
the page, etc.
So, unless I'm missing something, this should be enough?


Hi,

Because memory_failure() may be called by hardware error randomly, hw_memory_failure should be protected by mf_mutex to avoid this case:
int unpoison_memory(unsigned long pfn)
{
    ...
    if (hw_memory_failure) {
    }
    ... --> memory_failure() happens, and mark hw_memory_failure as true
    mutex_lock(&mf_mutex);
}

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 385b5e99bfc1..ece15f07dee7 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2100,6 +2100,12 @@ int unpoison_memory(unsigned long pfn)
         static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
                                         DEFAULT_RATELIMIT_BURST);
+ if (hw_memory_failure) {
+                unpoison_pr_info("Unpoison: Disabled after HW memory failure\n",
+                                 &unpoison_rs);
+               ret -EOPNOTSUPP;
+        }
+
         if (!pfn_valid(pfn))
                 return -ENXIO;

--
zhenwei pi




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux