[PATCH RFC v1 00/11] hwpoison improvement part 1

Hi everyone,

I wrote hwpoison patches which partially address the problems
discussed recently in this area [1].

The main point of this series is how to isolate faulty pages more
safely and reliably. As Michal pointed out in thread [2], we can
have better isolation functions than the ones we currently have.
Patch 8/11 gives the implementation. As a result, the behavior of
poisoned pages (at least from soft-offline) is more predictable,
and I think that memory hotremove should work properly with it.

The structure of this series:
  - patches 1-7 are small fixes, preparation, and/or cleanup.
    I can separate these out from the main part if you like.
  - patch 8 is the core part of this series, providing some code
    to pick out the target page from the buddy allocator.
  - patches 9-11 are changes on the caller side (hard-offline,
    hotremove and unpoison).
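
To illustrate the idea behind patch 8, here is a toy userspace model of
picking one target page out of a buddy free list. It is only a sketch
under my own assumptions, not the code in the patch: struct toy_zone,
toy_isolate_free_page() and the other names are hypothetical, and the
"free list" is reduced to a small array. The point it shows is that the
free block containing the target is halved repeatedly, each half that
does not contain the target goes back to the free list, and only the
single target page ends up isolated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ORDER 4                          /* hypothetical toy value */
#define TOY_NR_PAGES  (1u << (TOY_MAX_ORDER - 1)) /* 8 pages in the toy zone */

/* Toy zone: free_order[pfn] >= 0 means a free block of that order starts at pfn. */
struct toy_zone {
	int  free_order[TOY_NR_PAGES];  /* -1: no free block starts here */
	bool poisoned[TOY_NR_PAGES];    /* stand-in for a per-page poison flag */
};

static void toy_zone_init(struct toy_zone *z)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		z->free_order[i] = -1;
		z->poisoned[i] = false;
	}
	/* Start with one free block covering the whole zone. */
	z->free_order[0] = TOY_MAX_ORDER - 1;
}

/* Count the pages that are still covered by some free block. */
static unsigned int toy_free_pages(const struct toy_zone *z)
{
	unsigned int n = 0;

	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		if (z->free_order[i] >= 0)
			n += 1u << z->free_order[i];
	return n;
}

/*
 * Remove page 'target' from the free blocks and mark it poisoned.
 * Returns false if the page is not part of any free block.
 */
static bool toy_isolate_free_page(struct toy_zone *z, unsigned int target)
{
	for (int order = 0; order < TOY_MAX_ORDER; order++) {
		/* Start of the order-aligned block that would contain target. */
		unsigned int head = target & ~((1u << order) - 1);

		if (z->free_order[head] != order)
			continue;

		/* Found the free block; take it off the "free list". */
		z->free_order[head] = -1;

		/* Halve it until only the target page is left. */
		while (order > 0) {
			order--;
			unsigned int half = 1u << order;

			if (target >= head + half) {
				z->free_order[head] = order;        /* lower half stays free */
				head += half;
			} else {
				z->free_order[head + half] = order; /* upper half stays free */
			}
		}
		assert(head == target);
		z->poisoned[target] = true;
		return true;
	}
	return false;
}
```

In this model, isolating page 5 from a fresh 8-page zone leaves free
blocks of order 2 at page 0, order 0 at page 4 and order 1 at page 6,
so 7 pages remain allocatable and page 5 can no longer be handed out.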

One big issue not addressed by this series is hard-offlining hugetlb,
which is still a todo unfortunately.

Another remaining task is to rework the behavior of the PG_hwpoison
flag in hard-offlining of in-use pages. Even with this series,
hard-offline for in-use pages works as in the past (i.e. we still take
the racy "set PG_hwpoison first, then do some handling" approach).
Without changing this, we can't get rid of the many "if (PageHWPoison)"
checks in mm code. So I'll think about it and try more after this one.

Anyway, this is the first step toward a better solution (I believe),
and any kind of help is appreciated.

Thanks,
Naoya Horiguchi

[1]: https://lwn.net/Articles/753261/
[2]: https://lkml.org/lkml/2018/7/17/60
---
Summary:

Naoya Horiguchi (11):
      mm: hwpoison: cleanup unused PageHuge() check
      mm: soft-offline: add missing error check of set_hwpoison_free_buddy_page()
      mm: move definition of num_poisoned_pages_inc/dec to include/linux/mm.h
      mm: madvise: call soft_offline_page() without MF_COUNT_INCREASED
      mm: hwpoison-inject: don't pin for hwpoison_filter()
      mm: hwpoison: remove MF_COUNT_INCREASED
      mm: remove flag argument from soft offline functions
      mm: soft-offline: isolate error pages from buddy freelist
      mm: hwpoison: apply buddy page handling code to hard-offline
      mm: clear PageHWPoison in memory hotremove
      mm: hwpoison: introduce clear_hwpoison_free_buddy_page()

 drivers/base/memory.c      |   2 +-
 include/linux/mm.h         |  22 ++++++---
 include/linux/page-flags.h |   8 +++-
 include/linux/swapops.h    |  16 -------
 mm/hwpoison-inject.c       |  18 ++------
 mm/madvise.c               |  25 +++++-----
 mm/memory-failure.c        | 112 ++++++++++++++++++++++++++-------------------
 mm/migrate.c               |   9 ----
 mm/page_alloc.c            |  95 +++++++++++++++++++++++++++++++++++---
 mm/sparse.c                |   2 +-
 10 files changed, 193 insertions(+), 116 deletions(-)



