Re: [PATCH] mm/page_owner: fix a crash after memory offline

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10.10.19 20:32, Qian Cai wrote:
The linux-next series "mm/memory_hotplug: Shrink zones before removing
memory" [1] seems make a crash easier to reproduce while reading
/proc/pagetypeinfo after offlining a memory section. Fix it by using
pfn_to_online_page() in the PFN walker.

Can you please rephrase the subject+description to describe the actual problem and drop the reference to the series?

E.g., similar to my recent patches:

"mm/page_owner: Don't access uninitialized memmaps when reading /proc/pagetypeinfo

Uninitialized memmaps contain garbage and in the worst case trigger kernel BUGs, especially with CONFIG_PAGE_POISONING. They should not get
touched.

For example, when not onlining a memory block that is spanned by a zone and reading /proc/pagetypeinfo, we can trigger a kernel BUG: ...
"

However, you also have to justify why it is okay to no longer consider ZONE_DEVICE (I think walk_zones_in_node() will skip ZONE_DEVICE due to assert_populated == true and ZONE_DEVICE will never be populated, Therefore, we will never end in this code path with ZONE_DEVICE).



[1] https://lore.kernel.org/linux-mm/20191006085646.5768-1-david@xxxxxxxxxx/

page:ffffea0021200000 is uninitialized and poisoned
raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
raw: ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
There is not page extension available.
------------[ cut here ]------------
kernel BUG at include/linux/mm.h:1107!
RIP: 0010:pagetypeinfo_showmixedcount_print+0x4fb/0x680
Call Trace:
  walk_zones_in_node+0x3a/0xc0
  pagetypeinfo_show+0x260/0x2c0
  seq_read+0x27e/0x710
  proc_reg_read+0x12e/0x190
  __vfs_read+0x50/0xa0
  vfs_read+0xcb/0x1e0
  ksys_read+0xc6/0x160
  __x64_sys_read+0x43/0x50
  do_syscall_64+0xcc/0xaec
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: Qian Cai <cai@xxxxxx>
---
  mm/page_owner.c | 5 ++---
  1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index dee931184788..03a6b19b3cdd 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -296,11 +296,10 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
  		pageblock_mt = get_pageblock_migratetype(page);

What about the pfn_valid() in the outermost loop? You can skip over the whole pageblock if the first page is not online.

  		for (; pfn < block_end_pfn; pfn++) {
-			if (!pfn_valid_within(pfn))
+			page = pfn_to_online_page(pfn);
+			if (!page)
  				continue;
- page = pfn_to_page(pfn);
-
  			if (page_zone(page) != zone)
  				continue;


--

Thanks,

David / dhildenb




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux