+ mm-page_alloc-check-for-max-order-in-hot-path.patch added to -mm tree

The patch titled
     Subject: mm, page_alloc: check for max order in hot path
has been added to the -mm tree.  Its filename is
     mm-page_alloc-check-for-max-order-in-hot-path.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-check-for-max-order-in-hot-path.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-check-for-max-order-in-hot-path.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Michal Hocko <mhocko@xxxxxxxx>
Subject: mm, page_alloc: check for max order in hot path

Konstantin has noticed that kvmalloc might trigger the following warning:

[Thu Nov  1 08:43:56 2018] WARNING: CPU: 0 PID: 6676 at mm/vmstat.c:986 __fragmentation_index+0x54/0x60
[...]
[Thu Nov  1 08:43:56 2018] Call Trace:
[Thu Nov  1 08:43:56 2018]  fragmentation_index+0x76/0x90
[Thu Nov  1 08:43:56 2018]  compaction_suitable+0x4f/0xf0
[Thu Nov  1 08:43:56 2018]  shrink_node+0x295/0x310
[Thu Nov  1 08:43:56 2018]  node_reclaim+0x205/0x250
[Thu Nov  1 08:43:56 2018]  get_page_from_freelist+0x649/0xad0
[Thu Nov  1 08:43:56 2018]  ? get_page_from_freelist+0x2d4/0xad0
[Thu Nov  1 08:43:56 2018]  ? release_sock+0x19/0x90
[Thu Nov  1 08:43:56 2018]  ? do_ipv6_setsockopt.isra.5+0x10da/0x1290
[Thu Nov  1 08:43:56 2018]  __alloc_pages_nodemask+0x12a/0x2a0
[Thu Nov  1 08:43:56 2018]  kmalloc_large_node+0x47/0x90
[Thu Nov  1 08:43:56 2018]  __kmalloc_node+0x22b/0x2e0
[Thu Nov  1 08:43:56 2018]  kvmalloc_node+0x3e/0x70
[Thu Nov  1 08:43:56 2018]  xt_alloc_table_info+0x3a/0x80 [x_tables]
[Thu Nov  1 08:43:56 2018]  do_ip6t_set_ctl+0xcd/0x1c0 [ip6_tables]
[Thu Nov  1 08:43:56 2018]  nf_setsockopt+0x44/0x60
[Thu Nov  1 08:43:56 2018]  SyS_setsockopt+0x6f/0xc0
[Thu Nov  1 08:43:56 2018]  do_syscall_64+0x67/0x120
[Thu Nov  1 08:43:56 2018]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

The problem is that we only check for an out-of-bounds order in the slow
path, while node reclaim might already happen from the fast path.  This
is fixable by making sure that kvmalloc never uses kmalloc for requests
larger than KMALLOC_MAX_SIZE, but it also shows that the code is rather
fragile.  A recent UBSAN report underlines that:

 UBSAN: Undefined behaviour in mm/page_alloc.c:3117:19
 shift exponent 51 is too large for 32-bit type 'int'
 CPU: 0 PID: 6520 Comm: syz-executor1 Not tainted 4.19.0-rc2 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xd2/0x148 lib/dump_stack.c:113
  ubsan_epilogue+0x12/0x94 lib/ubsan.c:159
  __ubsan_handle_shift_out_of_bounds+0x2b6/0x30b lib/ubsan.c:425
  __zone_watermark_ok+0x2c7/0x400 mm/page_alloc.c:3117
  zone_watermark_fast mm/page_alloc.c:3216 [inline]
  get_page_from_freelist+0xc49/0x44c0 mm/page_alloc.c:3300
  __alloc_pages_nodemask+0x21e/0x640 mm/page_alloc.c:4370
  alloc_pages_current+0xcc/0x210 mm/mempolicy.c:2093
  alloc_pages include/linux/gfp.h:509 [inline]
  __get_free_pages+0x12/0x60 mm/page_alloc.c:4414
  dma_mem_alloc+0x36/0x50 arch/x86/include/asm/floppy.h:156
  raw_cmd_copyin drivers/block/floppy.c:3159 [inline]
  raw_cmd_ioctl drivers/block/floppy.c:3206 [inline]
  fd_locked_ioctl+0xa00/0x2c10 drivers/block/floppy.c:3544
  fd_ioctl+0x40/0x60 drivers/block/floppy.c:3571
  __blkdev_driver_ioctl block/ioctl.c:303 [inline]
  blkdev_ioctl+0xb3c/0x1a30 block/ioctl.c:601
  block_ioctl+0x105/0x150 fs/block_dev.c:1883
  vfs_ioctl fs/ioctl.c:46 [inline]
  do_vfs_ioctl+0x1c0/0x1150 fs/ioctl.c:687
  ksys_ioctl+0x9e/0xb0 fs/ioctl.c:702
  __do_sys_ioctl fs/ioctl.c:709 [inline]
  __se_sys_ioctl fs/ioctl.c:707 [inline]
  __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:707
  do_syscall_64+0xc4/0x510 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Note that this is not a kvmalloc path.  It is just that the fast path
really depends on having a sanitized order as well.  Therefore move the
order check to the fast path.
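To see why the early check matters, here is a minimal user-space sketch of the failure mode. The function names, the return convention, and the `MAX_ORDER` value (11 is a common default, but it is configurable) are illustrative, not the kernel's actual code: the point is that expressions like `1 << order` inside the watermark fast path are undefined behaviour once `order` reaches the bit width of the type (UBSAN's "shift exponent 51 is too large for 32-bit type 'int'"), so the bound must be enforced before any such computation runs.

```c
#include <stdbool.h>

/* Assumed value for this sketch; the real MAX_ORDER is a kernel
 * configuration constant. */
#define MAX_ORDER 11

/* Mirrors the patch's early bail-out: reject out-of-bounds orders
 * before any code that shifts by `order` can execute. */
static bool order_ok(unsigned int order)
{
	return order < MAX_ORDER;
}

/* Hypothetical stand-in for a fast-path computation such as the one
 * in __zone_watermark_ok().  Without the bound check, order = 51
 * would make the shift undefined behaviour. */
static long pages_for_order(unsigned int order)
{
	if (!order_ok(order))
		return -1;	/* caller sees allocation failure (NULL) */
	return 1L << order;	/* safe: order < MAX_ORDER <= type width */
}
```

With the check hoisted into the common entry point, both the node-reclaim path and the watermark check only ever see a sanitized order, which is exactly what the hunk added to `__alloc_pages_nodemask()` below guarantees.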

Link: http://lkml.kernel.org/r/20181113094305.GM15120@xxxxxxxxxxxxxx
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
Reported-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
Reported-by: Kyungtae Kim <kt0755@xxxxxxxxx>
Acked-by: Vlastimil Babka <vbabka@xxxxxxx>
Cc: Balbir Singh <bsingharora@xxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
Cc: Pavel Tatashin <pavel.tatashin@xxxxxxxxxxxxx>
Cc: Oscar Salvador <osalvador@xxxxxxx>
Cc: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>
Cc: Aaron Lu <aaron.lu@xxxxxxxxx>
Cc: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
Cc: Byoungyoung Lee <lifeasageek@xxxxxxxxx>
Cc: "Dae R. Jeong" <threeearcat@xxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 mm/page_alloc.c |   20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

--- a/mm/page_alloc.c~mm-page_alloc-check-for-max-order-in-hot-path
+++ a/mm/page_alloc.c
@@ -4061,17 +4061,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, u
 	int reserve_flags;
 
 	/*
-	 * In the slowpath, we sanity check order to avoid ever trying to
-	 * reclaim >= MAX_ORDER areas which will never succeed. Callers may
-	 * be using allocators in order of preference for an area that is
-	 * too large.
-	 */
-	if (order >= MAX_ORDER) {
-		WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
-		return NULL;
-	}
-
-	/*
 	 * We also sanity check to catch abuse of atomic reserves being used by
 	 * callers that are not in atomic context.
 	 */
@@ -4364,6 +4353,15 @@ __alloc_pages_nodemask(gfp_t gfp_mask, u
 	gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
 	struct alloc_context ac = { };
 
+	/*
+	 * There are several places where we assume that the order value is sane
+	 * so bail out early if the request is out of bound.
+	 */
+	if (unlikely(order >= MAX_ORDER)) {
+		WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
+		return NULL;
+	}
+
 	gfp_mask &= gfp_allowed_mask;
 	alloc_mask = gfp_mask;
 	if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags))
_

Patches currently in -mm which might be from mhocko@xxxxxxxx are

mm-memory_hotplug-check-zone_movable-in-has_unmovable_pages.patch
mm-page_alloc-check-for-max-order-in-hot-path.patch
mm-print-more-information-about-mapping-in-__dump_page.patch
mm-lower-the-printk-loglevel-for-__dump_page-messages.patch
mm-memory_hotplug-drop-pointless-block-alignment-checks-from-__offline_pages.patch
mm-memory_hotplug-print-reason-for-the-offlining-failure.patch
mm-memory_hotplug-be-more-verbose-for-memory-offline-failures.patch
mm-memory_hotplug-do-not-clear-numa_node-association-after-hot_remove.patch



