On Tue, 2019-07-30 at 12:57 -0700, Andrew Morton wrote:
> On Sat, 27 Jul 2019 14:23:33 +0100 Catalin Marinas <catalin.marinas@xxxxxxx>
> wrote:
>
> > Add mempool allocations for struct kmemleak_object and
> > kmemleak_scan_area as slightly more resilient than kmem_cache_alloc()
> > under memory pressure. Additionally, mask out all the gfp flags passed
> > to kmemleak other than GFP_KERNEL|GFP_ATOMIC.
> >
> > A boot-time tuning parameter (kmemleak.mempool) is added to allow a
> > different minimum pool size (defaulting to NR_CPUS * 4).
>
> Why would anyone ever want to alter this? Is there some particular
> misbehaviour which this will improve? If so, what is it?

So that it can tolerate different systems and workloads. For example, some
machines have slow disks and fast CPUs. When they are under memory pressure,
it can take a long time to swap before the OOM killer kicks in to free up
some memory. As a result, kmemleak needs a large mempool on such machines,
or it suffers a higher chance of a metadata allocation failure.

>
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -2011,6 +2011,12 @@
> >  			Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y,
> >  			the default is off.
> >
> > +	kmemleak.mempool=
> > +			[KNL] Boot-time tuning of the minimum kmemleak
> > +			metadata pool size.
> > +			Format: <int>
> > +			Default: NR_CPUS * 4
> > +

Catalin, BTW, right now it is unable to handle a large size.
I tried to reserve 64M (kmemleak.mempool=67108864):

[ 0.039254][ T0] WARNING: CPU: 0 PID: 0 at mm/page_alloc.c:4707 __alloc_pages_nodemask+0x3b8/0x1780
[ 0.039284][ T0] Modules linked in:
[ 0.039309][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc2-next-20190730+ #3
[ 0.039328][ T0] NIP: c000000000395038 LR: c0000000003d9320 CTR: 0000000000000000
[ 0.039355][ T0] REGS: c00000000170f710 TRAP: 0700 Not tainted (5.3.0-rc2-next-20190730+)
[ 0.039384][ T0] MSR: 9000000002029033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 24000884 XER: 20040000
[ 0.039431][ T0] CFAR: c000000000394cd4 IRQMASK: 0
[ 0.039431][ T0] GPR00: c0000000003d9320 c00000000170f9a0 c000000001708c00 0000000000040cc0
[ 0.039431][ T0] GPR04: 0000000000000010 0000000000000000 0000000000000000 c000000002aac080
[ 0.039431][ T0] GPR08: 0000001ffb3a0000 0000000000000000 c0000000003d9320 0000000000000000
[ 0.039431][ T0] GPR12: 0000000024000882 c000000002760000 0000000000000000 0000000000000000
[ 0.039431][ T0] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.039431][ T0] GPR20: 0000000000000000 0000000000000001 0000000010004d9c 00000000100053ed
[ 0.039431][ T0] GPR24: ffffffffffffffff ffffffffffffffff c0000000002e9544 0000000100000000
[ 0.039431][ T0] GPR28: 0000000000000cc0 0000000100000000 0000000000040cc0 c0000000027e8c48
[ 0.039646][ T0] NIP [c000000000395038] __alloc_pages_nodemask+0x3b8/0x1780
[ 0.039693][ T0] LR [c0000000003d9320] kmalloc_large_node+0x100/0x1a0
[ 0.039727][ T0] Call Trace:
[ 0.039749][ T0] [c00000000170f9a0] [0000000000000001] 0x1 (unreliable)
[ 0.039776][ T0] [c00000000170fbe0] [0000000000000000] 0x0
[ 0.039795][ T0] [c00000000170fc80] [c0000000003e5080] __kmalloc_node+0x520/0x890
[ 0.039816][ T0] [c00000000170fd20] [c0000000002e9544] mempool_init_node+0xb4/0x1e0
[ 0.039836][ T0] [c00000000170fd80] [c0000000002e975c] mempool_create_node+0xcc/0x150
[ 0.039857][ T0] [c00000000170fdf0] [c000000000b2a730] kmemleak_init+0x16c/0x54c
[ 0.039878][ T0] [c00000000170fef0] [c000000000ae460c] start_kernel+0x69c/0x7cc
[ 0.039908][ T0] [c00000000170ff90] [c00000000000a7d4] start_here_common+0x1c/0x434
[ 0.039945][ T0] Instruction dump:
[ 0.039976][ T0] 4bffff14 e92d0968 39291020 3bc00001 f9210148 4bfffd98 7d435378 4bf94eed
[ 0.040012][ T0] 60000000 4bfffdfc 70692000 4082ffd0 <0fe00000> 3bc00000 4bfffedc 39200000
[ 0.040049][ T0] ---[ end trace 038320b411324ff7 ]---
[ 0.040100][ T0] kmemleak: Kernel memory leak detector disabled
[ 16.192449][ T1] BUG: Unable to handle kernel data access at 0xffffffffffffb2aa
[ 16.192473][ T1] Faulting instruction address: 0xc000000000b2a2fc
[ 16.192500][ T1] Oops: Kernel access of bad area, sig: 11 [#1]
[ 16.192526][ T1] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=256 DEBUG_PAGEALLOC NUMA PowerNV
[ 16.192567][ T1] Modules linked in:
[ 16.192593][ T1] CPU: 4 PID: 1 Comm: swapper/0 Tainted: G W 5.3.0-rc2-next-20190730+ #3
[ 16.192646][ T1] NIP: c000000000b2a2fc LR: c0000000003e6e48 CTR: c0000000000b4380
[ 16.192698][ T1] REGS: c00000002aaef9d0 TRAP: 0380 Tainted: G W (5.3.0-rc2-next-20190730+)
[ 16.192750][ T1] MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 28002884 XER: 20040000
[ 16.192801][ T1] CFAR: c00000000043769c IRQMASK: 0
[ 16.192801][ T1] GPR00: c0000000003e6e48 c00000002aaefc60 c000000001708c00 0000000000000002
[ 16.192801][ T1] GPR04: c000000002c42648 0000000000000000 0000000000000000 ffffffff00001e77
[ 16.192801][ T1] GPR08: 0000000000000000 0000000000000001 0000000000000800 0000000000000000
[ 16.192801][ T1] GPR12: 0000000000002000 c000001fffffbc00 c0000000000103d8 0000000000000000
[ 16.192801][ T1] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 16.192801][ T1] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 16.192801][ T1] GPR24: 0000000000000000 c000000002aa9c80 c0000000018d0730 c0000000003c9270
[ 16.192801][ T1] GPR28: 000000000000b100 c00c00000000b100 c000000002c42648 c000000002aa9c80
[ 16.193126][ T1] NIP [c000000000b2a2fc] log_early+0x8/0x160
[ 16.193153][ T1] LR [c0000000003e6e48] kmem_cache_free+0x428/0x740
[ 16.193190][ T1] Call Trace:
[ 16.193213][ T1] [c00000002aaefc60] [0000000000000366] 0x366 (unreliable)
[ 16.193243][ T1] [c00000002aaefd00] [c0000000003c9270] __mpol_put+0x50/0x70
[ 16.193272][ T1] [c00000002aaefd20] [c0000000003c9488] do_set_mempolicy+0x108/0x170
[ 16.193314][ T1] [c00000002aaefdb0] [c000000000010434] kernel_init+0x64/0x150
[ 16.193363][ T1] [c00000002aaefe20] [c00000000000b1cc] ret_from_kernel_thread+0x5c/0x70
[ 16.193412][ T1] Instruction dump:
[ 16.193436][ T1] aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa
[ 16.193486][ T1] aaaaaaaa aaaaaaaa aaaaaaaa aaaaaaaa <aaaaaaaa> aaaaaaaa aaaaaaaa aaaaaaaa
[ 16.193556][ T1] ---[ end trace 038320b411324ff9 ]---
[ 16.587204][ T1]
[ 17.587316][ T1] Kernel panic - not syncing: Fatal exception