Re: [failures] mm-vmalloc-print-a-warning-message-first-on-failure.patch removed from -mm tree

On Fri, May 14, 2021 at 11:19:20AM +0100, Mel Gorman wrote:
> On Thu, May 13, 2021 at 10:18:51PM +0200, Uladzislau Rezki wrote:
> > <SNIP>
> >
> > From the trace i get:
> > 
> > <snip>
> > [    0.243916] RIP: 0010:__alloc_pages+0x11e/0x310
> > [    0.243916] Code: 84 c0 0f 85 02 01 00 00 89 d8 48 8b 54 24 08 8b 74 24 1c c1 e8 0c 83 e0 01 88 44 24 20 48 8b 04 24 48 85 d2 0f 85 71 01 00 00 <3b> 70 08 0f 82 68 01 00 00 48 89 44 24 10 48 8b 00 89 da 81 e2 00
> > [    0.243916] RSP: 0000:ffffffffae803c38 EFLAGS: 00010246
> > [    0.243916] RAX: 0000000000001cc0 RBX: 0000000000002102 RCX: 0000000000000004
> > [    0.243916] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000002102
> > [    0.243916] RBP: 0000000000000000 R08: 0000000000000000 R09: c0000000ffffdfff
> > [    0.243916] R10: 0000000000000001 R11: ffffffffae803ac0 R12: 0000000000000000
> > [    0.243916] R13: 0000000000002102 R14: 0000000000000001 R15: ffffa0938000d000
> > [    0.243916] FS:  0000000000000000(0000) GS:ffff893ab7c00000(0000) knlGS:0000000000000000
> > [    0.243916] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    0.243916] CR2: 0000000000001cc8 CR3: 0000000176e10000 CR4: 00000000000006b0
> > [    0.243916] Call Trace:
> > [    0.243916]  __alloc_pages_bulk+0xaa1/0xb50
> > <snip>
> > 
> > (gdb) l *__alloc_pages+0x11e
> > 0xffffffff8129d87e is in __alloc_pages (./include/linux/mmzone.h:1095).
> > 1090            return zoneref->zone;
> > 1091    }
> > 1092
> > 1093    static inline int zonelist_zone_idx(struct zoneref *zoneref)
> > 1094    {
> > 1095            return zoneref->zone_idx;
> > 1096    }
> > 1097
> > 1098    static inline int zonelist_node_idx(struct zoneref *zoneref)
> > 1099    {
> > (gdb)
> > 
> > Seems like "zoneref" refers to an invalid address.
> > 
> > Thoughts?
> 
> I have not previously read the patch but there are a few concerns and it's
> probably just as well this blew up early. The bulk allocator assumes a
> valid node but the patch can send in NUMA_NO_NODE (-1). 
>
Should the bulk allocator handle NUMA_NO_NODE on its own? I mean, instead
of the caller handling it, the allocator itself could fix it up when
NUMA_NO_NODE is passed.
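
If the allocator absorbed it, a minimal sketch (not a tested patch; it
mirrors what alloc_pages_node() effectively does on the single-page path)
could look like:

<snip>
/*
 * Sketch only: resolve NUMA_NO_NODE inside the bulk allocator itself,
 * so that callers do not have to special-case it.
 */
unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
			nodemask_t *nodemask, int nr_pages,
			struct list_head *page_list,
			struct page **page_array)
{
	if (preferred_nid == NUMA_NO_NODE)
		preferred_nid = numa_mem_id();

	/* ... existing zonelist setup and bulk allocation loop ... */
}
<snip>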

>
> On the high-order path alloc_pages_node is used which checks nid == NUMA_NO_NODE.
> Also, area->pages is not necessarily initialised so that could be interpreted
> as a partially populated array so minimally you need:
>
area->pages is already zeroed, because __GFP_ZERO is used when allocating
the array.
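
For reference, the array itself is set up in __vmalloc_area_node() with a
nested gfp mask that has __GFP_ZERO set, roughly like this (a sketch from
memory, details may differ):

<snip>
/* Roughly how __vmalloc_area_node() allocates area->pages (sketch). */
const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;
unsigned long array_size = (unsigned long)nr_small_pages * sizeof(struct page *);

if (array_size > PAGE_SIZE)
	area->pages = __vmalloc_node(array_size, 1, nested_gfp, node,
			area->caller);
else
	area->pages = kmalloc_node(array_size, nested_gfp, node);

/* area->pages is therefore zero-filled before any pages go into it. */
<snip>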

> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index f3c5dd4ccc5b9..b638ff31b07e1 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2792,6 +2792,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
>  	page_order = vm_area_page_order(area);
>  
>  	if (!page_order) {
> +		memset(area->pages, 0, array_size);
This memset is not needed as explained above.

> +		if (node == NUMA_NO_NODE)
> +			node = numa_mem_id();
This patch works. Again, should this be handled by the allocator itself?

>  		area->nr_pages = __alloc_pages_bulk(gfp_mask, node,
>  			NULL, nr_small_pages, NULL, area->pages);
>  	} else {
> 
> However, the high-order path also looks suspicious. area->nr_pages is
> advanced before the allocation attempt so in the event alloc_pages_node()
> returns NULL prematurely, area->nr_pages does not reflect the number of
> pages allocated so that needs examination.
> 
<snip>
    for (area->nr_pages = 0; area->nr_pages < nr_small_pages;
        area->nr_pages += 1U << page_order) {
<snip>

If alloc_pages_node() fails, we break out of the loop. area->nr_pages is
initialized inside the for(...) statement, so it will be zero if the
single-page allocator fails on the first iteration.
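
To illustrate, the high-order path looks roughly like this (a sketch;
freeing of already-allocated pages on the failure path is elided):

<snip>
for (area->nr_pages = 0; area->nr_pages < nr_small_pages;
		area->nr_pages += 1U << page_order) {
	struct page *page;
	int i;

	page = alloc_pages_node(node, gfp_mask, page_order);
	if (unlikely(!page))
		break;	/* the for() increment does not run on failure */

	for (i = 0; i < (1U << page_order); i++)
		area->pages[area->nr_pages + i] = page + i;
}

/* area->nr_pages equals the number of pages actually placed. */
<snip>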

Or do I miss your point?

> As an aside, where or what is test_vmalloc.sh? It appears to have been
> used a few times but it's not clear it's representative so are you aware
> of workloads that are vmalloc-intensive? It does not matter for the
> patch as such but it would be nice to know examples of vmalloc-intensive
> workloads because I cannot recall a time during the last few years where
> I saw vmalloc.c high in profiles.
> 
test_vmalloc.sh is a shell script that is used for stressing and testing
the vmalloc subsystem, as well as for performance evaluation. You can find
it here:

./tools/testing/selftests/vm/test_vmalloc.sh

As for workloads: mostly those that are time- and latency-critical, for
example audio/video, especially in the mobile area. I did a big rework of
the KVA allocator because I found it suboptimal in terms of allocation time.

To be more specific, high-resolution audio playback was suffering from
glitches due to long allocation times and preemption issues.

--
Vlad Rezki


