On Fri, Sep 17, 2021 at 10:47:54AM +0200, David Hildenbrand wrote:
> > > > This patch looks like a KASAN-specific change to me. So I would
> > > > like to avoid such changes to the vmalloc internals in order to
> > > > simplify further maintenance and keep it generic instead.
> > >
> > > There is nothing KASAN specific in there :) It's specific to exact
> > > applications -- and KASAN may be one of the only users for now.
> > >
> > Well, you can name it either way :) It should not be specific by
> > design, otherwise, as I mentioned, the allocator would become a piece
> > of code that handles corner cases, which is actually not acceptable.
>
> Let's not overstress the situation of adding essentially 3 LOC in order
> to fix a sane use case of the vmalloc allocator that was not considered
> properly by internal changes due to 68ad4a330433 ("mm/vmalloc.c: keep
> track of free blocks for vmap allocation").
>
> > > > Currently find_vmap_lowest_match() adjusts the search size
> > > > criteria for any alignment value, even for PAGE_SIZE alignment.
> > > > That is not optimal, because one page (PAGE_SIZE) is the minimum
> > > > granularity we operate on for vmalloc allocations, thus all free
> > > > blocks are page aligned.
> > > >
> > > > All that means that we should adjust the search length only if an
> > > > alignment request is bigger than one page; in the case of one
> > > > page, which corresponds to PAGE_SIZE, there is no reason for such
> > > > an extra adjustment, because all free->va_start values sit on a
> > > > page boundary anyway.
> > > >
> > > > Could you please test the below patch, which is a generic
> > > > improvement?
> > >
> > > I played with the exact approach below (almost exactly the same
> > > code in find_vmap_lowest_match()), and it might make sense as a
> > > general improvement -- if we can guarantee that start+end of ranges
> > > are always PAGE-aligned; I was not able to come to that
> > > conclusion...
> > >
> > All free blocks are PAGE aligned; that is how it has to be. A vstart
> > also should be aligned, otherwise __vunmap() will complain about
> > freeing a bad address:
> >
> > <snip>
> >     if (WARN(!PAGE_ALIGNED(addr), "Trying to vfree() bad address (%p)\n",
> >             addr))
> >         return;
> > <snip>
> >
> > BTW, we should add an extra check to alloc_vmap_area(), something
> > like below:
> >
> > <snip>
> >     if (!PAGE_ALIGNED(ALIGN(vstart, align))) {
> >         WARN_ONCE(1, "vmalloc: vstart or align are not page aligned (0x%lx, 0x%lx)\n",
> >             vstart, align);
> >         return ERR_PTR(-EBUSY);
> >     }
> > <snip>
> >
> > to check that the passed pair of vstart and align is correct.
> >
>
> Yes, we better should.
>
> > > vmap_init_free_space() shows me that our existing alignment
> > > code/checks might be quite fragile.
> > >
> > It is not important to page align the first address. As I mentioned
> > before, vstart and align have to be correct, and we should add such a
> > check.
> >
> > >
> > > But I mainly decided to go with my patch instead because it will
> > > also work for exact allocations with align > PAGE_SIZE: most
> > > notably, when we try population of huge pages instead of base pages
> > > in __vmalloc_node_range(), by increasing the alignment. If the
> > > exact range allows for using huge pages, placing huge pages will
> > > work just fine with my modifications, while it won't with your
> > > approach.
> > >
> > > Long story short: my approach handles exact allocations also for
> > > bigger alignment. Your optimization makes sense as a general
> > > improvement for any vmalloc allocations.
> > >
> > > Thanks for having a look!
> > >
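For reference, the generic improvement I refer to above is to stop
padding the search length for page-sized alignment requests in
find_vmap_lowest_match(). A sketch of the idea only (the full patch is
not quoted in this mail), on top of the current code that
unconditionally sets the search length to size + align - 1:

<snip>
	/*
	 * Adjust the search size for alignment overhead only when the
	 * requested alignment is bigger than one page. All
	 * free->va_start values are page aligned, so an exact-size
	 * free block always satisfies align <= PAGE_SIZE.
	 */
	length = (align > PAGE_SIZE) ? size + align - 1 : size;
<snip>
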
> > The problem is that there are no users (it seems only KASAN) who set
> > the range so that it corresponds to the exact size. If you add an
> > alignment overhead on top of
>
> So there is one user and it was broken by 68ad4a330433 ("mm/vmalloc.c:
> keep track of free blocks for vmap allocation").
>
> > it, it means that the search size should include it, because we may
> > end up with an exact free block that no longer fits after the
> > alignment is applied. This is how allocators handle alignment.
>
> This is an implementation detail of the vmalloc allocator ever since
> 68ad4a330433 ("mm/vmalloc.c: keep track of free blocks for vmap
> allocation").
>
> If callers pass an exact range, and the alignment they specify applies,
> why should we reject such an allocation? It's leaking an implementation
> detail, easily fixed internally, into callers.
>
> > Another approach is (you mentioned it in your commit message):
> >
> > urezki@pc638:~/data/raid0/coding/linux-next.git$ git diff mm/kasan/shadow.c
> > diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
> > index 082ee5b6d9a1..01d3bd76c851 100644
> > --- a/mm/kasan/shadow.c
> > +++ b/mm/kasan/shadow.c
> > @@ -200,7 +200,7 @@ static int __meminit kasan_mem_notifier(struct notifier_block *nb,
> >  		if (shadow_mapped(shadow_start))
> >  			return NOTIFY_OK;
> >
> > -		ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
> > +		ret = __vmalloc_node_range(shadow_size, 1, shadow_start,
> >  					   shadow_end, GFP_KERNEL,
> >  					   PAGE_KERNEL, VM_NO_GUARD,
> >  					   pfn_to_nid(mem_data->start_pfn),
> > urezki@pc638:~/data/raid0/coding/linux-next.git$
> >
> > why not? Also you can increase the range in KASAN code.
>
> No, that's leaking implementation details to the caller. And no,
> increasing the range and eventually allocating something bigger (e.g.,
> placing a huge page where it might not have been possible) is not
> acceptable for KASAN.
>
> If you're terribly unhappy with this patch,
Sorry to say, but it simply does not make sense.

> please suggest something reasonable to fix exact allocations:
> a) Fixes the KASAN use case.
> b) Allows for automatic placement of huge pages for exact allocations.
> c) Doesn't leak implementation details into the caller.
>
I am looking at it.
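To be sure we mean the same thing by an "exact allocation": I take it
the request boils down to something like below (a sketch with made-up
variable names, not the actual KASAN code):

<snip>
	/*
	 * An exact request: the allowed range [start, start + size)
	 * leaves no slack. If the search length is padded to
	 * size + align - 1, a perfectly fitting, page-aligned free
	 * block at "start" is skipped and the allocation fails.
	 */
	ptr = __vmalloc_node_range(size, PAGE_SIZE, start, start + size,
				   GFP_KERNEL, PAGE_KERNEL, VM_NO_GUARD,
				   NUMA_NO_NODE, __builtin_return_address(0));
<snip>

--
Vlad Rezki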