Re: [PATCH v4 bpf 0/4] vmalloc: bpf: introduce VM_ALLOW_HUGE_VMAP

Excerpts from Linus Torvalds's message of April 22, 2022 1:44 am:
> On Thu, Apr 21, 2022 at 1:57 AM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
>>
>> Those were (AFAIKS) all in arch code though.
> 
> No Nick, they really weren't.
> 
> The bpf issue with VM_FLUSH_RESET_PERMS means that all your arguments
> are invalid, because this affected non-architecture code.

VM_FLUSH_RESET_PERMS was a problem because bpf uses the arch module
allocation code, and the arch-specific direct map manipulation code
was not capable of dealing with huge pages.
An x86 bug.

> So the bpf case had two independent issues: one was just bpf doing a
> really bad job at making sure the executable mapping was sanely
> initialized.
> 
> But the other was an actual bug in that hugepage case for vmalloc.
> 
> And that bug was an issue on power too.

I missed it, which bug was that?

> 
> So your "this is purely an x86 issue" argument is simply wrong.
> Because I'm very much looking at that power code that says "oh,
> __module_alloc() needs more work".
> 
> Notice?

No I don't notice. More work to support huge allocations for
executable mappings, sure. But the arch's implementation explicitly
does not support that yet. That doesn't make huge vmalloc broken!
Ridiculous. It works fine.

> 
> Can these be fixed? Yes. But they can't be fixed by saying "oh, let's
> disable it on x86".

You did just effectively disable it on x86 though.

And why can't it be reverted on x86 until it's fixed on x86??

> Although it's probably true that at that point, some of the issues
> would no longer be nearly as noticeable.

There really aren't all these "issues" you're imagining. They
aren't noticeable now, on power or s390, because they have
non-buggy HAVE_ARCH_HUGE_VMALLOC implementations.

If you're really going to insist on this will you apply this to fix 
(some of) the performance regressions it introduced?

Thanks,
Nick

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6e5b4488a0c5..b555f17e84d5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8919,7 +8919,10 @@ void *__init alloc_large_system_hash(const char *tablename,
 				table = memblock_alloc_raw(size,
 							   SMP_CACHE_BYTES);
 		} else if (get_order(size) >= MAX_ORDER || hashdist) {
-			table = __vmalloc(size, gfp_flags);
+			if (IS_ENABLED(CONFIG_PPC) || IS_ENABLED(CONFIG_S390))
+				table = vmalloc_huge(size, gfp_flags);
+			else
+				table = __vmalloc(size, gfp_flags);
 			virt = true;
 			if (table)
 				huge = is_vm_area_hugepages(table);
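
For readers skimming the hunk above, the branch it changes can be modelled in plain userspace C. This is only a sketch of the selection logic, not kernel code: the enum, the two helper functions and their boolean parameters are hypothetical stand-ins for IS_ENABLED(CONFIG_PPC), IS_ENABLED(CONFIG_S390), get_order(size), MAX_ORDER and hashdist, and nothing is actually allocated.

```c
#include <stdbool.h>

/* Hypothetical labels for the two calls in the patched branch. */
enum vmalloc_variant {
	VARIANT_PLAIN,	/* stands in for __vmalloc(size, gfp_flags) */
	VARIANT_HUGE	/* stands in for vmalloc_huge(size, gfp_flags) */
};

/*
 * Models the "else if" condition from the hunk: the vmalloc path is
 * taken when the request is too large for the page allocator or when
 * hashdist is set. All three parameters are assumptions standing in
 * for get_order(size), MAX_ORDER and hashdist.
 */
static bool takes_vmalloc_path(unsigned int order, unsigned int max_order,
			       bool hashdist)
{
	return order >= max_order || hashdist;
}

/*
 * Models the new inner test: only the architectures named in the patch
 * (power and s390) get the huge variant; everything else keeps the
 * plain one. The flags stand in for the IS_ENABLED() checks.
 */
static enum vmalloc_variant pick_variant(bool is_ppc, bool is_s390)
{
	return (is_ppc || is_s390) ? VARIANT_HUGE : VARIANT_PLAIN;
}
```

Under these assumptions, pick_variant(false, false) models an x86 build and yields the plain variant, which is the behaviour the patch leaves in place everywhere except power and s390.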



