On Mon, Jan 27, 2020 at 12:09 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Sun, Jan 26, 2020 at 11:16:02PM -0800, Nick Desaulniers wrote:
> > This helps avoid a potentially large stack allocation.
> >
> > When building with:
> > $ make CC=clang arch/x86/ CFLAGS=-Wframe-larger-than=1000
> > The following warning is observed:
> > arch/x86/kernel/kvm.c:494:13: warning: stack frame size of 1064 bytes in
> > function 'kvm_send_ipi_mask_allbutself' [-Wframe-larger-than=]
> > static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int
> > vector)
> >             ^
> > Debugging with:
> > https://github.com/ClangBuiltLinux/frame-larger-than
> > via:
> > $ python3 frame_larger_than.py arch/x86/kernel/kvm.o \
> >   kvm_send_ipi_mask_allbutself
> > points to the stack allocated `struct cpumask new_mask` in
> > `kvm_send_ipi_mask_allbutself`. The size of a `struct cpumask` is
> > potentially large, as it's CONFIG_NR_CPUS divided by BITS_PER_LONG for
> > the target architecture. CONFIG_NR_CPUS for X86_64 can be as high as
> > 8192, making a single instance of a `struct cpumask` 1024 B.
> >
> > Signed-off-by: Nick Desaulniers <nick.desaulniers@xxxxxxxxx>
> > ---
> >  arch/x86/kernel/kvm.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index 32ef1ee733b7..d41c0a0d62a2 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -494,13 +494,15 @@ static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
> >  static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
> >  {
> >  	unsigned int this_cpu = smp_processor_id();
> > -	struct cpumask new_mask;
>
> Right, on stack cpumask is definitely dodgy.
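
For illustration only, a userspace model of the size arithmetic from the
commit message (not kernel code). NR_CPUS_MODEL and struct cpumask_model are
hypothetical stand-ins for CONFIG_NR_CPUS and struct cpumask; the assumed
layout is a plain bitmap of unsigned longs with one bit per possible CPU.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_NR_CPUS at the X86_64 maximum. */
#define NR_CPUS_MODEL 8192

/* Bits in one unsigned long on the build target. */
#define BITS_PER_LONG_MODEL (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nbits bits, rounded up. */
#define BITS_TO_LONGS_MODEL(nbits) \
	(((nbits) + BITS_PER_LONG_MODEL - 1) / BITS_PER_LONG_MODEL)

/* Assumed layout: one bit per possible CPU, packed into unsigned longs. */
struct cpumask_model {
	unsigned long bits[BITS_TO_LONGS_MODEL(NR_CPUS_MODEL)];
};

/* Number of unsigned longs in one mask (NR_CPUS / BITS_PER_LONG). */
static inline size_t cpumask_model_longs(void)
{
	return BITS_TO_LONGS_MODEL(NR_CPUS_MODEL);
}

/* Size in bytes of one mask instance. */
static inline size_t cpumask_model_size(void)
{
	/* Sanity check: the mask must have room for every CPU bit. */
	assert(sizeof(struct cpumask_model) * CHAR_BIT >= NR_CPUS_MODEL);
	return sizeof(struct cpumask_model);
}
```

NR_CPUS divided by BITS_PER_LONG gives the number of longs; multiplied by
sizeof(long), that is 8192 / 8 = 1024 bytes for one mask, whatever the word
size. That accounts for nearly all of the 1064-byte frame in the warning.
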
>
> > +	struct cpumask *new_mask;
> >  	const struct cpumask *local_mask;
> >
> > -	cpumask_copy(&new_mask, mask);
> > -	cpumask_clear_cpu(this_cpu, &new_mask);
> > -	local_mask = &new_mask;
> > +	new_mask = kmalloc(sizeof(*new_mask), GFP_KERNEL);

Probably should check for allocation failure, d'oh!

> > +	cpumask_copy(new_mask, mask);
> > +	cpumask_clear_cpu(this_cpu, new_mask);
> > +	local_mask = new_mask;
> >  	__send_ipi_mask(local_mask, vector);
> > +	kfree(new_mask);
> >  }
>
> One alternative approach is adding the inverse of cpu_bit_bitmap. I'm
> not entirely sure how often we need the all-but-self mask, but ISTR
> there were other places too.
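
A rough userspace sketch of the copy/clear/send sequence with the missing
allocation check added. This is a model, not a proposed patch: malloc()/free()
stand in for kmalloc()/kfree(), and struct cpumask_model, the mask_*() helpers
and send_model() are hypothetical names invented here. The real function
returns void, so what it should do when the allocation fails is left open;
the model just reports an error to its caller.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPUS_MODEL 8192	/* hypothetical stand-in for CONFIG_NR_CPUS */
#define BITS_PER_LONG_MODEL (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS \
	((NR_CPUS_MODEL + BITS_PER_LONG_MODEL - 1) / BITS_PER_LONG_MODEL)

/* Assumed layout: one bit per possible CPU. */
struct cpumask_model {
	unsigned long bits[MASK_LONGS];
};

static inline void mask_set_cpu(unsigned int cpu, struct cpumask_model *m)
{
	assert(cpu < NR_CPUS_MODEL);
	m->bits[cpu / BITS_PER_LONG_MODEL] |= 1UL << (cpu % BITS_PER_LONG_MODEL);
}

static inline void mask_clear_cpu(unsigned int cpu, struct cpumask_model *m)
{
	assert(cpu < NR_CPUS_MODEL);
	m->bits[cpu / BITS_PER_LONG_MODEL] &= ~(1UL << (cpu % BITS_PER_LONG_MODEL));
}

static inline bool mask_test_cpu(unsigned int cpu, const struct cpumask_model *m)
{
	assert(cpu < NR_CPUS_MODEL);
	return (m->bits[cpu / BITS_PER_LONG_MODEL] >> (cpu % BITS_PER_LONG_MODEL)) & 1UL;
}

/* Hypothetical stand-in for the send step: counts the CPUs targeted. */
static inline unsigned int send_model(const struct cpumask_model *m)
{
	unsigned int cpu, count = 0;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (mask_test_cpu(cpu, m))
			count++;
	return count;
}

/*
 * Copy the mask to the heap, drop this_cpu, "send", then free the copy.
 * Returns 0 on success or -1 if the allocation fails; *sent is only
 * written on success. The caller's mask is never modified.
 */
static inline int send_allbutself_model(const struct cpumask_model *mask,
					unsigned int this_cpu,
					unsigned int *sent)
{
	struct cpumask_model *new_mask = malloc(sizeof(*new_mask));

	if (!new_mask)
		return -1;	/* the check missing from the patch above */

	memcpy(new_mask, mask, sizeof(*new_mask));
	mask_clear_cpu(this_cpu, new_mask);
	*sent = send_model(new_mask);
	free(new_mask);
	return 0;
}
```

With CPUs 0, 3 and 8191 set in the input and this_cpu == 3, the model targets
the other two CPUs and leaves the caller's mask as it was.
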