Re: [PATCH 1/3] perf/core: Make sure the ring-buffer is mapped in all page-tables

Andy Lutomirski <luto@xxxxxxxxxx> · Fri, 20 Jul 2018 12:33:14 -0700

On Fri, Jul 20, 2018 at 12:27 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> On Fri, 20 Jul 2018, Andy Lutomirski wrote:
>> > On Jul 20, 2018, at 6:22 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:
>> >
>> > From: Joerg Roedel <jroedel@xxxxxxx>
>> >
>> > The ring-buffer is accessed in the NMI handler, so we better
>> > avoid faulting on it. Sync the vmalloc range with all
>> > page-tables in the system to make sure everyone has it mapped.
>> >
>> > This fixes a WARN_ON_ONCE() that can be triggered with PTI
>> > enabled on x86-32:
>> >
>> >    WARNING: CPU: 4 PID: 0 at arch/x86/mm/fault.c:320 vmalloc_fault+0x220/0x230
>> >
>> > This triggers because with PTI enabled on a PAE kernel the
>> > PMDs are no longer shared between the page-tables, so the
>> > vmalloc changes do not propagate automatically.
>>
>> It seems like it would be much more robust to fix the vmalloc_fault()
>> code instead.
>
> Right, but now the obvious fix for the issue at hand is this. We surely
> should revisit this.

If you commit this under this reasoning, then please at least make it say:

/*
 * XXX: The vmalloc_fault() code is buggy on PTI+PAE systems, and this
 * is a workaround.
 */

Let's not have code in the kernel that pretends to make sense but is
actually voodoo magic that works around bugs elsewhere.  It's no fun
to maintain down the road.
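
To make the distinction concrete, here is a rough user-space toy model of
the situation described above. It is not kernel code, and every name in it
(toy_pgd, toy_vmalloc_map(), toy_sync_all(), toy_fault_fixup(),
toy_access_ok(), and the TOY_* sizes) is made up for illustration. It
assumes a reference table plus per-process tables that share nothing, which
is the PTI+PAE case from the changelog. A new mapping lands only in the
reference table; an eager sync copies it everywhere, while a lazy fixup
repairs a single table at fault time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_SLOTS 4  /* hypothetical: top-level slots covering the vmalloc range */
#define TOY_NR_PGDS  3  /* hypothetical: number of per-process page tables */

/* Toy page table: one flag per slot, true when the mapping is present. */
struct toy_pgd {
	bool present[TOY_NR_SLOTS];
};

static struct toy_pgd toy_reference;           /* stands in for the reference table */
static struct toy_pgd toy_tasks[TOY_NR_PGDS];  /* per-process tables, nothing shared */

/* A vmalloc-style mapping only touches the reference table. */
void toy_vmalloc_map(int slot)
{
	assert(slot >= 0 && slot < TOY_NR_SLOTS);
	toy_reference.present[slot] = true;
}

/* Eager sync: copy every reference entry into every per-process table. */
void toy_sync_all(void)
{
	for (int i = 0; i < TOY_NR_PGDS; i++)
		memcpy(toy_tasks[i].present, toy_reference.present,
		       sizeof(toy_reference.present));
}

/*
 * Lazy fixup: what a fault handler would do for one table and one slot.
 * Returns false when the reference table has no mapping either, i.e. the
 * fault is genuine.
 */
bool toy_fault_fixup(int pgd, int slot)
{
	assert(pgd >= 0 && pgd < TOY_NR_PGDS);
	assert(slot >= 0 && slot < TOY_NR_SLOTS);
	if (!toy_reference.present[slot])
		return false;
	toy_tasks[pgd].present[slot] = true;
	return true;
}

/* True if an access through this table would succeed without faulting. */
bool toy_access_ok(int pgd, int slot)
{
	assert(pgd >= 0 && pgd < TOY_NR_PGDS);
	assert(slot >= 0 && slot < TOY_NR_SLOTS);
	return toy_tasks[pgd].present[slot];
}
```

In this model, calling toy_sync_all() right after toy_vmalloc_map() means a
context that cannot take a fault never sees a missing entry, which is all
the patch relies on. It says nothing about whether toy_fault_fixup() is
correct, which is why the comment should call the sync a workaround.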