Re: [PATCH 1/3] perf/core: Make sure the ring-buffer is mapped in all page-tables

Andy Lutomirski <luto@xxxxxxxxxx> · Fri, 20 Jul 2018 12:32:10 -0700

On Fri, Jul 20, 2018 at 10:48 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:
> On Fri, Jul 20, 2018 at 10:06:54AM -0700, Andy Lutomirski wrote:
>> > On Jul 20, 2018, at 6:22 AM, Joerg Roedel <joro@xxxxxxxxxx> wrote:
>> >
>> > From: Joerg Roedel <jroedel@xxxxxxx>
>> >
>> > The ring-buffer is accessed in the NMI handler, so we had better
>> > avoid faulting on it. Sync the vmalloc range with all
>> > page-tables in the system to make sure everyone has it mapped.
>> >
>> > This fixes a WARN_ON_ONCE() that can be triggered with PTI
>> > enabled on x86-32:
>> >
>> >    WARNING: CPU: 4 PID: 0 at arch/x86/mm/fault.c:320 vmalloc_fault+0x220/0x230
>> >
>> > This triggers because with PTI enabled on a PAE kernel the
>> > PMDs are no longer shared between the page-tables, so the
>> > vmalloc changes do not propagate automatically.
>>
>> It seems like it would be much more robust to fix the vmalloc_fault()
>> code instead.
>
> The question is whether the NMI path is nesting-safe. If it is, we can
> remove the WARN_ON_ONCE(in_nmi()) in the vmalloc_fault path. It should
> be nesting-safe on x86-32 because of the way the stack-switch happens
> there. If it is also nesting-safe on x86-64, the warning there can be
> removed.
>
> Or did you think of something else to fix there?

I'm just reading your changelog, and you said the PMDs are no longer
shared between the page tables.  So this presumably means that
vmalloc_fault() no longer actually works correctly on PTI systems.  I
didn't read the code to figure out *why* it doesn't work, but throwing
random vmalloc_sync_all() calls around is wrong.

Or maybe the bug really just is the warning.  The warning can probably go.
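
For anyone following along, the lazy-versus-eager distinction above can be
sketched with a toy user-space model. This is not kernel code: the names
toy_vmalloc_map(), toy_sync_all(), toy_vmalloc_fault() and toy_access(),
and the sizes NR_PMD and NR_PGD, are made up for illustration, and the real
vmalloc_fault() and vmalloc_sync_all() walk actual page-table levels. The
model only assumes what the changelog states: each page table has its own,
unshared PMD entries, so a mapping installed in the reference table is
missing everywhere else until something copies it.

```c
#include <stdbool.h>

#define NR_PMD 8   /* hypothetical: entries per toy page table */
#define NR_PGD 3   /* hypothetical: number of per-process page tables */

/* Reference table, standing in for the kernel's master mappings. */
unsigned long ref_pmd[NR_PMD];

/* Per-process tables; with unshared PMDs these can be stale. */
unsigned long pgd_pmd[NR_PGD][NR_PMD];

/* Toy vmalloc: install a mapping in the reference table only. */
void toy_vmalloc_map(int idx, unsigned long val)
{
	ref_pmd[idx] = val;
}

/* Eager approach: copy every reference entry into every table. */
void toy_sync_all(void)
{
	for (int p = 0; p < NR_PGD; p++)
		for (int i = 0; i < NR_PMD; i++)
			pgd_pmd[p][i] = ref_pmd[i];
}

/*
 * Lazy approach: on a miss, copy the single missing entry from the
 * reference table. Returns false if the reference table has no
 * mapping either, i.e. the access is a genuine fault.
 */
bool toy_vmalloc_fault(int pgd, int idx)
{
	if (ref_pmd[idx] == 0)
		return false;
	pgd_pmd[pgd][idx] = ref_pmd[idx];
	return true;
}

/*
 * Access an entry through one table. Returns 0 if it was already
 * mapped, 1 if one lazy fault was needed to fill it in, and -1 if
 * the address is not mapped anywhere.
 */
int toy_access(int pgd, int idx)
{
	if (pgd_pmd[pgd][idx] != 0)
		return 0;
	return toy_vmalloc_fault(pgd, idx) ? 1 : -1;
}
```

In this model, an access right after toy_vmalloc_map() costs one lazy fault
per table, while calling toy_sync_all() first makes every later access
fault-free. The open question in the thread is the step the model leaves
out: whether taking that lazy fault is acceptable when the access happens
in NMI context.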

>
>
> Thanks,
>
>         Joerg
>