Re: SMP regression on MIPS 34Kc (Lantiq VRX220)

James Hogan <james.hogan@xxxxxxxxxx> · Wed, 13 Jul 2016 12:11:09 +0100

Hi Felix,

On Fri, Jun 10, 2016 at 08:54:44PM +0200, Felix Fietkau wrote:
> Hi,
> 
> commit cccf34e9411c41b0cbfb41980fe55fc8e7c98fd2
> MIPS: c-r4k: Fix cache flushing for MT cores
> 
> This commit breaks SMP on Lantiq VRX220. When r4k_on_each_cpu is called
> for a cache flush, I get the following oops very early when user space 
> starts:
> 
> [    1.689913] CPU 0 Unable to handle kernel paging request at virtual address fffd7000, epc == 80028fd4, ra == 80096240
> [    1.699095] Oops[#1]:
> [    1.701365] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.12 #2
> [    1.707280] task: 805278d8 ti: 80518000 task.ti: 80518000
> [    1.712663] $ 0   : 00000000 00000003 80530000 fffd8000
> [    1.717884] $ 4   : fffd7000 00000000 00000001 00000000
> [    1.723105] $ 8   : 1100ff00 1000001f 00000000 00000001
> [    1.728328] $12   : eac0c6e6 00000000 00000000 7748c2b0
> [    1.733551] $16   : 82e6bdb0 00000000 80029a14 fffd7000
> [    1.738773] $20   : 80499e0c 8051a140 80498424 80520000
> [    1.743995] $24   : 00000000 80028fc0                  
> [    1.749217] $28   : 80518000 80519d08 80520000 80096240
> [    1.754441] Hi    : 013b114e
> [    1.757316] Lo    : 6eff374b
> [    1.760209] epc   : 80028fd4 r4k_blast_dcache_page_dc32+0x14/0xb8
> [    1.766299] ra    : 80096240 generic_smp_call_function_single_interrupt+0x160/0x200
> [    1.773941] Status: 1100fd02 KERNEL EXL 
> [    1.777858] Cause : 00800008 (ExcCode 02)
> [    1.781862] BadVA : fffd7000
> [    1.784734] PrId  : 00019556 (MIPS 34Kc)
> [    1.788648] Modules linked in:
> [    1.791706] Process swapper/0 (pid: 0, threadinfo=80518000, task=805278d8, tls=00000000)
> [    1.799789] Stack : 00000001 8007d3f4 00000000 80520000 80528cc0 80530620 00000000 00000000
>          00000001 8000e4f0 00010000 80a922a4 00000001 80520000 00000001 80071ad4
>          80490000 80071c64 00000024 00000100 8051a040 8051a068 8049844c 80498460
>          8051a140 80530620 8052470c 80a922a4 00000001 80520000 00000001 80520000
>          80490000 80076214 804965b0 fffedc9a 80520000 00200000 00000000 00000001
>          ...
> [    1.835300] Call Trace:
> [    1.837757] [<80028fd4>] r4k_blast_dcache_page_dc32+0x14/0xb8
> [    1.843506] [<80096240>] generic_smp_call_function_single_interrupt+0x160/0x200
> [    1.850834] [<8000e4f0>] ipi_call_interrupt+0x10/0x20
> [    1.855874] [<80071ad4>] handle_irq_event_percpu+0x78/0x1b4
> [    1.861445] [<80076214>] handle_percpu_irq+0x88/0xb8
> [    1.866397] [<800711a0>] generic_handle_irq+0x40/0x58
> [    1.871445] [<80015528>] do_IRQ+0x1c/0x2c
> [    1.875447] [<80002430>] ret_from_irq+0x0/0x4
> [    1.879796] [<80015494>] r4k_wait_irqoff+0x18/0x20
> [    1.884606] [<80069464>] cpu_startup_entry+0x164/0x1c4
> [    1.889740] [<80542c28>] start_kernel+0x4a4/0x4c4
> [    1.894416] 
> [    1.895878] 
> Code: 00002821  10000026  8c46ab2c <bc950000> bc950020  bc950040  bc950060  bc950080  bc9500a0 
> [    1.905894] ---[ end trace e17f61a2b2391e19 ]---
> 
> If I add a #ifndef CONFIG_MIPS_MT_SMP around the smp_call_function_many call,
> everything works fine again. From what I can tell, we should not do any IPIs
> for cache flushing on that device, since there are only two threads on the
> same core.

While it seems clear (and correct) that fixing cpu_foreign_map to
exclude sibling CPUs would resolve this issue (I have a patch for that),
that still doesn't explain why the IPI call would be problematic in the
first place.

This is probably getting called via r4k_flush_data_cache_page() on the
other thread.

The virtual address is a fixmap address, and it would appear to be
fixmap index 0x9, which appears to be mapped by __kmap_pgprot() based on
a page colour of 0x7.

The case that fits that pattern is the cache alias handling in
__flush_anon_page(), which does kmap_coherent() and
flush_data_cache_page() on the resulting address. kmap_coherent disables
preemption, and directly injects a wired entry into the TLB. An IPI call
on that address is clearly unsafe as it doesn't inject an entry into the
page tables so other CPUs wouldn't pick it up.

So my prediction is that you have PREEMPT_COUNT=n in your kernel config,
which is causing in_atomic() in r4k_flush_data_cache_page() to always
return false. This in_atomic() has been on my radar for a few weeks now,
but I hadn't figured out what it was meant to solve. Given that
preemption being disabled isn't technically a problem when doing the IPI
calls (we disable preemption around it anyway, it doesn't sleep), I'm
wondering if this is the case the in_atomic() is meant for.

So, should the flush_anon_page() be flushing other CPU's caches too? I
suspect it should for systems where address based cache ops aren't
globalized by hardware. MIPS/ImgTec CM based SMP systems *do* globalize
hit based cache ops in hardware, so they don't need an IPI call (I have
a patch for that too).

Does other MIPS (non MIPS Technologies / ImgTec) multi-core SMP hardware
exist which uses c-r4k.c, doesn't globalize hit cache ops in hardware,
is also vulnerable to cache aliasing, and which people care about
running with software cache alias handling?

If so, it should probably be fixed to work with IPI calls (inject into
page tables temporarily or do the kmap_coherent on each CPU).

If not, the in_atomic() should be fixed to detect this case when
PREEMPT_COUNT=n, or rather the caller should call the local version
somehow to make its intentions clear.

Thanks
James
Attachment:
signature.asc

Description: Digital signature