On Thu, 13 Dec 2018 at 14:35, Stefan Roese <sr@xxxxxxx> wrote:
>
> On 13.12.18 14:27, Daniel Schwierzeck wrote:
> > On Thu, 13 Dec 2018 at 11:09, Stefan Roese <sr@xxxxxxx> wrote:
> >>
> >> Hi Daniel,
> >>
> >> On 13.12.18 02:00, Daniel Schwierzeck wrote:
> >>> On 12.12.18 at 09:18, Stefan Roese wrote:
> >>>> Hi!
> >>>>
> >>>> I've been hunting for a problem for quite some time, where Linux
> >>>> hangs / crashes in userspace at some point on my MT7688 based
> >>>> systems. I found that this problem can be avoided (worked around)
> >>>> by not giving Linux the full memory (by using DT memory node fixup
> >>>> or mem= kernel cmdline). When reducing this memory by the memory
> >>>> used by U-Boot (stack pointer minus some KiB value as this is the
> >>>> "lowest" memory used by U-Boot), then Linux runs just fine.
> >>>>
> >>>> My first idea here was that this issue is cache related (most
> >>>> likely I-cache). But all tests and debugging in this area did not
> >>>> fix this issue (even running with caches disabled).
> >>>>
> >>>> Finally I found that this line in U-Boot makes Linux break:
> >>>>
> >>>> arch/mips/lib/traps.c:
> >>>>
> >>>> void trap_init(ulong reloc_addr)
> >>>> {
> >>>>         unsigned long ebase = gd->irq_sp;
> >>>>         ...
> >>>>         write_c0_ebase(ebase);
> >>>> }
> >>>>
> >>>> This sets EBase to something like 0x87e9b000 on my system (128MiB).
> >>>> And Linux then re-uses this value and copies the exception handlers
> >>>> to this address, overwriting random code and leading to an unstable
> >>>> system.
> >>>>
> >>>> So my question now is, how should this be handled on the MT7688
> >>>> platform instead? One way would be to set EBase back to the
> >>>> original value (0x80000000) before booting into Linux. Another
> >>>> solution would be to add some Linux code like board_ebase_setup()
> >>>> to the MT7688 Linux port.
> >>>>
> >>>> Since I'm still no real MIPS expert yet, I would really like to get
> >>>> some advice here on how to best solve this issue.
> >>>> Maybe I missed something. Comments?
> >>>>
> >>>> Thanks,
> >>>> Stefan
> >>>
> >>> the relevant code is in arch/mips/kernel/traps.c:trap_init():
> >>>
> >>> Within the branch if (cpu_has_veic || cpu_has_vint) the kernel will
> >>> allocate memory for the exception vectors and reset ebase to that memory.
> >>
> >> This branch currently is not taken on this SoC (Mediatek / Ralink).
> >>
> >>> In the else branch ebase is statically assigned to CAC_BASE which should
> >>> resolve to 0x80000000 on Ralink platform. The ebase is only read from
> >>> CP0 for MIPS r6 CPUs.
> >>
> >> Without CPU_MIPSR2_IRQ_VI being set (as is currently the case), this
> >> is how this function is run:
> >>
> >> if (cpu_has_veic || cpu_has_vint) {
> >>         ...
> >> } else {
> >>         /* this is true for Ralink / Mediatek */
> >>         ...
> >>         if (cpu_has_mips_r2_r6) {
> >>                 if (cpu_has_ebase_wg) {
> >>                         ...
> >>                 } else {
> >>                         /* this is true for Ralink / Mediatek */
> >>                         ...
> >>
> >> So in summary, ebase is not allocated but assigned to this value:
> >>
> >> ebase = CAC_BASE + (read_c0_ebase() & 0x3ffff000);
> >>
> >> which of course leads to the issues we observed.
> >>
> >>> So the ebase set by U-Boot shouldn't be relevant for Ralink platform.
> >>
> >> Why so?
> >>
> >>> More likely some code at 0x80000000 is overwritten when installing the
> >>> exception handlers because all Ralink SoCs except MT7621 have
> >>> 0xffffffff80000000 defined as load address. So adding something like
> >>> 0x1000 should fix your problem too.
> >>
> >> Hmmm, not sure that I fully understand this. Could you please explain
> >> again?
> >
> > Oh sorry, I misread cpu_has_mips_r2_r6 as matching only MIPS r6 CPUs, but
> > obviously it applies to MIPS r2 too.
> >
> >>
> >>> AFAIK the CPU probing should detect and set cpu_has_veic accordingly.
> >>
> >> Yes, I agree.
> >>
> >>> Maybe it's a bug by Ralink to not set this bit. I guess that's why a
> >>> platform could provide a cpu-feature-overrides.h.
> >>> Or you could configure CPU_MIPSR2_IRQ_VI as Horatio stated in his
> >>> response.
> >>
> >> I just checked in decode_config3() and MIPS_CPU_VEIC is not set on
> >> this SoC (config3=00002420 MIPS_CONF3_VEIC=00000040).
> >
> > If vectored interrupt handlers are working on Ralink platform, then maybe this
> > should be enabled via cpu-feature-overrides.h like the Lantiq platform is doing.
> > AFAIU this should increase interrupt performance.
>
> Sure. If that's the preferred way to do it (compared to setting
> CONFIG_CPU_MIPSR2_IRQ_VI), then I'll gladly submit a patch for it.
>
> >>
> >>> @Paul regarding MIPS r6, is there some expectation of the bootloader to
> >>> set ebase to a reasonable value or to not change the value at all? Maybe
> >>> we need to fix U-Boot?
> >>
> >> Yes, some advice on how to fix this would be very welcome. I can easily
> >> add CPU_MIPSR2_IRQ_VI and send a patch for this as well.
> >>
> >
> > I could also prepare a U-Boot patch to restore the original ebase value before
> > handing the control over to the OS.
>
> I'm not so sure whether overwriting 0x80000000 (default value of EBase on
> this SoC) with the exception handler is allowed. Is this address "zero"
> treated specially in MIPS Linux? AFAICT, the complete DDR
> area on my platform (0x8000.0000 - 0x87ff.ffff) is available for Linux.
> So allocating some memory for this exception handler seems the right
> way to go to me.
>

Maybe that's why some platforms define a load address of 0x80002000 or
similar to protect this area somehow.

--
- Daniel
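
For reference, the EBase arithmetic discussed in this thread can be sketched in a few lines of C. This is only an illustration, assuming a 32-bit kernel where CAC_BASE resolves to 0x80000000 and the non-WG r2 path keeps just the bits selected by 0x3ffff000. The names linux_ebase_from_cp0() and EBASE_FIELD_MASK are hypothetical and do not exist in the kernel or in U-Boot.

```c
#include <stdint.h>

/* Assumption: KSEG0 base on a 32-bit MIPS kernel (Ralink / Mediatek). */
#define CAC_BASE          0x80000000u
/* Mask quoted in the thread for the non-WG MIPS r2 path. */
#define EBASE_FIELD_MASK  0x3ffff000u

/*
 * Hypothetical helper: given the raw CP0 EBase value left behind by the
 * bootloader, return the address at which Linux would place its
 * exception vectors on the "else" path described in the thread.
 */
static uint32_t linux_ebase_from_cp0(uint32_t c0_ebase)
{
	/* Parentheses matter: '+' binds tighter than '&' in C. */
	return CAC_BASE + (c0_ebase & EBASE_FIELD_MASK);
}
```

With the value U-Boot wrote on the 128MiB board (0x87e9b000), the helper returns 0x87e9b000 again, i.e. an address in the middle of the DDR range that Linux also uses as ordinary memory. With the reset value 0x80000000 it returns 0x80000000.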