Re: MIPS (mt7688): EBase change in U-Boot breaks Linux

On Thu, 13 Dec 2018 at 14:35, Stefan Roese <sr@xxxxxxx> wrote:
>
> On 13.12.18 14:27, Daniel Schwierzeck wrote:
> > On Thu, 13 Dec 2018 at 11:09, Stefan Roese <sr@xxxxxxx> wrote:
> >>
> >> Hi Daniel,
> >>
> >> On 13.12.18 02:00, Daniel Schwierzeck wrote:
> >>> On 12.12.18 at 09:18, Stefan Roese wrote:
> >>>> Hi!
> >>>>
> >>>> I've been hunting for a problem for quite some time, where Linux
> >>>> hangs / crashes in userspace at some point on my MT7688 based
> >>>> systems. I found that this problem can be avoided (worked around)
> >>>> by not giving Linux the full memory (by using DT memory node fixup
> >>>> or mem= kernel cmdline). When reducing this memory by the memory
> >>>> used by U-Boot (stack pointer minus some KiB value as this is the
> >>>> "lowest" memory used by U-Boot), then Linux runs just fine.
> >>>>
> >>>> My first idea was that this issue is cache-related (most
> >>>> likely I-cache). But none of my tests and debugging in this area
> >>>> fixed the issue (even when running with caches disabled).
> >>>>
> >>>> Finally I found that this line in U-Boot makes Linux break:
> >>>>
> >>>> arch/mips/lib/traps.c:
> >>>>
> >>>> void trap_init(ulong reloc_addr)
> >>>> {
> >>>>       unsigned long ebase = gd->irq_sp;
> >>>>       ...
> >>>>       write_c0_ebase(ebase);
> >>>> }
> >>>>
> >>>> This sets EBase to something like 0x87e9b000 on my system (128MiB).
> >>>> And Linux then re-uses this value and copies the exception handlers
> >>>> to this address, overwriting random code and leading to an unstable
> >>>> system.
> >>>>
> >>>> So my question now is: how should this be handled on the MT7688
> >>>> platform instead? One way would be to set EBase back to the
> >>>> original value (0x80000000) before booting into Linux. Another
> >>>> solution would be to add some Linux code like board_ebase_setup()
> >>>> to the MT7688 Linux port.
> >>>>
> >>>> Since I'm not a real MIPS expert yet, I would really like to get
> >>>> some advice here on how to best solve this issue. Maybe I missed
> >>>> something. Comments?
> >>>>
> >>>> Thanks,
> >>>> Stefan
> >>>
> >>> the relevant code is in arch/mips/kernel/traps.c:trap_init():
> >>>
> >>> Within the branch if (cpu_has_veic || cpu_has_vint) the kernel will
> >>> allocate memory for the exception vectors and reset ebase to that memory.
> >>
> >> This branch currently is not taken on this SoC (Mediatek / Ralink).
> >>
> >>> In the else branch ebase is statically assigned to CAC_BASE, which should
> >>> resolve to 0x80000000 on the Ralink platform. The ebase is only read from
> >>> CP0 for MIPS r6 CPUs.
> >>
> >> Without CPU_MIPSR2_IRQ_VI being set (as is currently the case), this
> >> is how this function runs:
> >>
> >>          if (cpu_has_veic || cpu_has_vint) {
> >>                  ...
> >>          } else {
> >>                  *** this is true for Ralink / Mediatek
> >>                  ...
> >>                  if (cpu_has_mips_r2_r6) {
> >>                          if (cpu_has_ebase_wg) {
> >>                                  ...
> >>                          } else {
> >>                                  *** this is true for Ralink / Mediatek
> >>                                  ...
> >>
> >> So in summary, ebase is not allocated but assigned to this value:
> >>
> >>          ebase = CAC_BASE + (read_c0_ebase() & 0x3ffff000);
> >>
> >> which of course leads to the issues we observed.
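
A small host-side sketch of that computation, assuming CAC_BASE is
0x80000000 (32-bit KSEG0) and using the EBase value 0x87e9b000 reported
earlier in the thread. The macro and function names are made up for
illustration, and read_c0_ebase() is replaced by a plain argument so the
arithmetic can be checked on any machine. Note that in C `+` binds tighter
than `&`, so the mask needs its own parentheses to express the intended
computation:

```c
#include <assert.h>   /* for callers that want to assert on the results */
#include <stdint.h>

#define CAC_BASE_ASSUMED  0x80000000u  /* assumption: 32-bit KSEG0 base */
#define EBASE_ADDR_MASK   0x3ffff000u  /* mask from the expression above */

/* Intended computation: mask the register value first, then add the base. */
uint32_t ebase_masked_then_added(uint32_t c0_ebase)
{
	return CAC_BASE_ASSUMED + (c0_ebase & EBASE_ADDR_MASK);
}

/* What the expression evaluates to without the inner parentheses. */
uint32_t ebase_added_then_masked(uint32_t c0_ebase)
{
	return (CAC_BASE_ASSUMED + c0_ebase) & EBASE_ADDR_MASK;
}
```

With c0_ebase = 0x87e9b000 the first variant returns 0x87e9b000, so the
address programmed by U-Boot ends up as the Linux vector base; with
0x80000000 it returns 0x80000000. The second variant would return
0x07e9b000 for the U-Boot value, which is not a KSEG0 address at all.
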
> >>
> >>> So the ebase set by U-Boot shouldn't be relevant for the Ralink platform.
> >>
> >> Why so?
> >>
> >>> More likely some code at 0x80000000 is overwritten when installing the
> >>> exception handlers because all Ralink SoCs except MT7621 have
> >>> 0xffffffff80000000 defined as load address. So adding something like
> >>> 0x1000 should fix your problem too.
> >>
> >> Hmmm, not sure that I fully understand this. Could you please explain
> >> again?
> >
> > oh sorry, I misread cpu_has_mips_r2_r6 as only catching MIPS r6 CPUs, but
> > obviously it applies to MIPS r2 too.
> >
> >>
> >>> AFAIK the CPU probing should detect and set cpu_has_veic accordingly.
> >>
> >> Yes, I agree.
> >>
> >>> Maybe it's a bug by Ralink to not set this bit. I guess that's why a
> >>> platform could provide a cpu-feature-overrides.h. Or you could configure
> >>> CPU_MIPSR2_IRQ_VI as Horatio stated in his response.
> >>
> >> I just checked in decode_config3() and MIPS_CPU_VEIC is not set on
> >> this SoC (config3=00002420 MIPS_CONF3_VEIC=00000040).
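
For reference, the bit test behind that observation can be reproduced on
the host. This is only a sketch: the VEIC mask is the one quoted above,
while the VINT mask (bit 5) is an assumption about the neighbouring Config3
bit, and the macro and helper names are invented for this example.

```c
#include <assert.h>   /* for callers that want to assert on the results */
#include <stdint.h>

#define CONF3_VEIC  0x00000040u  /* MIPS_CONF3_VEIC as quoted above */
#define CONF3_VINT  0x00000020u  /* assumption: VInt is bit 5 of Config3 */

/* Returns 1 if the external interrupt controller mode bit is set. */
int config3_has_veic(uint32_t config3)
{
	return (config3 & CONF3_VEIC) != 0;
}

/* Returns 1 if the vectored interrupt bit is set. */
int config3_has_vint(uint32_t config3)
{
	return (config3 & CONF3_VINT) != 0;
}
```

For config3 = 0x00002420, config3_has_veic() returns 0, matching the
observation above. Under the bit-5 assumption, config3_has_vint() returns 1
for the same value.
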
> >
> > If vectored interrupt handlers are working on the Ralink platform, then maybe
> > this should be enabled via cpu-feature-overrides.h like the Lantiq platform is
> > doing. AFAIU this should increase interrupt performance.
>
> Sure. If that's the preferred way to do it (compared to setting
> CONFIG_CPU_MIPSR2_IRQ_VI), then I'll gladly submit a patch for it.
>
> >>
> >>> @Paul regarding MIPS r6, is there some expectation of the bootloader to
> >>> set ebase to a reasonable value or to not change the value at all? Maybe
> >>> we need to fix U-Boot?
> >>
> >> Yes, some advice on how to fix this would be very welcome. I can easily
> >> add CPU_MIPSR2_IRQ_VI and send a patch for this as well.
> >>
> >
> > I could also prepare a U-Boot patch to restore the original ebase value before
> > handing the control over to the OS.
>
> I'm not so sure whether overwriting 0x80000000 (the default value of EBase
> on this SoC) with the exception handler is allowed. Is this address "zero"
> handled specially in MIPS Linux? AFAICT, the complete DDR
> area on my platform (0x8000.0000 - 0x87ff.ffff) is available to Linux.
> So allocating some memory for this exception handler seems to me the right
> way to go.
>

maybe that's why some platforms define a load address of 0x80002000 or similar
to protect this area somehow.
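
To illustrate the idea: if the handlers are copied to EBase and the kernel
image is loaded at the same address, the two ranges collide, whereas a
slightly higher load address leaves room. A rough sketch of such a check
follows; the function name is made up, and the sizes used in the note below
are placeholders, since the real vector area size depends on the kernel.

```c
#include <assert.h>   /* for callers that want to assert on the results */
#include <stdint.h>

/*
 * Returns 1 if the half-open ranges [a_start, a_start + a_size) and
 * [b_start, b_start + b_size) share at least one byte, 0 otherwise.
 * The end addresses are computed in 64 bits so that a range ending at
 * the top of the 32-bit address space does not wrap around.
 */
int ranges_overlap(uint32_t a_start, uint32_t a_size,
		   uint32_t b_start, uint32_t b_size)
{
	uint64_t a_end = (uint64_t)a_start + a_size;
	uint64_t b_end = (uint64_t)b_start + b_size;

	return a_start < b_end && b_start < a_end;
}
```

With a hypothetical 0x1000-byte vector area at 0x80000000 and a 4 MiB image,
ranges_overlap(0x80000000, 0x400000, 0x80000000, 0x1000) returns 1, while
ranges_overlap(0x80002000, 0x400000, 0x80000000, 0x1000) returns 0.
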

-- 
- Daniel

