Re: [PATCH 5/6] MIPS: tlbex: Use ErrorEPC as scratch when KScratch isn't available

Ralf Baechle <ralf@xxxxxxxxxxxxxx> · Wed, 28 Jun 2017 17:25:35 +0200

On Thu, Jun 15, 2017 at 06:27:34PM +0100, Maciej W. Rozycki wrote:

> > This patch changes this such that when KScratch registers aren't
> > implemented we use the coprocessor 0 ErrorEPC register as scratch
> > instead. The only downside to this is that we will need to ensure that
> > TLB exceptions don't occur whilst handling error exceptions, or at least
> > before the handlers for such exceptions have read the ErrorEPC register.
> > As the kernel always runs unmapped, or using a wired TLB entry for
> > certain SGI ip27 configurations, this constraint is currently always
> > satisfied. In the future should the kernel become mapped we will need to
> > cover exception handling code with a wired entry anyway such that TLB
> > exception handlers don't themselves trigger TLB exceptions, so the
> > constraint should be satisfied there too.
> 
>  All error exception handlers run from (C)KSEG1 and with (X)KUSEG forcibly 
> unmapped, so a TLB exception could only ever happen with an access to the 
> kernel stack or static data located in (C)KSEG2 or XKSEG.  I think this 
> can be easily avoided, and actually should, to avoid cascading errors.
> 
>  Isn't the reverse a problem though, i.e. getting an error exception while 
> running a TLB exception handler and consequently getting the value stashed 
> in CP0.ErrorEPC clobbered?  Or do we assume all error exceptions are fatal 
> and the kernel shall panic without ever getting back?

Think of cache error exceptions, for example.  Not all systems are as
bad as Pass 1 BCM1250 parts, which were spewing something like a few a day.
Without going into hardware implementation details - memory parity or ECC
errors are on many systems signaled as cache errors, thus clobbering
c0_errorepc.
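
To make the window concrete, here is a toy userspace model of the hazard, not
kernel code.  Every name in it (struct cpu, saved_reg, take_cache_error,
tlb_refill) is hypothetical and made up for illustration; it only assumes that
the TLB handler parks a register value in ErrorEPC and that taking an error
exception makes the hardware overwrite ErrorEPC with the faulting PC.

```c
#include <stdint.h>

/* Toy model only: all names are hypothetical, nothing here is kernel API. */
struct cpu {
	uint64_t saved_reg;	/* register the TLB handler must preserve */
	uint64_t errorepc;	/* stands in for CP0 ErrorEPC */
};

/* Taking an error exception makes hardware write the faulting PC to ErrorEPC. */
static void take_cache_error(struct cpu *c, uint64_t faulting_pc)
{
	c->errorepc = faulting_pc;
}

/*
 * TLB refill handler that uses ErrorEPC as scratch.  Returns the value
 * the preserved register holds once the handler has restored it.
 */
static uint64_t tlb_refill(struct cpu *c, int cache_error_in_window)
{
	c->errorepc = c->saved_reg;	/* stash the register in ErrorEPC */
	c->saved_reg = 0;		/* handler reuses the register as a temp */

	if (cache_error_in_window)	/* e.g. a parity/ECC error hits here */
		take_cache_error(c, 0xffffffff80001234ULL);

	c->saved_reg = c->errorepc;	/* restore: wrong if ErrorEPC was clobbered */
	return c->saved_reg;
}
```

Calling tlb_refill() with cache_error_in_window set to 0 gives back the
original register value; with it set to 1 the "restored" value is the faulting
PC of the cache error, which is the clobbering described above.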

So while it's a nice hack, I think this patch should be reserved for
systems that don't support parity or ECC, or where generally a tiny bit
of performance is more important than reliability.

  Ralf