On Thu, Jun 15, 2017 at 06:27:34PM +0100, Maciej W. Rozycki wrote:

> > This patch changes this such that when KScratch registers aren't
> > implemented we use the coprocessor 0 ErrorEPC register as scratch
> > instead. The only downside to this is that we will need to ensure that
> > TLB exceptions don't occur whilst handling error exceptions, or at least
> > before the handlers for such exceptions have read the ErrorEPC register.
> > As the kernel always runs unmapped, or using a wired TLB entry for
> > certain SGI ip27 configurations, this constraint is currently always
> > satisfied. In the future should the kernel become mapped we will need to
> > cover exception handling code with a wired entry anyway such that TLB
> > exception handlers don't themselves trigger TLB exceptions, so the
> > constraint should be satisfied there too.
>
> All error exception handlers run from (C)KSEG1 and with (X)KUSEG forcibly
> unmapped, so a TLB exception could only ever happen with an access to the
> kernel stack or static data located in (C)KSEG2 or XKSEG. I think this
> can be easily avoided, and actually should, to avoid cascading errors.
>
> Isn't the reverse a problem though, i.e. getting an error exception while
> running a TLB exception handler and consequently getting the value stashed
> in CP0.ErrorEPC clobbered? Or do we assume all error exceptions are fatal
> and the kernel shall panic without ever getting back?

Think of cache error exceptions, for example. Not all systems are as bad as
Pass 1 BCM1250 parts, which were spewing like a few a day. Without going into
hardware implementation details - memory parity or ECC errors are signaled
as cache errors on many systems, thus clobbering c0_errorepc.

So while it's a nice hack, I think this patch should be reserved for
systems that don't support parity or ECC, or where generally a tiny bit of
performance is more important than reliability.

  Ralf