Re: runtime check for omap-aes bus access permission (was: Re: 3.13-rc3 (commit 7ce93f3) breaks Nokia N900 DT boot)

Matthijs van Duin <matthijsvanduin@xxxxxxxxx> · Wed, 11 Feb 2015 16:22:51 +0100

On 11 February 2015 at 13:39, Pali Rohár <pali.rohar@xxxxxxxxx> wrote:
>> Anyhow, since checking the firewalls/APs to see if you have
>> permission will probably only get you yet another fault if
>> things are walled off, the robust way of dealing with this
>> sort of situation is by probing the device with a read while
>> trapping bus faults. This also handles modules that are
>> unreachable for other reasons, e.g. being disabled by eFuse.
>
> Is it possible to patch kernel code to mask or ignore that fault?
> Can you help me with something like that?

As I mentioned, I'm still learning my way around the kernel, so I
don't feel very comfortable suggesting a concrete patch just yet. I've
been browsing arch/arm/mm/, however, and my impression is that all that
would be required is to edit fault.c: make a copy of do_bad whose body
is
    return user_mode(regs) || !fixup_exception(regs);
and hook it onto the appropriate fault codes.  However, this really
needs the opinion of someone more familiar with this code.
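
To make the intent of that one-liner concrete, here is a small
user-space model of it. This is only a sketch: the model_* types and
functions are hypothetical stand-ins for the kernel's struct pt_regs,
user_mode() and fixup_exception(), and the single table entry is made
up. In the kernel the handler would presumably be registered with
hook_fault_code() for the relevant fault codes, which this model does
not attempt to show.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct pt_regs: only what this model needs. */
struct model_regs {
	int user;              /* nonzero if the abort was taken from user mode */
	unsigned long pc;      /* address of the faulting instruction */
};

/* Stand-in for an exception table entry: faulting insn -> fixup address. */
struct model_extable_entry {
	unsigned long insn;
	unsigned long fixup;
};

/* Made-up table: one probing read at 0x1000 with recovery code at 0x2000. */
static const struct model_extable_entry model_extable[] = {
	{ 0x1000, 0x2000 },
};

static int model_user_mode(const struct model_regs *regs)
{
	return regs->user;
}

/* Modelled on fixup_exception(): redirect pc and return 1 if a fixup exists. */
static int model_fixup_exception(struct model_regs *regs)
{
	size_t i;

	for (i = 0; i < sizeof(model_extable) / sizeof(model_extable[0]); i++) {
		if (model_extable[i].insn == regs->pc) {
			regs->pc = model_extable[i].fixup;
			return 1;
		}
	}
	return 0;
}

/* The proposed handler: 0 means handled, nonzero means report as usual. */
static int model_do_bad_fixup(unsigned long addr, unsigned int fsr,
			      struct model_regs *regs)
{
	(void)addr;
	(void)fsr;
	return model_user_mode(regs) || !model_fixup_exception(regs);
}
```

So a kernel-mode fault at an instruction with a fixup entry is
silently redirected to its recovery code, while user-mode faults and
kernel faults without a fixup are still reported as before.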

I do have an observation to make on the issue of fault decoding: the
list in fsr-2level.c may be "standard ARMv3 and ARMv4 aborts" but they
are quite wrong for ARMv7 which has:

[ 0] -
[ 1] alignment fault
[ 2] debug event
[ 3] section access flag fault
[ 4] instruction cache maintenance fault (reported via data abort)
[ 5] section translation fault
[ 6] page access flag fault
[ 7] page translation fault
[ 8] bus error on access
[ 9] section domain fault
[10] -
[11] page domain fault
[12] bus error on section table walk
[13] section permission fault
[14] bus error on page table walk
[15] page permission fault
[16] (TLB conflict abort)
[17] -
[18] -
[19] -
[20] (lockdown abort)
[21] -
[22] async bus error (reported via data abort)
[23] -
[24] async parity/ECC error (reported via data abort)
[25] parity/ECC error on access
[26] (coprocessor abort)
[27] -
[28] parity/ECC error on section table walk
[29] -
[30] parity/ECC error on page table walk
[31] -
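
For reference, the index in the list above is the 5-bit fault status
formed from FSR bit 10 and FSR bits [3:0] in the short-descriptor
format. The following is a minimal sketch of that decoding with the
list turned into a table. The v7_* names are hypothetical, and the
FSR_FS3_0/FSR_FS4 macro names follow my reading of the kernel headers,
so treat them as assumptions rather than a quote of the actual code.

```c
#include <stddef.h>

#define FSR_FS3_0  0x00fu        /* fault status bits [3:0] */
#define FSR_FS4    (1u << 10)    /* fault status bit 4 lives in FSR bit 10 */

/* Combine FSR[10] and FSR[3:0] into the 5-bit index used in the list. */
static unsigned int v7_fsr_status(unsigned int fsr)
{
	return (fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6);
}

/* Names copied from the list; unassigned indices are left NULL. */
static const char *const v7_fsr_names[32] = {
	[1]  = "alignment fault",
	[2]  = "debug event",
	[3]  = "section access flag fault",
	[4]  = "instruction cache maintenance fault (reported via data abort)",
	[5]  = "section translation fault",
	[6]  = "page access flag fault",
	[7]  = "page translation fault",
	[8]  = "bus error on access",
	[9]  = "section domain fault",
	[11] = "page domain fault",
	[12] = "bus error on section table walk",
	[13] = "section permission fault",
	[14] = "bus error on page table walk",
	[15] = "page permission fault",
	[16] = "TLB conflict abort",        /* parenthesised in the list */
	[20] = "lockdown abort",            /* parenthesised in the list */
	[22] = "async bus error (reported via data abort)",
	[24] = "async parity/ECC error (reported via data abort)",
	[25] = "parity/ECC error on access",
	[26] = "coprocessor abort",         /* parenthesised in the list */
	[28] = "parity/ECC error on section table walk",
	[30] = "parity/ECC error on page table walk",
};

/* Look up the message for a raw FSR value. */
static const char *v7_fsr_name(unsigned int fsr)
{
	const char *name = v7_fsr_names[v7_fsr_status(fsr)];

	return name ? name : "unknown";
}
```

For example, an FSR value of 0x008 decodes to index 8 (bus error on
access), and 0x406 decodes to index 22 (async bus error) because bit
10 contributes 16 to the index.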

Some entries are patched up near the bottom of fault.c but many bogus
messages remain. For example, the "on linefetch" vs "on non-linefetch"
distinction is misleading since no such thing can be inferred from the
fault status on v7.  Also, the i-cache maintenance fault handling looks
wrong to me: it should fetch the actual fault status from IFSR (even
though the address still comes from DFAR) and dispatch based on that.

Async external aborts (async bus error and async parity/ECC error)
give you basically no info. DFAR will contain garbage, so displaying
it will confuse rather than enlighten. A traceback is pointless since
the instruction that caused the access is long retired. Likewise,
user_mode() doesn't matter since a transition to kernel space may have
happened after the access that caused the abort. Basically they should
be treated more as an IRQ than as a fault (note they can also be
masked just like IRQs). In case of a bus error, it may be appropriate
to just warn about it, or perhaps send a signal to the current
process, although in the latter case the process should have some
means to distinguish it from a synchronous bus error.

At least on the Cortex-A8, a parity/ECC error (whether async or not)
is to be regarded as absolutely fatal.  Quoth the TRM: "No recovery is
possible. The abort handler must disable the caches, communicate the
fail directly with the external system, request a reboot."

Bit 10 no longer indicates an asynchronous (let alone imprecise)
fault.  Apart from the debug events and async aborts (and possibly
some implementation-defined aborts), all aborts listed are
synchronous, and DFAR/IFAR is valid. There's no technical obstacle
to making these trappable via the kernel exception handling mechanism.
(Though at least in case of parity/ECC errors one shouldn't.)
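
Put together, the policy described in the last few paragraphs could be
summarised roughly as follows. This is a sketch under assumptions, not
existing kernel code: the enum and function names are hypothetical,
the input is the 5-bit status index from the list, and the
parenthesised and unassigned entries are deliberately left undecided.

```c
enum v7_abort_class {
	V7_ABORT_SYNC,    /* synchronous, DFAR/IFAR valid, fixups possible */
	V7_ABORT_DEBUG,   /* debug event */
	V7_ABORT_ASYNC,   /* async bus error: treat more like an IRQ */
	V7_ABORT_PARITY,  /* parity/ECC error (sync or async): fatal */
	V7_ABORT_OTHER,   /* unassigned or parenthesised: undecided here */
};

/* Classify a 5-bit fault status index (0..31) from the list. */
static enum v7_abort_class v7_classify_abort(unsigned int status)
{
	switch (status) {
	case 2:
		return V7_ABORT_DEBUG;
	case 22:
		return V7_ABORT_ASYNC;
	case 24: case 25: case 28: case 30:
		return V7_ABORT_PARITY;
	case 1: case 3: case 4: case 5: case 6: case 7: case 8:
	case 9: case 11: case 12: case 13: case 14: case 15:
		return V7_ABORT_SYNC;
	default:
		return V7_ABORT_OTHER;
	}
}

/* Only plain synchronous aborts are candidates for exception fixups. */
static int v7_abort_may_use_fixup(unsigned int status)
{
	return v7_classify_abort(status) == V7_ABORT_SYNC;
}
```

The async parity/ECC error (24) is put in the parity class rather than
the async one, since being fatal takes priority over everything else.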