On Wed, 01 Aug 2018 11:55:06 PDT (-0700), tglx@xxxxxxxxxxxxx wrote:
On Wed, 25 Jul 2018, Christoph Hellwig wrote:
On Wed, Jul 25, 2018 at 12:18:39PM +0100, Marc Zyngier wrote:
> This feels odd. It means that you cannot have the following sequence:
>
> local_irq_disable();
> enable_irq(x); // where x is owned by a remote hart
>
> as smp_call_function_single() requires interrupts to be enabled.
>
> More fundamentally, why are you trying to make these interrupts look
> global while they aren't? arm/arm64 have similar restrictions with GICv2
> and earlier, and treats these interrupts as per-cpu.
>
> Given that the drivers that deal with drivers connected to the per-hart
> irqchip are themselves likely to be aware of the per-cpu aspect, it
> would make sense to align things (we've been through that same
> discussion about the clocksource driver a few weeks back).
Right now the only direct consumers are said clocksource, the PLIC
driver later in this series and the RISC-V arch IPI code. None of them
is going to do a manual enable_irq, so I guess the remote case of the
code is simply dead code. I'll take a look at converting them to
per-cpu. I guess the GICv2 driver is the best template?
Confused. The timer and the IPI are separate causes and have nothing to do
with the per cpu irq domain. That's what the low level interrupt handling
code tells me.
If I understand correctly then the per cpu irq domain is for device
interrupts, right? If so, then this simply cannot work and there is no way
to make it work with per cpu interrupts either.
Is there some high level documentation about the design (or the lack of) or
can someone give a concise explanation how this stuff is supposed to work?
The ISA spec defines this, and I'm not sure I can be a whole lot more concise
than what's there but I'll try:
This driver is for the ISA-mandated interrupts present on all RISC-V systems.
These interrupts are tracked on a per-hart (hart is a RISC-V term that means
hardware thread, it a hyperthread in Intel land) basis.
On RISC-V systems there is a set of CSRs (control and status registers) local
to each hart. These registers can be accessed by special instructions, and are
used to perform hart-local actions that are of a medium speed: things like
accessing the floating point exception state or changing the page table base
pointer. When a hart is in supervisor mode, it has access to a handful of CSRs
that are related to interrupts:
* stvec: The trap vector, which is the entry point for both interrupts and
exceptions.
* sie: Supervisor Interrupts Enabled. This is a bit mask of enabled
interrupts.
* sip: Supervisor Interrupts Pending. This is a bit mask of pending
interrupts, which can be polled when interrupts are disabled (or I guess when
they're enabled too, but that's not particularly useful).
* scause: Supervisor Trap Cause. This allows code to determine what sort of
trap was taken.
* sepc: Supervisor Exception PC (actually the PC for all traps). This allows
code to determine how to get back to where it needs to continue after
handling a trap.
* stval: Supervisor Trap VALue. This provides additional trap-specific
information: for example, if a trap says "you attempted to write to an
address that isn't mapped or is read only" then stval contains the address
that the write was directed at.
As such, the ISA itself doesn't really differentiate between exceptions and
interrupts: they both cause a privilege transition to happen, redirect
execution to stvec, and write a handful of CSRs as side effects. It's up to
software to determine the difference.
In order to make it fast for software to determine if a scause value is an
exception or an interrupt that is encoded in the high bit. There are 11
defined exceptions, which include system calls, breakpoints, page faults, and
access faults -- that's all handled in core RISC-V arch code. Additionally,
there are 6 defined interrupt types:
* User software interrupt
* User timer interrupt
* User external interrupt
* Supervisor software interrupt
* Supervisor timer interrupt
* Supervisor external interrupt
The user interrupts are unsupported by Linux and thus never happen -- they're
really designed for real-time embedded code, in POSIX land we use the standard
signal delivery mechanisms as it'd be insane to put that in an ISA. Thus,
functionally there are three interrupt types:
* Supervisor software interrupt: These are never triggered by a hardware
device, and are therefor left for software use. On Linux we use these to
indicate that a message may have arrived in the per-hart message queue, which
is used to implement IPIs.
* Supervisor timer interrupt: When the RISC-V ISA was originally defined we
thought that it was important to mandate that accurate real-time facilities
exist, so there's a handful of time-related bits encoded into the ISA. In
retrospect it seems like this didn't end up being so clean as it's becoming a
burden when doing things like formally specifying the ISA, but we have
workarounds for these issues so we don't deem it worth re-spinning the ISA.
On Linux this is attached to a clocksource driver, with the only extant
example being SiFive's clock driver.
* Supervisor external interrupt: Essentially all the interrupts, aside from the
timer being a special case. This is what's attached to the proper interrupt
controller, with the only extant example being SiFive's PLIC (which was
mistakenly called a RISC-V plic in the device tree, but there's another
thread going about fixing that).
Originally this hart-local interrupt code was included in the RISC-V arch port,
right along our exception handling code. This matches up well because RISC-V
systems don't really differentiate between interrupts and exceptions, so it
fits in well. The trouble is that it smells like an interrupt controller and
doesn't follow the standard mechanisms in Linux.
As part of our original push to upstream the arch code it was suggested that we
move this out to an irqchip driver, but after actually attempting to do that it
appears that the mechanics of doing so have overshadowed the complexity of the
actual interrupt handling code, which is only a dozen or so instructions. In
retrospect this is just another instance of me not knowing what I'm doing,
sorry!
I like Christoph's approach of merging the ISA-mandated interrupt handling code
back into arch/riscv, as it's much saner that way. The one big headache is
that because we special-cased timer interrupts in the ISA they now disappear
from the standard Linux mechanisms for handling these.
That said, I'd rather have this warts and all then wait around for something
perfect, as maintaining a dozen or so out of tree drivers that are tightly
coupled to the core arch code has proven to be a mess. If we can get the code
upstream then everyone will be on the same page so we can work on actually
improving this, as opposed to just spinning our wheels keeping this big mess
alive.
Hopefully that makes some amount of sense...
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html