Hi Geert,

On Thu, 09 Sep 2021 16:22:01 +0100,
Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> 
> Hi Marc, Russell,
> 
> On Wed, Jun 24, 2020 at 9:59 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > The GIC driver uses a RMW sequence to update the affinity, and
> > relies on the gic_lock_irqsave/gic_unlock_irqrestore sequences
> > to update it atomically.
> >
> > But these sequences only expand into anything meaningful if
> > the BL_SWITCHER option is selected, which almost never happens.
> >
> > It also turns out that using a RMW and locks is just as silly,
> > as the GIC distributor supports byte accesses for the GICD_TARGETRn
> > registers, which when used make the update atomic by definition.
> >
> > Drop the terminally broken code and replace it by a byte write.
> >
> > Fixes: 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> 
> Thanks for your patch, which is now commit 005c34ae4b44f085
> ("irqchip/gic: Atomically update affinity"), to which I bisected a hard
> lock-up during boot on the Renesas EMMA Mobile EV2-based KZM-A9-Dual
> board, which has a dual Cortex-A9 with PL390.
> 
> Despite the ARM Generic Interrupt Controller Architecture Specification
> (both version 1.0 and 2.0) stating that the Interrupt Processor Targets
> Registers are byte-accessible, the EMMA Mobile EV2 User's Manual
> states that the interrupt registers can be accessed via the APB bus,
> in 32-bit units.  Using byte accesses locks up the system.

Urgh. That is definitely a pretty poor integration. How about the
priority registers? I guess they suffer from the same issue...

> Unfortunately I only have remote access to the board showing the
> issue.  I did check that adding the writeb_relaxed() before the
> writel_relaxed() that was used before also causes a lock-up, so the
> issue is not an endian mismatch.
> Looking at the driver history, these registers have always been
> accessed using 32-bit accesses before.  As byte accesses lead
> indeed to simpler code, I'm wondering if they had been tried before,
> and caused issues before?

Not that I know. A lock was probably fine on a two CPU system. Less so
on a busy 8 CPU machine where interrupts are often migrated. The GIC
architecture makes a point of not requiring locking for most of the
registers that can be accessed concurrently.

> Since you said the locking was bogus before, due to the reliance on
> the BL_SWITCHER option, I'm not suggesting a plain revert, but I'm
> wondering what kind of locking you suggest to use instead?

There isn't much we can do aside from reintroducing the RMW+spinlock
approach, and for real this time. It would have to be handled as a
quirk though, as I'm not keen on reintroducing this for all systems.

I wrote the patchlet below, which is totally untested. Please give it
a go and let me know if it helps.

	M.

diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d329ec3d64d8..dca40a974b7a 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -107,6 +107,8 @@ static DEFINE_RAW_SPINLOCK(cpu_map_lock);
 
 #endif
 
+static DEFINE_STATIC_KEY_FALSE(needs_rmw_access);
+
 /*
  * The GIC mapping of CPU interfaces does not necessarily match
  * the logical CPU numbering.  Let's use a mapping as returned
@@ -774,6 +776,25 @@ static int gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
+static void rmw_writeb(u8 bval, void __iomem *addr)
+{
+	static DEFINE_RAW_SPINLOCK(rmw_lock);
+	unsigned long offset = (unsigned long)addr & 3UL;
+	unsigned long shift = offset * 8;
+	unsigned long flags;
+	u32 val;
+
+	raw_spin_lock_irqsave(&rmw_lock, flags);
+
+	addr -= offset;
+	val = readl_relaxed(addr);
+	val &= ~(0xffUL << shift);
+	val |= (u32)bval << shift;
+	writel_relaxed(val, addr);
+
+	raw_spin_unlock_irqrestore(&rmw_lock, flags);
+}
+
 static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
 			    bool force)
 {
@@ -788,7 +809,10 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
 	if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
 		return -EINVAL;
 
-	writeb_relaxed(gic_cpu_map[cpu], reg);
+	if (static_branch_unlikely(&needs_rmw_access))
+		rmw_writeb(gic_cpu_map[cpu], reg);
+	else
+		writeb_relaxed(gic_cpu_map[cpu], reg);
 	irq_data_update_effective_affinity(d, cpumask_of(cpu));
 
 	return IRQ_SET_MASK_OK_DONE;
@@ -1375,6 +1399,29 @@ static bool gic_check_eoimode(struct device_node *node, void __iomem **base)
 	return true;
 }
 
+static bool gic_enable_rmw_access(void *data)
+{
+	/*
+	 * The EMEV2 class of machines has a broken interconnect, and
+	 * locks up on accesses that are less than 32bit. So far, only
+	 * the affinity setting requires it.
+	 */
+	if (of_machine_is_compatible("renesas,emev2")) {
+		static_branch_enable(&needs_rmw_access);
+		return true;
+	}
+
+	return false;
+}
+
+static const struct gic_quirk gic_quirks[] = {
+	{
+		.desc		= "Implementation with broken byte access",
+		.compatible	= "arm,pl390",
+		.init		= gic_enable_rmw_access,
+	},
+};
+
 static int gic_of_setup(struct gic_chip_data *gic, struct device_node *node)
 {
 	if (!gic || !node)
@@ -1391,6 +1438,8 @@ static int gic_of_setup(struct gic_chip_data *gic, struct device_node *node)
 	if (of_property_read_u32(node, "cpu-offset", &gic->percpu_offset))
 		gic->percpu_offset = 0;
 
+	gic_enable_of_quirks(node, gic_quirks, gic);
+
 	return 0;
 
 error:

-- 
Without deviation from the norm, progress is not possible.
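
For anyone wanting to check the byte-lane arithmetic behind the RMW
fallback without access to the board, here is a minimal userspace
sketch. It is not kernel code: fake_regs and sim_rmw_writeb() are
hypothetical names made up for illustration, a plain array stands in
for the 32-bit-only register bank, and locking is left out entirely.
It assumes the byte index within a word is the low two bits of the
address and a little-endian byte layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a register bank that only tolerates 32-bit accesses. */
static uint32_t fake_regs[2];

/*
 * Update a single byte using only a word-sized read and a word-sized
 * write. byte_addr is a byte offset into fake_regs (illustrative only).
 */
static void sim_rmw_writeb(uint8_t bval, size_t byte_addr)
{
	size_t offset = byte_addr & 3;		/* byte index within the word: 0..3 */
	unsigned int shift = offset * 8;	/* bit position of that byte lane */
	uint32_t *word = &fake_regs[(byte_addr - offset) / 4];
	uint32_t val = *word;			/* 32-bit read */

	val &= ~((uint32_t)0xff << shift);	/* clear the target byte lane */
	val |= (uint32_t)bval << shift;		/* insert the new byte */
	*word = val;				/* 32-bit write */
}
```

Writing to byte offset 1 should only change bits 15:8 of the first
word, and writing to byte offset 7 should only change bits 31:24 of the
second word. In the driver the read and write would additionally have
to sit under a lock, since two CPUs updating different bytes of the
same word would otherwise race.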