Re: [PATCH v2 07/17] irqchip/gic: Atomically update affinity

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Marc,

On Fri, Sep 10, 2021 at 12:23 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> On Thu, 09 Sep 2021 16:22:01 +0100,
> Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> > On Wed, Jun 24, 2020 at 9:59 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
> > > The GIC driver uses a RMW sequence to update the affinity, and
> > > relies on the gic_lock_irqsave/gic_unlock_irqrestore sequences
> > > to update it atomically.
> > >
> > > But these sequences only expend into anything meaningful if
> > > the BL_SWITCHER option is selected, which almost never happens.
> > >
> > > It also turns out that using a RMW and locks is just as silly,
> > > as the GIC distributor supports byte accesses for the GICD_TARGETRn
> > > registers, which when used make the update atomic by definition.
> > >
> > > Drop the terminally broken code and replace it by a byte write.
> > >
> > > Fixes: 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature")
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> >
> > Thanks for your patch, which is now commit 005c34ae4b44f085
> > ("irqchip/gic: Atomically update affinity"), to which I bisected a hard
> > lock-up during boot on the Renesas EMMA Mobile EV2-based KZM-A9-Dual
> > board, which has a dual Cortex-A9 with PL390.
> >
> > Despite the ARM Generic Interrupt Controller Architecture Specification
> > (both version 1.0 and 2.0) stating that the Interrupt Processor Targets
> > Registers are byte-accessible, the EMMA Mobile EV2 User's Manual
> > states that the interrupt registers can be accessed via the APB bus,
> > in 32-bit units.  Using byte accesses locks up the system.
>
> Urgh. That is definitely a pretty poor integration. How about the
> priority registers? I guess they suffer from the same issue...

Yes, they do.

> > Unfortunately I only have remote access to the board showing the
> > issue.  I did check that adding the writeb_relaxed() before the
> > writel_relaxed() that was used before also causes a lock-up, so the
> > issue is not an endian mismatch.
> > Looking at the driver history, these registers have always been
> > accessed using 32-bit accesses before.  As byte accesses lead
> > indeed to simpler code, I'm wondering if they had been tried before,
> > and caused issues before?
>
> Not that I know. A lock was probably fine on a two CPU system. Less so
> on a busy 8 CPU machine where interrupts are often migrated. The GIC
> architecture makes a point in not requiring locking for most of the
> registers that can be accessed concurrently.
>
> > Since you said the locking was bogus before, due to the reliance on
> > the BL_SWITCHER option, I'm not suggesting a plain revert, but I'm
> > wondering what kind of locking you suggest to use instead?
>
> There isn't much we can do aside from reintroducing the RMW+spinlock
> approach, and for real this time. It would have to be handled as a
> quirk though, as I'm not keen on reintroducing this for all systems.
>
> I wrote the patchlet below, which is totally untested. Please give it
> a go and let me know if it helps.

Thanks for your quick response!
Your solution works, after making a few small modifications.

> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -774,6 +776,25 @@ static int gic_pm_init(struct gic_chip_data *gic)
>  #endif
>
>  #ifdef CONFIG_SMP
> +static void rmw_writeb(u8 bval, void __iomem *addr)
> +{
> +       static DEFINE_RAW_SPINLOCK(rmw_lock);
> +       unsigned long offset = (unsigned long)addr & ~3UL;

Please drop the tilde.

> +       unsigned long shift = offset * 8;

"unsigned int" is sufficient for offset and size.

> +       unsigned long flags;
> +       u32 val;
> +
> +       raw_spin_lock_irqsave(&rmw_lock, flags);
> +
> +       addr -= offset;
> +       val = readl_relaxed(addr);
> +       val &= ~(0xffUL << shift);

No need for the UL suffix.

> +       val |= (u32)bval << shift;

No need for the cast.

> +       writel_relaxed(val, addr);
> +
> +       raw_spin_unlock_irqrestore(&rmw_lock, flags);
> +}
> +
>  static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
>                             bool force)
>  {

> @@ -1375,6 +1399,29 @@ static bool gic_check_eoimode(struct device_node *node, void __iomem **base)
>         return true;
>  }
>
> +static bool gic_enable_rmw_access(void *data)
> +{
> +       /*
> +        * The EMEV2 class of machines has a broken interconnect, and
> +        * locks up on accesses that are less than 32bit. So far, only
> +        * the affinity setting requires it.
> +        */
> +       if (of_machine_is_compatible("renesas,emev2")) {
> +               static_branch_enable(&needs_rmw_access);
> +               return true;
> +       }
> +
> +       return false;
> +}
> +
> +static const struct gic_quirk gic_quirks[] = {
> +       {
> +               .desc           = "Implementation with broken byte access",

The output would look better without capitalizing the first word.
I think you can drop the first two words, saving some space:

    GIC: enabling workaround for broken byte access

> +               .compatible     = "arm,pl390",
> +               .init           = gic_enable_rmw_access,
> +       },

Missing "{ /* sentinel */ }".

> +};
> +
>  static int gic_of_setup(struct gic_chip_data *gic, struct device_node *node)
>  {
>         if (!gic || !node)

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux