Re: [PATCH 2/2] spi: spi-geni-qcom: Really ensure the previous xfer is done before new one

Doug Anderson <dianders@xxxxxxxxxxxx> · Tue, 15 Dec 2020 09:25:51 -0800

Hi,

On Mon, Dec 14, 2020 at 6:57 PM Stephen Boyd <swboyd@xxxxxxxxxxxx> wrote:
>
> Quoting Douglas Anderson (2020-12-14 16:30:19)
> > diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> > index 6f736e94e9f4..5ef2e9f38ac9 100644
> > --- a/drivers/spi/spi-geni-qcom.c
> > +++ b/drivers/spi/spi-geni-qcom.c
> > @@ -145,12 +145,49 @@ static void handle_fifo_timeout(struct spi_master *spi,
> >                 dev_err(mas->dev, "Failed to cancel/abort m_cmd\n");
> >  }
> >
> > +static int spi_geni_check_busy(struct spi_geni_master *mas)
>
> Maybe spi_geni_is_busy() and return bool?

Yeah, that's cleaner, thanks.

> > +       spin_lock_irq(&mas->lock);
> > +       m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
> > +       m_irq_en = readl(se->base + SE_GENI_M_IRQ_EN);
> > +       spin_unlock_irq(&mas->lock);
> > +
> > +       if (m_irq & m_irq_en) {
>
> Is this really "busy" though? If we canceled something out then maybe
> the irq has fired but what if it's to tell us that we have some
> available space in the TX fifo? Does that really matter? It seems like
> if we have an RX irq when we're starting a transfer that might be bad
> too but we could forcibly clear that by acking it here and then setting
> the fifo word count that we're expecting for rx?
>
> Put another way, why isn't this driver looking at the TX and RX fifo
> status registers more than in one place?

I'm not sure I understand all your concerns.  Can you clarify?  In
case it helps, I'll add a few thoughts here:

1. SPI is a controller clocked protocol and this is the driver for the
controller.  There is no way to get a RX IRQ unless we initiate it.

2. The code always takes care to make sure that when we're done with a
transfer that we disable the TX watermark.  This means we won't get
any more interrupts.

The only time an interrupt could still be pending when we start a new
transfer is:

a) If the interrupt handler is still running on another CPU.  In that
case it will have the spinlock and won't release it until it clears
the interrupts.

b) If we had a timeout on the previous transfer and then got timeouts
sending the cancel and abort.

In general when we're starting a new transfer we assume that we can
program the hardware willy-nilly.  If there's some chance something
else is happening (or our interrupt could go off) then it breaks that
whole model.

I've addressed all your other concerns and I'm ready to send out v2,
but I'll hold off until you confirm that the above explanation
satisfies your questions.  If you can think of any extra comments
somewhere that would help document that better I'm happy to add them
into the commit or commit message.

-Doug