Re: [PATCH] spi: fsi: Implement a timeout for polling status

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 3/17/22 23:19, Joel Stanley wrote:
On Thu, 17 Mar 2022 at 21:14, Eddie James <eajames@xxxxxxxxxxxxx> wrote:
The data transfer routines must poll the status register to
determine when more data can be shifted in or out. If the hardware
gets into a bad state, these polling loops may never exit. Prevent
this by returning an error if a timeout is exceeded.
This makes sense. We may even want to put this code in regardless.

However, I'm wondering why the code in fsi_spi_status didn't catch this.


Same, which is why I thought the problem couldn't be happening here for a long time. See below with what I think is going on.



static int fsi_spi_status(struct fsi_spi *ctx, u64 *status, const char *dir)
{
        int rc = fsi_spi_read_reg(ctx, SPI_FSI_STATUS, status);
You mentioned the error condition is we get back 0xff. That means that
status will be 0xffff_ffff_ffff_ffff ?

Did you observe status being this value?


No, I think my observation of 0xff is not universal. I suspect that while the CFAM is IN reset, 0xff is returned, but once it's been reset, valid (though maybe uninitialized) data is returned. I observed a status of 0x0001100000000000, which means the controller is idle, which makes sense since it's been reset. So the issue occurs if we start an operation before a CFAM reset and are waiting for it to complete during the CFAM reset, but we also don't get any failed/invalid data FSI operations during/after the reset (very timing dependent - the FSI master does lock during the reset but doesn't wait afterwards for the hardware to initialize).



        if (rc)
                return rc;

        if (*status & SPI_FSI_STATUS_ANY_ERROR) {
I think that we're checking against 0xffe0f000.

                dev_err(ctx->dev, "%s error: %016llx\n", dir, *status);

                rc = fsi_spi_reset(ctx);
                if (rc)
                        return rc;
Is the problem here? fsi_spi_reset writes to the clock config
registers, but doesn't read the status.

Obviously doing the writes causes a call to fsi_spi_check_status, but
that checks the FSI2SPI bridge, not the SPI master.

...but it doesn't matter, because we're either going to return an
error from the reset, or return EREMOTEIO, so there's no masking of
the error.


Not sure I follow. I don't think we were hitting this path in this error scenario. Do you think we need to check the status after a reset? It should always be the same.



                return -EREMOTEIO;
        }

        return 0;
}

Signed-off-by: Eddie James <eajames@xxxxxxxxxxxxx>
---
  drivers/spi/spi-fsi.c | 10 ++++++++++
  1 file changed, 10 insertions(+)

diff --git a/drivers/spi/spi-fsi.c b/drivers/spi/spi-fsi.c
index b6c7467f0b59..d403a7a3021d 100644
--- a/drivers/spi/spi-fsi.c
+++ b/drivers/spi/spi-fsi.c
@@ -25,6 +25,7 @@

  #define SPI_FSI_BASE                   0x70000
  #define SPI_FSI_INIT_TIMEOUT_MS                1000
+#define SPI_FSI_STATUS_TIMEOUT_MS      100
Can you add a comment (or put something in the commit message) about
why you chose 100ms.


Hm, sure, but I chose it pretty arbitrarily. I'm not sure how to choose something like this.



  #define SPI_FSI_MAX_RX_SIZE            8
  #define SPI_FSI_MAX_TX_SIZE            40

@@ -299,6 +300,7 @@ static int fsi_spi_transfer_data(struct fsi_spi *ctx,
                                  struct spi_transfer *transfer)
  {
         int rc = 0;
+       unsigned long end;
         u64 status = 0ULL;

         if (transfer->tx_buf) {
@@ -315,10 +317,14 @@ static int fsi_spi_transfer_data(struct fsi_spi *ctx,
                         if (rc)
                                 return rc;

+                       end = jiffies + msecs_to_jiffies(SPI_FSI_STATUS_TIMEOUT_MS);
                         do {
                                 rc = fsi_spi_status(ctx, &status, "TX");
                                 if (rc)
                                         return rc;
+
+                               if (time_after(jiffies, end))
+                                       return -ETIMEDOUT;
                         } while (status & SPI_FSI_STATUS_TDR_FULL);

                         sent += nb;
@@ -329,10 +335,14 @@ static int fsi_spi_transfer_data(struct fsi_spi *ctx,
                 u8 *rx = transfer->rx_buf;

                 while (transfer->len > recv) {
+                       end = jiffies + msecs_to_jiffies(SPI_FSI_STATUS_TIMEOUT_MS);
                         do {
                                 rc = fsi_spi_status(ctx, &status, "RX");
                                 if (rc)
                                         return rc;
+
+                               if (time_after(jiffies, end))
+                                       return -ETIMEDOUT;
                         } while (!(status & SPI_FSI_STATUS_RDR_FULL));

                         rc = fsi_spi_read_reg(ctx, SPI_FSI_DATA_RX, &in);
--
2.27.0




[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux