RE: PATCH Prevent spi-dw write transaction split

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Mark,
>+ Please wrap lines at less than 80 columns.
Attached is new report & patch, where previously the comment in spi-dw.c exceeded 80 chars per line. Now fixed.

>+ That's going to be a pretty long interrupts disabled period added unconditionally which is a bit rude for other users.
>+ As Trent said the normal thing with hardware with half baked chip select control like this is to put the chip select into GPIO mode (or use a separate GPIO if there's not enough pinmux flexibility).
>+ If we do decide to go down this road we should ensure that those users don't have to pay the cost of the interrupt disabled period and only disable the interrupts when using the native chip select
Initially we wanted to use a specific GPI but we have 4 SPI devices and I was told (by our DSP person that it was a dead end), The disable-period is for the "memcpy of bytes or shorts" to spi-fifo
Which is limited by DesignWare-SPI FIFO size (256 bytes)
.
 >+ You'll probably also run into trouble with RT usage as you can't actually disable interrupts there
The problem has initially detected in 4.9.76-rt (with PREEMPT_RT) and the patch solve the problem there.
Also seen in 4.9.78 (without -rt), and work solves the issue also there!

Gil Beniamini
Gilat Satellite Networks - R&D
gilb@xxxxxxxxx
Phone: (972)-3-9252427 Fax:(972)-3-9293240

-----Original Message-----
From: Mark Brown [mailto:broonie@xxxxxxxxxx]
Sent: Monday, November 19, 2018 2:57 PM
To: Gil Beniamini <GilB@xxxxxxxxx>
Cc: thor.thayer@xxxxxxxxxxxxxxx; linux-spi@xxxxxxxxxxxxxxx; Shay Dahan <ShayD@xxxxxxxxx>
Subject: Re: PATCH Prevent spi-dw write transaction split

On Sun, Nov 18, 2018 at 08:45:06AM +0000, Gil Beniamini wrote:

> Sorry I can only use Microsoft Office, and so I created an attachment,
> hopefully in acceptible format (I am not sure about who toname as
> "signed-off"), See attached files

Your signoff should be from yourself so what you've got is fine and now the tab/spaces thing is sorted which is good but the format is still not a normal kernel patch format - you basically want what git format-patch would generate if your commit log looks like other commit logs in git.

> +/* The following while() which copies tx-data into tx-fifo, must not be interrupted,
> +** interruption might lead to chip tx-fifo become empty before end of the transaction data,
> +** and cause chip-select release at mid transaction, which means SPI' master-slave protocol violation!
> +*/

Please wrap lines at less than 80 columns.

> +local_irq_save(flags);
>  while (max--) {
>  /* Set the tx word if the transfer's original "tx" is not null */
>  if (dws->tx_end - dws->len) {
> @@ -194,6 +200,7 @@ static void dw_writer(struct dw_spi *dws
>  dw_write_io_reg(dws, DW_SPI_DR, txw);
>  dws->tx += dws->n_bytes;
>  }
> +local_irq_restore(flags);

That's going to be a pretty long interrupts disabled period added unconditionally which is a bit rude for other users.  As Trent said the normal thing with hardware with half baked chip select control like this is to put the chip select into GPIO mode (or use a separate GPIO if there's not enough pinmux flexibility).  If we do decide to go down this road we should ensure that those users don't have to pay the cost of the interrupt disabled period and only disable the interrupts when using the native chip select.

You'll probably also run into trouble with RT usage as you can't actually disable interrupts there.
IMPORTANT - This email and any attachments is intended for the above named addressee(s), and may contain information which is confidential or privileged. If you are not the intended recipient, please inform the sender immediately and delete this email: you should not copy or use this e-mail for any purpose nor disclose its contents to any person.

Attachment: patch
Description: patch

Title:  Prevent spi-dw write transaction split
Setup:  Linux 4.9.76-rt  and 4.9.78-ltsi, using linux-socfpga(Cyclone-V)
Reported-by:   Gil Beniamini <GilB@xxxxxxxxx>
Signed-off-by: Gil Beniamini <GilB@xxxxxxxxx>

Without this patch SPI works fine for us ONLY in 99.999% of the time,
but not 100% (unless using this submitted patch)

At 1st write to dw_spi TX-FIFO starts xfer with "Active" ChipSelect,
But if during xfer TX-FIFO become empty => ChipSelect is "released" in mid
transfer => transaction split.

NOTE: Transaction split is not allowed!
For example if slave is HI-3220 TX-MSG expects format:
<op-code/register#><value#1>...<value#n>,
so that underflow after "i" elements, the write of next <value#i>
is understood as new 1st write <op-code/register#> instead of <data>,
and without our patch, we quickly saw Register#0 got corrupted!
Solution: The following copy to TX-FIFO must be ATOMIC==uninterrupted
==> disable Interrupts during this copy to TX-FIFO (bounded by FIFO size).

As for our real-world test-case scenario, our Linux machine receives UDP packets
at the rate of 100Hz, and creates SPI messages at the rate of 50Hz,
the SPI messages size varies from 5 bytes to 80 bytes.
When we use spi speed_hz is 10Mhz, once in 10 minutes SPI transaction intercept
(most likely by LAN processing) in mid of dw_writer() so that transaction split
happens.
At 1st as work around we reduced SPI speed_hz to 100khz and the problem rate
has gone down to once in few days.
After applying my suggested PATCH, it was tested in all speeds up to the
top speed of our HI-3220 SPI which is 40Mhz and it works fine for weeks
without any problems.
Moreover, we identified the same problem when this SPI driver is used to access
Flash, and it was solved by this same PATCH!

The-Patch:
--- linux-socfpga/drivers/spi/spi-dw.c.orig	2018-11-11 09:43:14.135466702 +0200
+++ linux-socfpga/drivers/spi/spi-dw.c	2018-11-19 15:12:02.351149000 +0200
@@ -182,7 +182,14 @@ static void dw_writer(struct dw_spi *dws
 {
 	u32 max = tx_max(dws);
 	u16 txw = 0;
+	unsigned long flags;
 
+	/* The following while() which copies tx-data into tx-fifo,
+	 * must not be interrupted, interruption might lead to chip tx-fifo become
+	 * empty before end of the transaction data, and cause chip-select release
+	 * at mid transaction, which means SPI' master-slave protocol violation!
+	 */
+	local_irq_save(flags);
 	while (max--) {
 		/* Set the tx word if the transfer's original "tx" is not null */
 		if (dws->tx_end - dws->len) {
@@ -194,6 +201,7 @@ static void dw_writer(struct dw_spi *dws
 		dw_write_io_reg(dws, DW_SPI_DR, txw);
 		dws->tx += dws->n_bytes;
 	}
+	local_irq_restore(flags);
 }
 
 static void dw_reader(struct dw_spi *dws)

[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux