Re: MCP25xxFD Driver Error (-47)

On 9/17/20 2:23 AM, Kirby Nankivell wrote:
>     What about the spi0 node?

> 
> 		spi0: spi@1c68000 {
> 			#address-cells = <2>;
> 			#size-cells = <0>;
> 			compatible = "allwinner,sun8i-h3-spi";
> 			reg = <0x01c68000 0x1000>;
> 			interrupts = <GIC_SPI 65 IRQ_TYPE_LEVEL_HIGH>;
> 			clocks = <&ccu CLK_BUS_SPI0>, <&ccu CLK_SPI0>;
> 			clock-names = "ahb", "mod";
> 			dmas = <&dma 23>, <&dma 23>;
> 			dma-names = "rx", "tx";
> 			//num-cs = 2
> 			pinctrl-names = "default", "default";
> 			pinctrl-0 = <&spi0_pins>;
> 			pinctrl-1 = <&spi0_cs1>;

Please re-read my previous emails. This doesn't look correct; change it to:

pinctrl-names = "default";
pinctrl-0 = <&spi0_pins &spi0_cs1>;

> 			cs-gpios = <0>, <&pio 4 21 0>; /* PE21 */
> 			resets = <&ccu RST_BUS_SPI0>;
> 			status = "disabled";
> 
> 		};

> I'm not sure where the Clock registers are generated - the CCU header doesn't
> make sense to me?

I don't understand what you mean by this.

>     > The clock frequency was chosen on your prior advice: being less than 50% of the
>     > controller clock speed (10Mhz), and a factor of 600/2x as limited by the
>     > Allwinner SPI peripheral, in this case; 600 / (2*34).
>
>     In my H3 DT, I configure the SPI core to 600 MHz, not sure if the V3s
>     supports that.
> 
>     The driver will use the DT max frequency as an upper bound. If you use a recent
>     kernel (v5.7 or newer) or have included my patch:
> 
>         spi: spi-sun6i: sun6i_spi_transfer_one():
>              fix setting of clock rate
> 
>     the spi host driver will pick up a proper clock rate.
> 
> Understood. I have been using the driver from your branch, but you say this is
> fixed in 5.7?

The SPI host driver has a bug prior to v5.7, where it selects a higher clock
rate than requested.

>     >     [ 1.255123] CAN device driver interface
>     >     [ 1.259309] spi_master spi0: will run message pump with realtime priority
>     >     [ 1.304566] mcp25xxfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN
>     >     +CRC_REG +CRC_RX +CRC_TX +ECC -HD m:8.82MHz r:8.82MHz e:8.33MHz)
>     >     successfully initialized.
> 
>     "m:8.82MHz r:8.82MHz e:8.33MHz"
> 
>     m = maximum as defined by DT
>     r = requested by driver
>     e = effective speed used by the host driver
> 
> Understood, so my clock divider looks wrong, do you know how I can determine my
> peripheral bus speed?

Why do you think your clock divider looks wrong? The speed of the peripheral is
shown at "e", so it's 8.33 MHz in your case. It's probably 600 MHz / 72.

However, if you want to be sure, use a scope :)
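
If you want to pull the three values out of the probe message automatically,
here is a minimal Python sketch. The names SPEED_RE and parse_spi_speeds are
made up for illustration, and the 600 MHz parent clock is only an assumption
taken from my H3 setup, so the implied divider is a guess until the clock tree
is confirmed:

```python
import re

# Matches the "m:... r:... e:..." triple printed by the mcp25xxfd driver at probe time.
SPEED_RE = re.compile(r"m:(?P<m>[\d.]+)MHz r:(?P<r>[\d.]+)MHz e:(?P<e>[\d.]+)MHz")


def parse_spi_speeds(line):
    """Return max/requested/effective SPI speeds in MHz, or None if absent."""
    match = SPEED_RE.search(line)
    if match is None:
        return None
    return {
        "max": float(match["m"]),        # upper bound from the DT
        "requested": float(match["r"]),  # what the mcp25xxfd driver asked for
        "effective": float(match["e"]),  # what the SPI host driver reports
    }


line = ("mcp25xxfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN "
        "+CRC_REG +CRC_RX +CRC_TX +ECC -HD m:8.82MHz r:8.82MHz e:8.33MHz) "
        "successfully initialized.")
speeds = parse_spi_speeds(line)

# Assumption: 600 MHz SPI parent clock, as on the H3.
implied_divider = round(600.0 / speeds["effective"])
print(speeds, implied_divider)
```

A host driver that does not fill in the effective speed prints e:0.00MHz; skip
the divider estimate in that case.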

>     > However that was short lived; I couldn't receive a single message (candump 500k)
>     > without getting a crc error:
>     >
>     >     [ 48.469759] mcp25xxfd spi0.1 can0: CRC read error at address 0x001c (length=4, data=00 00 1a 3f, CRC=0x1e7c).
>     >     [ 48.479730] mcp25xxfd spi0.1 can0: IRQ handler returned -74 (intf=0x3f1a0002).
> 
>     Can you compile the mcp25xxfd-regmap.c with adding a:
> 
>         #define DEBUG
> 
>     prior to any of the #include statements. That should give some more debugging
>     output with enabled CRC mode. Please post that here.

Can you please recompile the driver with "#define DEBUG" added to
mcp25xxfd-regmap.c, as requested above?

>     Can you remove the display from the SPI and test again.
>     Can you try to limit the spi speed via DT even more, e.g. to 5MHz?
> 
> I made a test image, max speed was configured to 5.0MHz, I tried a variety of
> SPI drivers - But I could not get the controller to initialise at this speed - I
> kept getting CRC errors in the boot dmesg.

With or without the display?

> I reverted and tried two different clock speeds:
> 
> Standard hardware test - 8.823529 MHz, your SPI driver, LCD connected:

> # ip link set can0 up type can bitrate 250000 restart-ms 100 fd off
> # candump can0
>   can0  07B   [4]  04 04 00 00
>   can0  07B   [4]  04 04 00 00
>   can0  07B   [4]  04 04 00 00
>   can0  07B   [4]  04 04 00 00
>   can0  07B   [4]  04 04 00 00
>   can0  07B   [4]  04 04 00 00
>   can0  07B   [4]  04 04 00 00
>   can0  2AD   [3]  00 00 00
>   can0  542   [8]  01 00 00 00 00 00 00 00
>   can0  475   [3]  02 00 00
> 
> [   74.528042] mcp25xxfd spi0.1 can0: CRC read error at address 0x05a8 (length=20, data=f0 02 00 00 01 00 00 00 a7 e8 9e 38 03 07 fe 00 00 00 00 00, CRC=0xdac2).
> [   74.542264] mcp25xxfd spi0.1 can0: IRQ handler mcp25xxfd_handle_rxif() returned -74.
> [   74.550001] mcp25xxfd spi0.1 can0: IRQ handler returned -74 (intf=0x3f1a0002).

So CRC error with display attached.

> _I decided then to keep the LCD driver builtin but physically disconnect the FPC
> for the next tests._

Can you try with the display attached, but with the display driver not loaded?

> _20 mhz (using 5.7 standard spi)_

As your mcp25xxfd is clocked at 20 MHz, the SPI speed is limited to 10 MHz
(per datasheet). Tests have shown that the chip is not stable at that limit, so
the driver reduces the speed further: to 92.5% of the limit in older versions
of my driver, or even to 85% in the newest iteration.

> [    1.189727] spi_master spi0: will run message pump with realtime priority
> [    1.234920] mcp25xxfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD m:20.00MHz r:9.25MHz e:0.00MHz) successfully initialized.

Here you see the mcp25xxfd driver requests ("r:") 9.25 MHz, that's 92.5% of 10 MHz.
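
The derating described above works out as follows. This is only a sketch of the
arithmetic, not code from the driver; max_spi_hz is a made-up helper, and the
factor is passed in per mille so that 92.5% stays an integer:

```python
def max_spi_hz(osc_hz, derate_permille):
    """Upper bound for the SPI clock, given the mcp25xxfd oscillator frequency."""
    # Limit cited above (per datasheet): SPI clock at most half the oscillator.
    datasheet_limit_hz = osc_hz // 2
    # Safety margin applied on top: 925 = 92.5% (older driver), 850 = 85% (newest).
    return datasheet_limit_hz * derate_permille // 1000


print(max_spi_hz(20_000_000, 925))  # older driver versions, 20 MHz oscillator
print(max_spi_hz(20_000_000, 850))  # newest iteration, 20 MHz oscillator
```

The first call gives the 9.25 MHz seen as "r:" in the log above.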

So without the display it works?

> then at 500K load test:
> 
> /[   17.568035] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./
> /[   17.577187] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./
> /[   17.588286] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./

This can happen when the load is too high. To fix it, we have to optimize the
V3s's SPI controller driver and/or the mcp25xxfd driver.

> _20 mhz, using your spi driver branch_
> 
> /[    1.259449] spi_master spi0: will run message pump with realtime priority/
> /[    1.304714] mcp25xxfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN
> +CRC_REG +CRC_RX +CRC_TX +ECC -HD m:20.00MHz r:9.25MHz e:8.33MHz) successfully
> initialized./

See above, just the mcp25xxfd driver requests ("r:") 9.25MHz, which results in
effective ("e:") 9.09 MHz (600 MHz / 66).

> then load testing at 500k:
> 
> /[   71.272776] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./
> /[   71.283761] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./
> /[   71.295123] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./


> _8.823529 MHz – using your spi driver_
> 
> /[    1.259338] spi_master spi0: will run message pump with realtime priority/
> 
> /[    1.304622] mcp25xxfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN
> +CRC_REG +CRC_RX +CRC_TX +ECC -HD m:8.82MHz r:8.82MHz e:8.33MHz) successfully
> initialized./

... which results ("e:") in 8.33MHz

> *NO RX issues*

Looks good.

> _8.823529 MHz – using 5.7 mainline spi driver_
> 
> /[    1.259402] spi_master spi0: will run message pump with realtime priority/
> /[    1.304605] mcp25xxfd spi0.1 can0: MCP2518FD rev0.0 (-RX_INT -MAB_NO_WARN
> +CRC_REG +CRC_RX +CRC_TX +ECC -HD m:8.82MHz r:8.82MHz e:0.00MHz) successfully
> initialized./

> then load testing at 500k:
> 
> /[   33.235868] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./
> /[   33.246714] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./
> /[   33.257608] mcp25xxfd spi0.1 can0: RX-0: FIFO overflow./
> 
> The results were clear, there is some SPI issue that prevents me
> specifying above 8.8Mhz, even if the effective speed is detected to be the same.

Probably integer arithmetic :)

600 MHz / 8.823529 MHz

600000000 / 8823529 = 68.0000031733 -> rounded up: 69

The SPI host driver only supports even dividers.

-> 70

600 MHz / 70 = 8.57 MHz

but your SPI host controller selects 8.33 MHz...

Maybe your V3s has a different clock tree than the H3.
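
The divider arithmetic above can be sketched in a few lines of Python. This is
only my reasoning written down, not the actual algorithm of the spi-sun6i
driver; even_divider_rate is a made-up name and the 600 MHz parent clock is an
assumption carried over from the H3:

```python
def even_divider_rate(parent_hz, requested_hz):
    """Smallest even divider that does not exceed the requested rate."""
    # Ceiling division: never pick a divider that would overshoot the request.
    div = -(-parent_hz // requested_hz)
    # Assumption from above: only even dividers are available.
    if div % 2:
        div += 1
    return div, parent_hz / div


div, rate_hz = even_divider_rate(600_000_000, 8_823_529)
print(div, round(rate_hz / 1e6, 2))

# For comparison: the rate a divider of 72 would give with the same parent.
print(round(600_000_000 / 72 / 1e6, 2))
```

The second value is the 8.33 MHz the host driver reports, i.e. one even divider
step further than expected, which is why the real parent clock is of interest.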

Please post the output of:

> grep . /sys/kernel/debug/clk/spi0/clk_{parent,possible_parents,rate}
> grep . $(for i in $(cat /sys/kernel/debug/clk/spi0/clk_possible_parents); do echo /sys/kernel/debug/clk/$i/clk_rate;done)
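
If the shell on your image can't cope with the brace expansion or the command
substitution, the same information can be collected with a small Python script.
This is just a sketch: read_clk_tree is a made-up helper, it assumes debugfs is
mounted at /sys/kernel/debug, and it only reads the same files as the commands
above:

```python
from pathlib import Path


def read_clk_tree(clk_name, root="/sys/kernel/debug/clk"):
    """Collect parent, rate and possible parent rates for one clock."""
    base = Path(root)
    clk = base / clk_name
    info = {
        "parent": (clk / "clk_parent").read_text().strip(),
        "rate": int((clk / "clk_rate").read_text()),
        "possible_parents": {},
    }
    # clk_possible_parents holds a whitespace-separated list of clock names.
    for parent in (clk / "clk_possible_parents").read_text().split():
        rate_file = base / parent / "clk_rate"
        if rate_file.exists():
            info["possible_parents"][parent] = int(rate_file.read_text())
    return info
```

Run it as root and print the result of read_clk_tree("spi0").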

> Likewise lower speeds (5MHz would not enable).
> Your allwinner spi branch at this speed is stable.

> Additionally it was clear then that the LCD was affecting the SPI traffic from
> the controller and likely the culprit of the CRC errors, I suspected that the
> MISO pin on the LCD was not being properly tri-state when the CS for the display
> was not being asserted.
> And this was causing some noise, bit flipping or similar on the bus. It was also
> likely in this case that I would probably see packet errors during operation if
> CRC was ignored.

So better keep CRC enabled!

> Luckily the LCD supports open ended mode and does not need MISO (DO) connected
> to function, also luckily is that I put zero ohm resistors on the board before
> the 0.35mm FPC connector. I removed the resistor to the LCD MISO pin and
> reconfigured for open-ended mode.

\o/

> I can now successfully load test at high rates without packet drop or failure
> and the LCD works too!

\o/

> This is a pretty good outcome, thanks for your help! The SPI driver is a bit
> odd, I might try with a 40MHz XTAL and see what changes.

With a 40 MHz crystal you can increase the SPI clock to 40 MHz / 2 * 85% = 17 MHz,
which should result in an effective 16.67 MHz (600 MHz / 36, given a 600 MHz SPI
parent clock).

> Please let me know if I can run any more tests to help!

See above, let's have a look at your clock tree.

Marc
-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |
