Re: [PATCH] spi: atmel-quadspi: Add support for sama7g5 QSPI

Michael Walle <michael@xxxxxxxx> · Wed, 13 Apr 2022 14:50:32 +0200

Hi Tudor,

> The sama7g5 QSPI controller uses dedicated clocks for the
> QSPI Controller Interface and the QSPI Controller Core, and
> requires synchronization before accessing registers or bit
> fields.
> 
> QSPI_SR.SYNCBSY must be zero before accessing any of the bits:
> QSPI_CR.QSPIEN, QSPI_CR.QSPIDIS, QSPI_CR.SRFRSH, QSPI_CR.SWRST,
> QSPI_CR.UPDCFG, QSPI_CR.STTFR, QSPI_CR.RTOUT, QSPI_CR.LASTXFER.
> 
> Also, the QSPI controller core configuration can be updated by
> writing the QSPI_CR.UPDCFG bit to ‘1’. This is needed by the
> following registers: QSPI_MR, QSPI_SCR, QSPI_IAR, QSPI_WICR,
> QSPI_IFR, QSPI_RICR, QSPI_SMR, QSPI_SKR,QSPI_REFRESH, QSPI_WRACNT
> QSPI_PCALCFG.
> 
> The Octal SPI supports frequencies up to 200 MHZ DDR. The need
> for output impedance calibration arises. To avoid the degradation
> of the signal quality, a PAD calibration cell is used to adjust
> the output impedance to the driven I/Os.
> 
> The transmission flow requires different sequences for setting
> the configuration and for the actual transfer, than what is in
> the sama5d2 and sam9x60 versions of the IP. Different interrupts
> are handled. aq->ops->set_cfg() and aq->ops->transfer() are
> introduced to help differentiating the flows.
> 
> Tested single and octal SPI mode with mx66lm1g45g.

I've successfully tested this on a LAN9668 with a SST25VF016B
and 104 MHz (quad mode). But there are some catches:

(1) SPI mode is not set, i.e. SCR.CPHA, SCR.CPOL

(2) There is no (or a really short delay) between asserting
    the chip select and the first clock edge. I.e. SCR.DLYBS
    is zero. I wasn't able to go faster than ~20MHz with that.
    Also the slightest capacitance, like a probe tip, made things
    worse. I've been successful with a value of 2 at 104MHz,
    although attaching an oscilloscope probe (<4 pF input
    capacitance, no cheapo probe) made things unreliable again.
    In the end a value of 4 worked perfectly. I think it is
    overkill to make this configurable, the added delay should
    be negligible.

(3) As already discussed on IRC, the driver will iomap the
    whole memory window which is 256MB for one controller
    in my case. On arm32 the vmalloc area is only 240MB by
    default. The lan9668 has three of these controllers
	(whereas one only has an 8MB window). Therefore, we would
    potentially waste 520MB just for the SPI windows.

    I had a look at the driver, although IAR is set, it is
    not used for the accesses through the memory window. doh ;)
    It seems we need to map the memory just for the memcpy_io.
    The DMA engine should be happy with the physical addresses
    and shouldn't need the iomap. What do you think about just
    mapping like 16MB and after that always fall back to DMA
    regardless of the transfer size.

    In fact I don't know why that memory window is needed at all.
    Shouldn't the DMA engine be able to just read from RDR and
    write to TDR? And PIO mode could do the same.

(4) Odd transfer lengths doesn't work. That is I get different
    results for the folllwing:
    (a) dd if=/dev/mtd0 bs=3 | hexdump -C | head
    (a) dd if=/dev/mtd0 bs=4 | hexdump -C | head

    Actually, I've notived that using the (busybox) hexdump
    directly on /dev/mtd0 returned some really odd bytes. Might
    or might not be related to that above.

-michael