Re: [PATCH v1 3/7] spi: qspi-tegra: Add support for Tegra210 QSPI controller

Mark Brown <broonie@xxxxxxxxxx> · Fri, 4 Dec 2020 18:52:23 +0000

On Thu, Dec 03, 2020 at 04:22:54PM -0800, Sowjanya Komatineni wrote:
> On 12/2/20 11:17 AM, Sowjanya Komatineni wrote:

> > > It seems weird that this device needs us to do a memcpy() to do DMA,
> > > most devices are able to DMA directly from the buffers provided by the
> > > SPI API (and let the SPI core sync things).  What is going on here?

> > For transfers of size more than max DMA transfer limit, data transfer
> > happens in multiple iterations with each iteration transferring up to
> > max DMA transfer limit.

> > So using separate dma buffers and on every iteration copying them to SPI
> > core provided tx/rx buffers.

I don't understand this - there's no restriction on where DMA transfers
can be done from within a DMA mapped region, the driver can do multiple
transfers from different chunks of the source buffer without having to
copy anything.  That's a very common scenario.

> Also unpack mode needs to manually put the bytes together from read data to
> SPI core rx buffer.

Could you be more explicit here, I don't know what "unpack mode" is?

> > > This is worrying, the client device might be confused if /CS is doing
> > > things outside of the standard handling.

> > Do you mean to honor spi_transfer cs_change flag?

At least, yes - more generally just if there's any feature to with chip
select then the driver will need to open code it.  The driver should at
least be double checking that what it's doing matches what it was told
to do, though just letting this be handled by the generic code if
there's no limitation on the hardware tends to be easier all round.

> > Tegra QSPI is master and is used only with QSPI flash devices. Looking
> > at SPI NOR driver, I see QSPI Flash commands are executed with one flash
> > command per spi_message and I dont see cs_change flag usage w.r.t QSPI
> > flash. So, using SW based CS control for QSPI.

> > Please correct me if I miss something to understand here.

Someone might build a system that does something different, they may see
a spare SPI controller and decide they can use it for something else or
there may be some future change with the flash code which does something
different.

> > > > +    tegra_qspi_writel(tqspi, tqspi->def_command2_reg, QSPI_COMMAND2);

> > > The setup for one device shouldn't be able to affect the operation of
> > > another, already running, device so either these need to be configured
> > > as part of the controller probe or these configurations need to be
> > > deferred until we're actually doing a transfer.

> > We will only have 1 device on QSPI as we only support single chip select.

It's quite common for people to do things like add additional devices
with GPIO chip selects.

> > > > +    /*
> > > > +     * Tegra QSPI hardware support dummy bytes transfer based on the
> > > > +     * programmed dummy clock cyles in QSPI register.
> > > > +     * So, get the total dummy bytes from the dummy bytes transfer in
> > > > +     * spi_messages and convert to dummy clock cyles.
> > > > +     */
> > > > +    list_for_each_entry(xfer, &msg->transfers, transfer_list) {
> > > > +        if (ntransfers == DUMMY_BYTES_XFER &&
> > > > +            !(list_is_last(&xfer->transfer_list, &msg->transfers)))
> > > > +            dummy_cycles = xfer->len * 8 / xfer->tx_nbits;
> > > > +        ntransfers++;
> > > > +    }

> > > This seems weird, there's some hard coded assumption about particular
> > > patterns that the client device is going to send.  What's going on here?
> > > I don't really understand what this is trying to do.

> > QSPI flash needs dummy cycles for data read operation which is actually
> > the initial read latency and no. of dummy cycles required are vendor
> > specific.

> > SPI NOR driver gets required dummy cycles based on mode clock cycles and
> > wait state clock cycles.

> > During read operations, spi_nor_spimem_read_data() converts dummy cycles
> > to number of dummy bytes.

> > Tegra QSPI controller supports dummy clock cycles register and when
> > programmed QSPI controller sends dummy bytes rather than SW handling
> > extra cycles for transferring dummy bytes.

> > Above equation converts this dummy bytes back to dummy clock cycles to
> > program into QSPI register and avoid manual SW transfer of dummy bytes.

This is not a good idea, attempting to reverse engineer the message and
guess at the contents isn't going to be robust and if it's useful it
will if nothing else lead to a bunch of duplicated code in drivers as
every device that has this feature will need to reimplment it.  Instead
we should extend the framework so there's explicit support for
specifying transfers that are padding bytes, then there's no guesswork
that can go wrong and no duplicated code between drivers.  A flag in the
transfer struct might work?
Attachment:
signature.asc

Description: PGP signature