Re: [RFC] A new SPI API for fast, low-latency regmap peripheral access

Mark Brown <broonie@xxxxxxxxxx> · Tue, 17 May 2022 12:57:18 +0100

On Tue, May 17, 2022 at 12:24:39PM +0200, David Jander wrote:

> (mainly in spi.c for now). Time interrupt line stays low:

>  1. Kernel 5.18-rc1 with only polling patches from spi-next: 135us

>  2. #if 0 around all stats and accounting calls: 100us

>  3. The _fast API of my original RFC: 55us

> This shows that the accounting code is a bit less than half of the dispensable
> overhead for my use-case. Indeed an easy target.

Good.

> on, so I wonder whether there is something to gain if one could just call
> spi_bus_lock() at the start of several such small sync transfers and use
> non-locking calls (skip the queue lock and io_mutex)? Not sure that would have
> a meaningful impact, but to get an idea, I replaced the bus_lock_spinlock and
> queue_lock in __spi_sync() and __spi_queued_transfer() with the bare code in
> __spi_queued_transfer(), since it won't submit work to the queue in this case
> anyway. The resulting interrupt-active time decreased by another 4us, which is
> approximately 5% of the dispensable overhead. For the record, that's 2us per
> spinlock lock/unlock pair.

I do worry about how this might perform under different loads where
there are things coming in from more than one thread.

> > One thing that might be useful would be if we could start the initial
> > status read message from within the hard interrupt handler of the client
> > driver with the goal that by the time it's threaded interrupt handler
> > runs we might have the data available.  That could go wrong on a lightly
> > loaded system where we might end up running the threaded handler while
> > the transfer is still running, OTOH if it's lightly loaded that might
> > not matter.  Or perhaps just use a completion from the SPI operation and
> > not bother with the threaded handler at all.

> You mean ("ctx" == context switch):

>  1. hard-IRQ, queue msg --ctx--> SPI worker, call msg->complete() which does
>  thread IRQ work (but can only do additional sync xfers from this context).

> vs.

>  2. hard-IRQ, queue msg --ctx--> SPI worker, call completion --ctx--> IRQ
>  thread wait for completion and does more xfers...

> vs (and this was my idea).

>  3. hard-IRQ, pump FIFO (if available) --ctx--> IRQ thread, poll FIFO, do more
>  sync xfers...

Roughly 1, but with a lot of overlap with option 3.  I'm unclear what
you mean by "queue message" here.

> Option 3 would require a separation of spi_sync_transfer into two halves. One
> half just activates CS (non-sleep GPIO api!) and fills the FIFO. The second
> half polls the FIFO for transfer completion. This path could only be chosen if
> the SPI controller has a FIFO that can hold the whole message. In other words a
> lot of spacial case handling for what it is worth probably... but still
> interesting.

Yes, that's the whole point.  This also flows nicely when you've got a
queue since you can restart the hardware from the interrupt context
without waiting to complete the transfer that just finished.

> Option 2 is probably not that bad if the SPI worker can run on another core?

Pretty much anything benefits with another core.
Attachment:
signature.asc

Description: PGP signature