On Tue, May 17, 2022 at 12:24:39PM +0200, David Jander wrote:

> (mainly in spi.c for now). Time the interrupt line stays low:

> 1. Kernel 5.18-rc1 with only polling patches from spi-next: 135us
> 2. #if 0 around all stats and accounting calls: 100us
> 3. The _fast API of my original RFC: 55us

> This shows that the accounting code is a bit less than half of the dispensable
> overhead for my use-case. Indeed an easy target.

Good.

> on, so I wonder whether there is something to gain if one could just call
> spi_bus_lock() at the start of several such small sync transfers and use
> non-locking calls (skip the queue lock and io_mutex)? Not sure that would have
> a meaningful impact, but to get an idea, I replaced the bus_lock_spinlock and
> queue_lock in __spi_sync() and __spi_queued_transfer() with the bare code in
> __spi_queued_transfer(), since it won't submit work to the queue in this case
> anyway. The resulting interrupt-active time decreased by another 4us, which is
> approximately 5% of the dispensable overhead. For the record, that's 2us per
> spinlock lock/unlock pair.

I do worry about how this might perform under different loads where
there are things coming in from more than one thread.

> > One thing that might be useful would be if we could start the initial
> > status read message from within the hard interrupt handler of the client
> > driver with the goal that by the time its threaded interrupt handler
> > runs we might have the data available.  That could go wrong on a lightly
> > loaded system where we might end up running the threaded handler while
> > the transfer is still running, OTOH if it's lightly loaded that might
> > not matter.  Or perhaps just use a completion from the SPI operation and
> > not bother with the threaded handler at all.

> You mean ("ctx" == context switch):

> 1. hard-IRQ, queue msg --ctx--> SPI worker, call msg->complete() which does
> the threaded IRQ work (but can only do additional sync xfers from this context).

> vs.

> 2. hard-IRQ, queue msg --ctx--> SPI worker, call completion --ctx--> IRQ
> thread waits for completion and does more xfers...

> vs. (and this was my idea):

> 3. hard-IRQ, pump FIFO (if available) --ctx--> IRQ thread, poll FIFO, do more
> sync xfers...

Roughly 1, but with a lot of overlap with option 3.  I'm unclear what
you mean by "queue message" here.

> Option 3 would require a separation of spi_sync_transfer into two halves. One
> half just activates CS (non-sleep GPIO API!) and fills the FIFO. The second
> half polls the FIFO for transfer completion. This path could only be chosen if
> the SPI controller has a FIFO that can hold the whole message. In other words,
> a lot of special-case handling for what it is probably worth... but still
> interesting.

Yes, that's the whole point.  This also flows nicely when you've got a
queue, since you can restart the hardware from the interrupt context
without waiting to complete the transfer that just finished.

> Option 2 is probably not that bad if the SPI worker can run on another core?

Pretty much anything benefits from another core.
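
For illustration, here is a minimal sketch of the "take the bus lock once
for a burst of small sync transfers" idea quoted above, using the existing
spi_bus_lock()/spi_sync_locked()/spi_bus_unlock() API.  The device, register
layout and buffers are invented for the example, and note that this only
avoids retaking the bus lock on the caller side; it does not by itself skip
the internal queue_lock/io_mutex that the measurement above was about:

#include <linux/spi/spi.h>

/* Hypothetical helper: read two small registers back to back while
 * holding the bus exclusively for the whole burst. */
static int example_read_two_regs(struct spi_device *spi,
				 u8 *buf_a, u8 *buf_b, size_t len)
{
	struct spi_transfer xfer_a = { .rx_buf = buf_a, .len = len };
	struct spi_transfer xfer_b = { .rx_buf = buf_b, .len = len };
	struct spi_message msg;
	int ret;

	/* Take the bus once for all of the small transfers. */
	spi_bus_lock(spi->controller);

	spi_message_init_with_transfers(&msg, &xfer_a, 1);
	ret = spi_sync_locked(spi, &msg);
	if (ret)
		goto out;

	spi_message_init_with_transfers(&msg, &xfer_b, 1);
	ret = spi_sync_locked(spi, &msg);
out:
	spi_bus_unlock(spi->controller);
	return ret;
}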
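
And a rough sketch of what option 2 could look like from the client driver
side: the hard IRQ handler kicks off the status read with spi_async() (which
may be called from atomic context) and the threaded handler waits on a
completion before doing any follow-up sync transfers.  The chip structure
and handler names are made up, and the chip struct is assumed to be
kzalloc()ed at probe time:

#include <linux/completion.h>
#include <linux/interrupt.h>
#include <linux/spi/spi.h>

struct example_chip {			/* invented for illustration */
	struct spi_device *spi;
	struct spi_message status_msg;
	struct spi_transfer status_xfer;
	u8 status_buf[4];
	struct completion status_done;
};

static void example_status_complete(void *context)
{
	struct example_chip *chip = context;

	complete(&chip->status_done);
}

static void example_chip_init_msg(struct example_chip *chip)
{
	/* Called once at probe time. */
	init_completion(&chip->status_done);
	chip->status_xfer.rx_buf = chip->status_buf;
	chip->status_xfer.len = sizeof(chip->status_buf);
	spi_message_init_with_transfers(&chip->status_msg,
					&chip->status_xfer, 1);
	chip->status_msg.complete = example_status_complete;
	chip->status_msg.context = chip;
}

static irqreturn_t example_hard_irq(int irq, void *dev_id)
{
	struct example_chip *chip = dev_id;

	/* Queue the status read right away; spi_async() may be used here. */
	reinit_completion(&chip->status_done);
	if (spi_async(chip->spi, &chip->status_msg))
		return IRQ_NONE;

	return IRQ_WAKE_THREAD;
}

static irqreturn_t example_thread_irq(int irq, void *dev_id)
{
	struct example_chip *chip = dev_id;

	/* Ideally the transfer already finished by the time we get here. */
	wait_for_completion(&chip->status_done);

	/* ... interpret chip->status_buf, do further spi_sync() calls ... */
	return IRQ_HANDLED;
}

Whether this actually saves anything depends on the transfer finishing
before the threaded handler gets scheduled, which is exactly the lightly
loaded case discussed above.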