On Tue, 17 May 2022 14:43:59 +0100
Mark Brown <broonie@xxxxxxxxxx> wrote:

> On Tue, May 17, 2022 at 03:09:06PM +0200, David Jander wrote:
> > Mark Brown <broonie@xxxxxxxxxx> wrote:
> > > On Tue, May 17, 2022 at 12:24:39PM +0200, David Jander wrote:
>
> > > > on, so I wonder whether there is something to gain if one could just call
> > > > spi_bus_lock() at the start of several such small sync transfers and use
> > > > non-locking calls (skip the queue lock and io_mutex)? Not sure that would have
>
> > > I do worry about how this might perform under different loads where
> > > there are things coming in from more than one thread.
>
> > I understand your concern. The biggest issue is probably the fact that client
> > drivers can claim exclusive access to the bus this way, and who knows if they
> > take care not to claim it for too long?
>
> Yes, claiming the bus for a long time.
>
> > In the end, if multiple latency-sensitive client drivers share the same SPI
> > bus, besides this being a questionable HW design, this is asking for
> > trouble. But I agree that usage of spi_bus_lock() by client drivers is
> > probably not a good idea.
>
> I think the worst case would be mixing latency sensitive and throughput
> sensitive devices, or possibly tasks on a device. As you say there's an
> element of system design here.

Sure. I have combined NOR-flash chips on the same SPI bus as an MCP2515 CAN
controller in the past. But I knew that the NOR-flash chip would only ever be
accessed during a firmware update or from the bootloader, never concurrently
with CAN communication. Otherwise, lost CAN messages would have been
guaranteed. So compromises like this are possible. As the HW designer in that
case, I would never expect the kernel to try to make this work concurrently,
and IMHO we (as kernel developers) shouldn't try to cover such extreme cases
either.

> > > > 1. hard-IRQ, queue msg --ctx--> SPI worker, call msg->complete() which does
> > > > thread IRQ work (but can only do additional sync xfers from this context).
>
> > > > vs.
>
> > > > 2. hard-IRQ, queue msg --ctx--> SPI worker, call completion --ctx--> IRQ
> > > > thread wait for completion and does more xfers...
>
> > > > vs (and this was my idea).
>
> > > > 3. hard-IRQ, pump FIFO (if available) --ctx--> IRQ thread, poll FIFO, do more
> > > > sync xfers...
>
> > > Roughly 1, but with a lot of overlap with option 3. I'm unclear what
> > > you mean by "queue message" here.
>
> > In the above, I meant:
>
> > "queue message":
> >  list_add_tail(&msg->queue, &ctlr->queue);
> >  kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
>
> OK, no - I'm proposing actually putting the message onto the hardware
> from interrupt context.

Nice! I like that idea. Do you want to somehow extend spi_async() to do this
transparently? Then we would only need to introduce a second function
("spi_async_await()"?) which waits for completion and collects the RX buffer.

> > > Yes, that's the whole point. This also flows nicely when you've got a
> > > queue since you can restart the hardware from the interrupt context
> > > without waiting to complete the transfer that just finished.
>
> > Ack. The only caveat I see is the requirement for CS to be settable in a
> > non-sleepable context. Not all GPIO pins support gpiod_set_value(); some
> > might only have gpiod_set_value_cansleep(). Although the latter is not a
> > good choice for a CS signal, so I doubt it can be found in the wild.
>
> Example: I2C GPIO expanders.
>
> > In such a (very unlikely) case, the spi subsystem
> > would need to fall back to the worker-queue, so probably not a big deal.
>
> Yes, there's a bunch of things where we have to fall back on a queue.
> There's fully bitbanged controllers, excessively large messages or delay
> requirements, clock reprogramming and other things.

I like it. Thanks for the discussion so far. To sum up the patches you would
accept, if I understood correctly:

 1. Make the stats/accounting code a NOP via a sysfs (or similar) toggle.

 2. Enable the re-use of messages, with a once-in-a-lifetime
    prepare/map/validate.

 3. Introduce spi_async_await() (or similar) to wait for the completion of an
    async message (rough usage sketch below).

 4. Enable SPI drivers to tell the core (spi.c) under which conditions it can
    fire a message asynchronously without the need for the worker queue, and
    implement support for those cases. The conditions involve max transfer
    size, non-sleeping CS access, etc., but it should probably be up to the
    SPI driver to decide, I guess (ctlr->can_send_uninterruptible(msg)); see
    the second sketch below.

Am I missing something?

One minor concern about 4. above: hopefully the decision can be made very
quickly (i.e. without trying and failing). Maybe the result of this decision
can be cached in the struct spi_message, so it can be re-used (see point 2.)?
Maybe as part of prepare or validate?
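
To make 3. a bit more concrete, here is roughly how I imagine a client driver
(think MCP2518FD) using it from its interrupt handlers. This is only a sketch:
spi_async_await() does not exist yet, and struct mcp_priv and the handler
names are made up for illustration.

#include <linux/interrupt.h>
#include <linux/spi/spi.h>

struct mcp_priv {
	struct spi_device *spi;
	/* prepared/mapped/validated once and re-used (point 2.): */
	struct spi_message irq_status_msg;
	/* ... */
};

static irqreturn_t mcp_hard_irq(int irq, void *dev_id)
{
	struct mcp_priv *priv = dev_id;

	/*
	 * Fire the pre-prepared message directly from hard-IRQ context.
	 * If the controller said it can (point 4.), the core puts the
	 * message onto the hardware right here instead of queueing it.
	 */
	spi_async(priv->spi, &priv->irq_status_msg);

	return IRQ_WAKE_THREAD;
}

static irqreturn_t mcp_thread_irq(int irq, void *dev_id)
{
	struct mcp_priv *priv = dev_id;

	/* Hypothetical: wait for completion and collect the RX buffer. */
	if (spi_async_await(&priv->irq_status_msg))
		return IRQ_HANDLED;

	/* ... decode status registers, do more transfers as needed ... */

	return IRQ_HANDLED;
}

This way the transfer is already in flight while we context-switch to the IRQ
thread, so that time is no longer wasted.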
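
And for 4., with the cached decision from above: maybe something along these
lines (again just a sketch with invented names; can_go_uninterruptible and
transfer_uninterruptible would be new members of struct spi_message and
struct spi_controller respectively, and this is not how spi.c looks today).

/* Done once, e.g. as part of the one-time prepare/validate (point 2.): */
static void spi_message_decide_path(struct spi_controller *ctlr,
				    struct spi_message *msg)
{
	/* Hypothetical callback: the driver checks transfer size, CS, etc. */
	msg->can_go_uninterruptible = ctlr->can_send_uninterruptible &&
				      ctlr->can_send_uninterruptible(msg);
}

static int __spi_async(struct spi_device *spi, struct spi_message *msg)
{
	struct spi_controller *ctlr = spi->controller;

	if (msg->can_go_uninterruptible)
		/*
		 * Put the message onto the hardware right here, from
		 * hard-IRQ or task context: no worker queue, no sleeping.
		 */
		return ctlr->transfer_uninterruptible(ctlr, msg);

	/* Otherwise fall back to the existing worker-queue path
	 * (locking omitted for brevity):
	 */
	list_add_tail(&msg->queue, &ctlr->queue);
	kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
	return 0;
}

That way the cost added to the hot path is a single flag test per message.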

I feel confident that these 4 modifications, if fully exploited by the
MCP2518FD driver, will have enough of a performance impact that overhead will
no longer be a concern.

Best regards,

-- 
David Jander