On Mon, Aug 29, 2022 at 10:56:13AM +0200, David Jander wrote:

> spi_mux_transfer_one_message() returns before the message is transferred (in
> spi_async()), which is not expected. AFAIK, an ctlr->transfer_one_message()
> implementation should not return until the transfer is completed.

That's because what it wants to happen here is that the controller then
runs the message that it's being asked to perform, it'll then get a
callback when the message completes which it'll use to deselect the
device and then complete the original callback. This is a horrible
hack.

It should be fine for transfer_one_message() to return immediately, it's
called from __spi_pump_transfer_message() which will wait for the
message to be finalized. However I do note that if we get a message
going in via the sync path skipping the queue we set msg->sync, among
other things, and then the mux will try to reuse the same message and
resubmit it as an async message with the sync flag set which I can't see
is going to go well.

> Not sure if this is a correct fix, but I'd like to know if your situation
> changes this way, if you could try it.
> I don't have access to any hardware with a mux unfortunately, so I can't test
> it myself.

I guess claiming to have a noop mux might work for testing, though I'd
be dubious that it was actually doing the mux operations properly?

I think we need to either change spi_mux to duplicate the incoming
message (that's probably "cleaner") or teach the core that spi-mux
exists and should always use the pump/queue.
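For anyone not familiar with the callback swap described above, here is a toy userspace model of it. This is only a sketch under stated assumptions: every toy_* name is hypothetical, nothing here is the real spi-mux code, and the "parent queue" is reduced to a single in-flight pointer. It shows transfer_one_message() returning before the message has run, with the completion callback restored and invoked later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct spi_message, reduced to its callback. */
struct toy_message {
	void (*complete)(void *context);
	void *context;
	bool done;
};

/* Hypothetical stand-in for the mux driver's private state. */
struct toy_mux {
	bool selected;
	struct toy_message *in_flight;		/* models the parent's queue */
	void (*child_complete)(void *context);	/* saved original callback */
	void *child_context;
};

/* Runs when the "parent" finishes: deselect, then complete the original. */
static void toy_mux_complete(void *context)
{
	struct toy_mux *mux = context;
	struct toy_message *msg = mux->in_flight;

	assert(msg != NULL);
	mux->in_flight = NULL;
	mux->selected = false;
	msg->complete = mux->child_complete;
	msg->context = mux->child_context;
	msg->done = true;
	if (msg->complete)
		msg->complete(msg->context);
}

/* Select, swap the callback, hand the same message on, return at once. */
static int toy_mux_transfer_one_message(struct toy_mux *mux,
					struct toy_message *msg)
{
	mux->selected = true;
	mux->child_complete = msg->complete;
	mux->child_context = msg->context;
	msg->complete = toy_mux_complete;
	msg->context = mux;
	mux->in_flight = msg;	/* queued on the parent, not yet run */
	return 0;
}

/* Models the parent controller getting round to the message later. */
static void toy_parent_run(struct toy_mux *mux)
{
	struct toy_message *msg = mux->in_flight;

	if (msg)
		msg->complete(msg->context);
}

/* Example original callback: counts how often it was called. */
static void toy_count(void *context)
{
	++*(int *)context;
}
```

Note that the same message object is handed on unchanged apart from the callback, so any per-message state set by the caller travels with it, which is the reuse problem described above.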
The below might do the trick but in spite of my suggestion above I've
not tested either yet:

diff --git a/drivers/spi/spi-mux.c b/drivers/spi/spi-mux.c
index f5d32ec4634e..0709e987bd5a 100644
--- a/drivers/spi/spi-mux.c
+++ b/drivers/spi/spi-mux.c
@@ -161,6 +161,7 @@ static int spi_mux_probe(struct spi_device *spi)
 	ctlr->num_chipselect = mux_control_states(priv->mux);
 	ctlr->bus_num = -1;
 	ctlr->dev.of_node = spi->dev.of_node;
+	ctlr->must_async = true;
 
 	ret = devm_spi_register_controller(&spi->dev, ctlr);
 	if (ret)
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 1cfed874f7ae..88d48a105d3c 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -4033,7 +4033,7 @@ static int __spi_sync(struct spi_device *spi, struct spi_message *message)
 	 * guard against reentrancy from a different context. The io_mutex
 	 * will catch those cases.
 	 */
-	if (READ_ONCE(ctlr->queue_empty)) {
+	if (READ_ONCE(ctlr->queue_empty) && !ctlr->must_async) {
 		message->actual_length = 0;
 		message->status = -EINPROGRESS;
 
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index e6c73d5ff1a8..f089ee1ead58 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -469,6 +469,7 @@ extern struct spi_device *spi_new_ancillary_device(struct spi_device *spi, u8 ch
  *	SPI_TRANS_FAIL_NO_START.
  * @queue_empty: signal green light for opportunistically skipping the queue
  *	for spi_sync transfers.
+ * @must_async: disable all fast paths in the core
  *
  * Each SPI controller can communicate with one or more @spi_device
  * children. These make a small bus, sharing MOSI, MISO and SCK signals
@@ -690,6 +691,7 @@ struct spi_controller {
 
 	/* Flag for enabling opportunistic skipping of the queue in spi_sync */
 	bool			queue_empty;
+	bool			must_async;
 };
 
 static inline void *spi_controller_get_devdata(struct spi_controller *ctlr)

Assuming that works out there'll be an extra test in the fast path but
no sync operations, and a performance hit for spi-mux users.
Hopefully that works as well as it did before.
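The effect of the extra test can be checked in isolation with a toy userspace model. This is a minimal sketch, not kernel code: struct toy_controller, enum toy_path and toy_sync_path() are hypothetical names that only mirror the queue_empty and must_async condition from the patch above, without READ_ONCE() or any locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two struct spi_controller flags. */
struct toy_controller {
	bool queue_empty;	/* queue is idle, skipping it would be allowed */
	bool must_async;	/* proposed flag: never skip the queue */
};

enum toy_path {
	TOY_PATH_DIRECT,	/* sync fast path, skips the queue */
	TOY_PATH_QUEUED,	/* message goes through the pump/queue */
};

/* Mirrors the condition the patch puts in __spi_sync(). */
static enum toy_path toy_sync_path(const struct toy_controller *ctlr)
{
	assert(ctlr != NULL);

	if (ctlr->queue_empty && !ctlr->must_async)
		return TOY_PATH_DIRECT;

	return TOY_PATH_QUEUED;
}
```

With must_async set, as spi_mux_probe() does in the patch, the model picks the queued path even when the queue is empty; a controller that leaves the flag clear keeps the fast path.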