On Wed, 8 Jun 2022 12:29:16 +0100 Mark Brown <broonie@xxxxxxxxxx> wrote: > On Wed, Jun 08, 2022 at 09:54:09AM +0200, David Jander wrote: > > Mark Brown <broonie@xxxxxxxxxx> wrote: > > > > Moving idling (and all the was_busy stuff) within the io_mutex would > > > definitely resolve the issue, the async submission context is the only one > > > that really needs the spinlock and it doesn't care about idling. I can't > > > think what you could do with the io_mutex when idling so it seems to > > > fit. > > > Ok, so we could agree on a way to fix this particular issue: put the idling > > transition into the io_mutex. Thanks. > > > Looking forward to read comments on the rest of the code, and the general idea > > of what I am trying to accomplish. > > I think the rest of it is fine or at least I'm finding it difficult to > see anything beyond the concurrency issues. I think we need to do an > audit to find any users that are doing a spi_sync() to complete a > sequence of spi_async() operations but I'm not aware of any and if it > delivers the performance benefits it's probably worth changing that > aspect of the driver API. I just discovered a different issue (hit upon by Oleksij Rempel while assisting with testing): Apparently some drivers tend to rely on the fact that master->cur_msg is not NULL and always points to the message being transferred. This could be a show-stopper to this patch set, if it cannot be solved. I am currently analyzing the different cases, to see if and how they could eventually get fixed. The crux of the issue is the fact that there are two different API's towards the driver: 1. transfer_one(): This call does not provide a reference to the message that contains the transfers. So all information stored only in the underlying spi_message are not accessible to the driver. Apparently some work around this by accessing master->cur_msg. 2. transfer_one_message(): I suspect this is a newer API. It takes the spi_message as argument, thus giving the driver access to all information it needs (like return status, and the complete list of transfers). One driver in particular spi-atmel.c, still used the deprecated master->cur_msg->is_dma_mapped flag. This is trivially fixed by replacing this with master->cur_msg_mapped, which is still available even in the sync path. It is still kinda ugly though, since it is a rather indirect way of obtaining information about the current transfer buffers it is handling. Some drivers look at master->cur_msg in interrupt mode if there was an error interrupt, to decide whether there is an ongoing transfer and sometimes also to store the error code in master->cur_msg->status. This could be solved by storing a reference to the current message in the private struct, but like in the other cases, there is no way to obtain that spi_message pointer from the transfer_one() call. In other words, there seem to be both a spi_transfer based API and a spi_message based API, but problems occur when the spi_transfer based calls need to know things about the underlying spi_message, which is sometimes artificially generated in functions like spi_sync_transfer(), so it always exists. Any suggestion which is the most desirable course of action? Try to fix the API inconsistency of wanting to access spi_message when all you asked for is a spi_transfer, try to work around it or just toss this patch, give up and call it a failed attempt because the impact is too big? Best regards, -- David Jander