On 10/03/17 00:49, Linus Walleij wrote:
> On Wed, Feb 22, 2017 at 2:29 PM, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>> On 09/02/17 17:33, Linus Walleij wrote:
>>> The waitqueue in the host context is there to signal back from
>>> mmc_request_done() through mmc_wait_data_done() that the hardware
>>> is done with a command, and when the wait is over, the core
>>> will typically submit the next asynchronous request that is pending
>>> just waiting for the hardware to be available.
>>>
>>> This is in the way of letting mmc_request_done() trigger the
>>> report up to the block layer that a block request is finished.
>>>
>>> Re-jig this as a first step, removing the waitqueue and introducing
>>> a work that will run after a completed asynchronous request,
>>> finalizing that request, including retransmissions, and eventually
>>> reporting back with a completion and a status code to the
>>> asynchronous issue method.
>>>
>>> This has the upside that we can remove the MMC_BLK_NEW_REQUEST
>>> status code and the "new_request" state in the request queue
>>> that is only there to make the state machine spin out
>>> the first time we send a request.
>>>
>>> Introduce a workqueue in the host for handling just this, and
>>> then a work and completion in the asynchronous request to deal
>>> with this mechanism.
>>>
>>> This is a central change that lets us do many other changes, since
>>> we have broken the submit and complete code paths in two, and we
>>> can potentially remove the NULL flushing of the asynchronous
>>> pipeline and report block requests as finished directly from
>>> the worker.
>>
>> This needs more thought.  The completion should go straight to the mmc
>> block driver from the ->done() callback, and from there straight back
>> to the block layer if recovery is not needed.  We want to stop using
>> mmc_start_areq() altogether because we never want to wait - we always
>> want to issue (if possible) and return.
>
> I don't quite follow this. Isn't what you request exactly what
> patch 15/16 "mmc: queue: issue requests in massive parallel"
> is doing?

There is the latency of waking the worker that runs mmc_finalize_areq(),
and then another latency to wake the worker that is running
mmc_start_areq().  That is 2 wake-ups instead of 1.

As a side note, ideally we would be able to issue the next request from
the interrupt or soft interrupt context of the completion (i.e. 0 wake-ups
between requests), but we would probably have to look at the host API to
support that.
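To illustrate the direction: a ->done() handler owned by the block driver
could complete successful requests directly from the completion context
(assuming the queue has a softirq completion handler registered via
blk_queue_softirq_done()), leaving only recovery and re-issue to the queue
thread.  This is just an untested sketch - struct mmc_blk_ctx and its
fields are made-up stand-ins for the driver's real per-request context
(struct mmc_queue_req or similar):

#include <linux/kernel.h>
#include <linux/blkdev.h>
#include <linux/sched.h>
#include <linux/mmc/core.h>

/* Hypothetical per-request context, for illustration only */
struct mmc_blk_ctx {
	struct mmc_request	mrq;
	struct request		*req;		/* block layer request */
	struct task_struct	*queue_thread;	/* mmc queue thread */
};

static void mmc_blk_mrq_done(struct mmc_request *mrq)
{
	struct mmc_blk_ctx *ctx = container_of(mrq, struct mmc_blk_ctx, mrq);

	/* No errors: report the block request done right here, with no
	 * context switch in the completion path */
	if (!mrq->cmd->error && !(mrq->data && mrq->data->error))
		blk_complete_request(ctx->req);

	/* Wake the queue thread to handle any recovery and to issue the
	 * next request */
	wake_up_process(ctx->queue_thread);
}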
>
> The whole patch series leads up to that.
>
>> The core API to use is __mmc_start_req() but the block driver should
>> populate mrq->done with its own handler, i.e. change __mmc_start_req():
>>
>> -	mrq->done = mmc_wait_done;
>> +	if (!mrq->done)
>> +		mrq->done = mmc_wait_done;
>>
>> mrq->done() would complete the request (e.g. via blk_complete_request())
>> if it has no errors (and doesn't need polling), and wake up the queue
>> thread to finish up everything else and start the next request.
>
> I think this is what it does at the end of the patch series, patch 15/16.
> I have to split it somehow...
>
>> For the blk-mq port, the queue thread should also be retained, partly
>> because it solves some synchronization problems, but mostly because, at
>> this stage, we anyway don't have solutions for all the different ways
>> the driver can block.
>> (as listed here https://marc.info/?l=linux-mmc&m=148336571720463&w=2 )
>
> Essentially I take out that thread and replace it with this one worker
> introduced in this very patch.  I agree the driver can block in many ways,
> and that is why I need to have it running in process context, and this
> is what the worker introduced here provides.

The last time I looked at the blk-mq I/O scheduler code, it pulled up to
qdepth requests from the I/O scheduler and left them on a local list while
running ->queue_rq().  That means blocking in ->queue_rq() leaves some
number of requests in limbo (not issued, but also no longer in the I/O
scheduler) for that time.  Maybe blk-mq should offer a pull interface to
I/O scheduler users?
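One way to avoid blocking there today is to never sleep in ->queue_rq()
at all: mark the request started, park it on a driver-internal list, and
let a thread do the potentially blocking issue in process context.
Untested sketch (struct mmc_mq_ctx and its fields are made up):

#include <linux/blkdev.h>
#include <linux/blk-mq.h>
#include <linux/list.h>
#include <linux/sched.h>
#include <linux/spinlock.h>

/* Hypothetical per-queue context, for illustration only */
struct mmc_mq_ctx {
	spinlock_t		issue_lock;
	struct list_head	issue_list;
	struct task_struct	*thread;	/* issues requests, may block */
};

static int mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
			   const struct blk_mq_queue_data *bd)
{
	struct mmc_mq_ctx *mq = hctx->queue->queuedata;
	struct request *req = bd->rq;
	unsigned long flags;

	blk_mq_start_request(req);

	spin_lock_irqsave(&mq->issue_lock, flags);
	list_add_tail(&req->queuelist, &mq->issue_list);
	spin_unlock_irqrestore(&mq->issue_lock, flags);

	/* The thread may block, but ->queue_rq() returns immediately, so
	 * the requests blk-mq has already pulled from the I/O scheduler
	 * keep flowing */
	wake_up_process(mq->thread);

	return BLK_MQ_RQ_QUEUE_OK;
}

That still leaves the pulled requests sitting in the driver rather than in
the I/O scheduler, though, so a pull interface would be the cleaner fix.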