On Tue, Feb 11, 2020 at 5:45 PM Ulf Hansson <ulf.hansson@xxxxxxxxxx> wrote:
>
> On Wed, 5 Feb 2020 at 13:51, Baolin Wang <baolin.wang7@xxxxxxxxx> wrote:
> >
> > From: Baolin Wang <baolin.wang@xxxxxxxxxx>
> >
> > Now the MMC read/write stack will always wait for the previous request to
> > be completed by mmc_blk_rw_wait() before sending a new request to hardware,
> > or queue a work to complete the request. That brings context switching
> > overhead, especially at high I/O per second rates, which hurts the I/O
> > performance.
>
> Would you mind adding some more context about the mmc_blk_rw_wait()?
> Especially I want to make it clear that mmc_blk_rw_wait() is also used
> to poll the card for busy completion for I/O writes, via sending
> CMD13.

Sure.

> >
> > Thus this patch introduces an MMC software queue interface based on the
> > hardware command queue engine's interfaces. It is similar in idea to the
> > hardware command queue engine and can remove the context switching.
> > Moreover we set the default queue depth to 64 for the software
> > queue, which allows more requests to be prepared, merged and inserted
> > into the IO scheduler to improve performance, but we only allow 2 requests
> > in flight. That is enough to let the irq handler always trigger the
> > next request without a context switch, as well as avoiding a long latency.
>
> I think it's important to clarify that to use this new interface, hsq,
> the host controller/driver needs to support HW busy detection for I/O
> operations.
>
> In other words, the host driver must not complete a data transfer
> request until after the card stops signaling busy. This behaviour is
> also required for "closed-ended-transmissions" with CMD23, as in this
> path there is no CMD12 sent to complete the transfer, thus no R1B
> response flag to trigger the HW busy detection behaviour in the
> driver.

Sure.
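To make the "depth 64, 2 in flight" idea from the commit message
concrete, here is a rough standalone model. It is only a sketch and not
the code from the patch: struct swq_model and the swq_*() helpers are
hypothetical names, and only the numbers 64 and 2 come from the
description above. It assumes the completion path (the irq handler in
the real driver) issues the next request directly instead of deferring
to a worker.

```c
#include <assert.h>
#include <stdbool.h>

#define SWQ_DEPTH      64  /* queue depth quoted in the commit message */
#define SWQ_MAX_FLIGHT 2   /* requests allowed in flight at once */

/* Hypothetical model of the queue state; not the structure from the patch. */
struct swq_model {
	int queued;     /* requests waiting in the software queue */
	int in_flight;  /* requests already handed to the host controller */
};

/* Accept a request if a slot is free; returns false when the queue is full. */
static bool swq_enqueue(struct swq_model *q)
{
	if (q->queued + q->in_flight >= SWQ_DEPTH)
		return false;
	q->queued++;
	return true;
}

/* Hand queued requests to the hardware, up to the in-flight limit.
 * Returns how many requests were dispatched by this call. */
static int swq_pump(struct swq_model *q)
{
	int dispatched = 0;

	while (q->queued > 0 && q->in_flight < SWQ_MAX_FLIGHT) {
		q->queued--;
		q->in_flight++;
		dispatched++;
	}
	assert(q->in_flight <= SWQ_MAX_FLIGHT);
	return dispatched;
}

/* Models the completion path: retire one request and immediately issue
 * the next one from the same context, with no hand-off to a worker. */
static int swq_complete_one(struct swq_model *q)
{
	if (q->in_flight == 0)
		return 0;
	q->in_flight--;
	return swq_pump(q);
}
```

With five requests queued, swq_pump() dispatches two, and each
swq_complete_one() call then refills the free slot from the same
context, which is the behaviour the description is aiming for.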
> >
> > From the fio testing data in the cover letter, we can see the software
> > queue improves performance with a 4K block size: about 16% for random
> > read and about 90% for random write, though with no obvious
> > improvement for sequential read and write.
> >
> > Moreover we can expand the software queue interface to support MMC
> > packed request or packed command in future.
> >
> > Reviewed-by: Arnd Bergmann <arnd@xxxxxxxx>
> > Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxx>
> > Signed-off-by: Baolin Wang <baolin.wang7@xxxxxxxxx>
> > ---
>
> [...]
>
> > diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
> > index f6912de..7a9976f 100644
> > --- a/drivers/mmc/core/mmc.c
> > +++ b/drivers/mmc/core/mmc.c
> > @@ -1851,15 +1851,22 @@ static int mmc_init_card(struct mmc_host *host, u32 ocr,
> >                  */
> >                 card->reenable_cmdq = card->ext_csd.cmdq_en;
> >
> > -               if (card->ext_csd.cmdq_en && !host->cqe_enabled) {
> > +               if (host->cqe_ops && !host->cqe_enabled) {
> >                         err = host->cqe_ops->cqe_enable(host, card);
> >                         if (err) {
> >                                 pr_err("%s: Failed to enable CQE, error %d\n",
> >                                        mmc_hostname(host), err);
>
> This means we are going to start printing an error message for those
> eMMCs that don't support command queuing, but the host supports
> MMC_CAP2_CQE.
>
> Not sure how big of a problem this is, but another option is simply to
> leave the logging of the *failures* to the host driver, rather than
> doing it here.
>
> Oh well, feel free to change or leave this as is for now. We can
> always change it on top, if needed.

OK. I will move the failure log into the cqe_enable() callback to keep
the logging behaviour the same. Thanks.
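
Roughly, the flow after that change could look like the standalone
sketch below. This is only an illustration under assumptions, not the
final patch: struct host_model, enable_ok(), enable_fail() and
queue_setup() are hypothetical stand-ins for the kernel structures and
callbacks, and unlike the hunk above the sketch returns the error to
its caller so the behaviour is easy to check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct mmc_host; not the kernel structure. */
struct host_model {
	bool has_cqe_ops;   /* models host->cqe_ops != NULL */
	bool cqe_enabled;
	bool hsq_enabled;
	int (*cqe_enable)(struct host_model *host, bool card_cmdq_en);
};

/* Hypothetical callback that succeeds. */
static int enable_ok(struct host_model *host, bool card_cmdq_en)
{
	(void)host;
	(void)card_cmdq_en;
	return 0;
}

/* Hypothetical callback that fails and logs the failure itself, so the
 * caller does not need to print anything. */
static int enable_fail(struct host_model *host, bool card_cmdq_en)
{
	(void)host;
	(void)card_cmdq_en;
	fprintf(stderr, "model: failed to enable queue engine\n");
	return -1;
}

/* Models the decision in the hunk: call the enable callback whenever the
 * host provides one, then pick CQE or the software queue depending on
 * whether the card has command queuing enabled. */
static int queue_setup(struct host_model *host, bool card_cmdq_en)
{
	int err;

	if (!host->has_cqe_ops || host->cqe_enabled)
		return 0;

	err = host->cqe_enable(host, card_cmdq_en);
	if (err)
		return err;     /* no message here; the callback logged it */

	host->cqe_enabled = true;
	if (card_cmdq_en) {
		puts("Command Queue Engine enabled");
	} else {
		host->hsq_enabled = true;
		puts("Host Software Queue enabled");
	}
	return 0;
}
```

With enable_ok() and a card without command queuing, queue_setup()
sets both cqe_enabled and hsq_enabled; with enable_fail() neither flag
is set and the only message comes from the callback.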

> >                         } else {
> >                                 host->cqe_enabled = true;
> > -                               pr_info("%s: Command Queue Engine enabled\n",
> > -                                       mmc_hostname(host));
> > +
> > +                               if (card->ext_csd.cmdq_en) {
> > +                                       pr_info("%s: Command Queue Engine enabled\n",
> > +                                               mmc_hostname(host));
> > +                               } else {
> > +                                       host->hsq_enabled = true;
> > +                                       pr_info("%s: Host Software Queue enabled\n",
> > +                                               mmc_hostname(host));
> > +                               }
> >                         }
> >                 }
>
> [...]
>
> Kind regards
> Uffe