On 15 November 2017 at 14:07, Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote: > On 15/11/17 12:55, Ulf Hansson wrote: >> Linus, Adrian, >> >> Apologize for sidetracking the discussion, just wanted to add some >> minor comments. >> >> [...] >> >>> >>>>> But what I think is nice in doing it around >>>>> each request is that since mmc_put_card() calls mmc_release_host() >>>>> contains this: >>>>> >>>>> if (--host->claim_cnt) { (...) >>>>> >>>>> So there is a counter inside the claim/release host functions >>>>> and now there is another counter keeping track if the in-flight >>>>> requests as well and it gives me the feeling we are maintaining >>>>> two counters when we only need one. >>>> >>>> The gets / puts also get runtime pm for the card. It is a lot a messing >>>> around for the sake of a quick check for the number of requests inflight - >>>> which is needed anyway for CQE. >> >> Actually the get / puts for runtime PM of the card is already done >> taking the host->claim_cnt into account. > > We do pm_runtime_get_sync(&card->dev) always i.e. > > void mmc_get_card(struct mmc_card *card, struct mmc_ctx *ctx) > { > pm_runtime_get_sync(&card->dev); > __mmc_claim_host(card->host, ctx, NULL); > } You are absolutely correct, so let's leave the inflight counter in. Apologize for the noise! I was thinking about the runtime PM of the host dev, which is managed by mmc_claim|release host(). [...] >>>> Well I saw 3% drop in sequential read performance, improving to 1% when an >>>> unbound workqueue was used. >> >> Can you please make this improvement as a standalone patch on top of >> the mq patch? > > Unfortunately it was just a hack to the blk-mq core - the block layer does > not have an unbound workqueue. I have not had time to consider a proper > fix. It will have to wait, but I agree 3% is not enough to delay going forward. I see, thanks! [...] Kind regards Uffe