On 9/06/24 21:40, Victor Shih wrote:
> On Fri, May 31, 2024 at 7:23 PM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>>
>> On 31/05/24 13:31, Victor Shih wrote:
>>> On Fri, May 24, 2024 at 2:54 PM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote:
>>>>
>>>> On 22/05/24 14:08, Victor Shih wrote:
>>>>> From: Victor Shih <victor.shih@xxxxxxxxxxxxxxxxxxx>
>>>>>
>>>>> Add UHS-II Auto Command Error Recovery functionality
>>>>> into the MMC request processing flow.
>>>>
>>>> Not sure what "auto" means here, but the commit message
>>>> should outline what the spec. requires for error recovery.
>>>>
>>>
>>> Hi, Adrian
>>>
>>>     I will add instructions in the v17 version.
>>>
>>> Thanks, Victor Shih
>>>
>>>>>
>>>>> Signed-off-by: Ben Chuang <ben.chuang@xxxxxxxxxxxxxxxxxxx>
>>>>> Signed-off-by: Victor Shih <victor.shih@xxxxxxxxxxxxxxxxxxx>
>>>>> ---
>>>>>
>>>>> Updates in V16:
>>>>>  - Separate the Error Recovery mechanism from patch#7 to patch#8.
>>>>>
>>>>> ---
>>>>>
>>>>>  drivers/mmc/core/core.c    |  4 ++
>>>>>  drivers/mmc/core/core.h    |  1 +
>>>>>  drivers/mmc/core/sd_uhs2.c | 80 ++++++++++++++++++++++++++++++++++++++
>>>>>  include/linux/mmc/host.h   |  6 +++
>>>>>  4 files changed, 91 insertions(+)
>>>>>
>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>> index 68496c51a521..18642afc405f 100644
>>>>> --- a/drivers/mmc/core/core.c
>>>>> +++ b/drivers/mmc/core/core.c
>>>>> @@ -403,6 +403,10 @@ void mmc_wait_for_req_done(struct mmc_host *host, struct mmc_request *mrq)
>>>>>          while (1) {
>>>>>                  wait_for_completion(&mrq->completion);
>>>>>
>>>>> +                if (host->ops->get_cd(host))
>>>>> +                        if (mrq->cmd->error || (mrq->data && mrq->data->error))
>>>>> +                                mmc_sd_uhs2_error_recovery(host, mrq);
>>>>
>>>> There are several issues with this:
>>>>
>>>> 1. It is not OK to start a request from within the request path
>>>>    because it is recursive:
>>>>
>>>>      mmc_wait_for_req_done()                  <--
>>>>        mmc_sd_uhs2_error_recovery()
>>>>          sd_uhs2_abort_trans()
>>>>            mmc_wait_for_cmd()
>>>>              mmc_wait_for_req()
>>>>                mmc_wait_for_req_done()        <--
>>>>
>>>> 2. The mmc block driver does not use this path
>>>>
>>>> 3. No need to always call ->get_cd() if there is no error
>>>>
>>>> It is worth considering whether the host controller could
>>>> send the abort command as part of the original request, as
>>>> is done with the stop command.
>>>>
>>>
>>> Hi, Adrian
>>>
>>>      1. It looks like just issuing a command in
>>>          mmc_wait_for_req_done() will cause a recursion.
>>>          I will drop sd_uhs2_abort_trans() and
>>>          sd_uhs2_abort_status_read() in the v17 version.
>>>      2. I have no idea about this part, could you please give me some advice?
>>
>> The mmc block driver sets the ->done() callback and so
>> mmc_wait_for_req_done() is never called for data transfers.
>>
>> That won't matter if the host controller handles doing
>> the abort command, as was suggested elsewhere.
>>
>>>      3. I will try to modify this part in the v17 version.
>>>
>>> Thanks, Victor Shih
>>>
>>>>> +
>>>>>                  cmd = mrq->cmd;
>>>>>
>>>>>                  if (!cmd->error || !cmd->retries ||
>>>>> diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
>>>>> index 920323faa834..259d47c8bb19 100644
>>>>> --- a/drivers/mmc/core/core.h
>>>>> +++ b/drivers/mmc/core/core.h
>>>>> @@ -82,6 +82,7 @@ int mmc_attach_mmc(struct mmc_host *host);
>>>>>  int mmc_attach_sd(struct mmc_host *host);
>>>>>  int mmc_attach_sdio(struct mmc_host *host);
>>>>>  int mmc_attach_sd_uhs2(struct mmc_host *host);
>>>>> +void mmc_sd_uhs2_error_recovery(struct mmc_host *mmc, struct mmc_request *mrq);
>>>>>
>>>>>  /* Module parameters */
>>>>>  extern bool use_spi_crc;
>>>>> diff --git a/drivers/mmc/core/sd_uhs2.c b/drivers/mmc/core/sd_uhs2.c
>>>>> index 85939a2582dc..d5acb4e6ccac 100644
>>>>> --- a/drivers/mmc/core/sd_uhs2.c
>>>>> +++ b/drivers/mmc/core/sd_uhs2.c
>>>>> @@ -1324,3 +1324,83 @@ int mmc_attach_sd_uhs2(struct mmc_host *host)
>>>>>
>>>>>          return err;
>>>>>  }
>>>>> +
>>>>> +static void sd_uhs2_abort_trans(struct mmc_host *mmc)
>>>>> +{
>>>>> +        struct mmc_request mrq = {};
>>>>> +        struct mmc_command cmd = {0};
>>>>> +        struct uhs2_command uhs2_cmd = {};
>>>>> +        int err;
>>>>> +
>>>>> +        mrq.cmd = &cmd;
>>>>> +        mmc->ongoing_mrq = &mrq;
>>>>> +
>>>>> +        uhs2_cmd.header = UHS2_NATIVE_PACKET | UHS2_PACKET_TYPE_CCMD |
>>>>> +                          mmc->card->uhs2_config.node_id;
>>>>> +        uhs2_cmd.arg = ((UHS2_DEV_CMD_TRANS_ABORT & 0xFF) << 8) |
>>>>> +                        UHS2_NATIVE_CMD_WRITE |
>>>>> +                        (UHS2_DEV_CMD_TRANS_ABORT >> 8);
>>>>> +
>>>>> +        sd_uhs2_cmd_assemble(&cmd, &uhs2_cmd, 0, 0);
>>>>> +        err = mmc_wait_for_cmd(mmc, &cmd, 0);
>>>>> +
>>>>> +        if (err)
>>>>> +                pr_err("%s: %s: UHS2 CMD send fail, err= 0x%x!\n",
>>>>> +                       mmc_hostname(mmc), __func__, err);
>>>>> +}
>>>>> +
>>>>> +static void sd_uhs2_abort_status_read(struct mmc_host *mmc)
>>>>> +{
>>>>> +        struct mmc_request mrq = {};
>>>>> +        struct mmc_command cmd = {0};
>>>>> +        struct uhs2_command uhs2_cmd = {};
>>>>> +        int err;
>>>>> +
>>>>> +        mrq.cmd = &cmd;
>>>>> +        mmc->ongoing_mrq = &mrq;
>>>>> +
>>>>> +        uhs2_cmd.header = UHS2_NATIVE_PACKET |
>>>>> +                          UHS2_PACKET_TYPE_CCMD |
>>>>> +                          mmc->card->uhs2_config.node_id;
>>>>> +        uhs2_cmd.arg = ((UHS2_DEV_STATUS_REG & 0xFF) << 8) |
>>>>> +                        UHS2_NATIVE_CMD_READ |
>>>>> +                        UHS2_NATIVE_CMD_PLEN_4B |
>>>>> +                        (UHS2_DEV_STATUS_REG >> 8);
>>>>> +
>>>>> +        sd_uhs2_cmd_assemble(&cmd, &uhs2_cmd, 0, 0);
>>>>> +        err = mmc_wait_for_cmd(mmc, &cmd, 0);
>>>>> +
>>>>> +        if (err)
>>>>> +                pr_err("%s: %s: UHS2 CMD send fail, err= 0x%x!\n",
>>>>> +                       mmc_hostname(mmc), __func__, err);
>>>>> +}
>>>>> +
>>>>> +void mmc_sd_uhs2_error_recovery(struct mmc_host *mmc, struct mmc_request *mrq)
>>>>> +{
>>>>> +        mmc->ops->uhs2_reset_cmd_data(mmc);
>>>>
>>>> The host controller should already have done any resets needed.
>>>> sdhci already has support for doing that - see host->pending_reset
>>>>
>>>
>>> Hi, Adrian
>>>
>>>      I'm not sure what this means. Could you please give me more information?
>>
>> sdhci_uhs2_request_done() checks sdhci_needs_reset() and does
>> sdhci_uhs2_reset().
>>
>> sdhci_needs_reset() does not cater for data errors because
>> the reset for data errors is done directly in what becomes
>> __sdhci_finish_data_common().
>>
>> You may need to:
>>      1. add a parameter to __sdhci_finish_data_common() to
>>      skip doing the sdhci reset and instead set
>>      host->pending_reset
>>      2. amend sdhci_uhs2_request_done() to check for data error
>>      also to decide if a reset is needed
>>
>
> Hi, Adrian
>
> If there is any mistake in my understanding, please help me correct it.
> My understanding is as follows:
>
> static bool sdhci_uhs2_request_done(struct sdhci_host *host)
> {
>         ...
>         if (sdhci_needs_reset(host, mrq)) {
>                 ...
>                 if (mrq->cmd->error || (mrq->data && mrq->data->error))
>                         sdhci_uhs2_reset_cmd_data(host->mmc);
>                 ...
>         }
>         ...
> }

Like this:

diff --git a/drivers/mmc/host/sdhci-uhs2.c b/drivers/mmc/host/sdhci-uhs2.c
index 47180429448b..3cb5fe1d488c 100644
--- a/drivers/mmc/host/sdhci-uhs2.c
+++ b/drivers/mmc/host/sdhci-uhs2.c
@@ -581,7 +581,7 @@ static void sdhci_uhs2_finish_data(struct sdhci_host *host)
 {
         struct mmc_data *data = host->data;

-        __sdhci_finish_data_common(host);
+        __sdhci_finish_data_common(host, true);

         __sdhci_finish_mrq(host, data->mrq);
 }
@@ -932,6 +932,12 @@ static void sdhci_uhs2_request(struct mmc_host *mmc, struct mmc_request *mrq)
  *                                                                           *
 \*****************************************************************************/

+static bool sdhci_uhs2_needs_reset(struct sdhci_host *host, struct mmc_request *mrq)
+{
+        return sdhci_needs_reset(host, mrq) ||
+               (!(host->flags & SDHCI_DEVICE_DEAD) && mrq->data && mrq->data->error);
+}
+
 static bool sdhci_uhs2_request_done(struct sdhci_host *host)
 {
         unsigned long flags;
@@ -963,7 +969,7 @@ static bool sdhci_uhs2_request_done(struct sdhci_host *host)
          * The controller needs a reset of internal state machines
          * upon error conditions.
          */
-        if (sdhci_needs_reset(host, mrq)) {
+        if (sdhci_uhs2_needs_reset(host, mrq)) {
                 /*
                  * Do not finish until command and data lines are available for
                  * reset. Note there can only be one other mrq, so it cannot
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index ed55aab24f92..55f0db0fc007 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1563,7 +1563,7 @@ void sdhci_finish_mrq(struct sdhci_host *host, struct mmc_request *mrq)
 }
 EXPORT_SYMBOL_GPL(sdhci_finish_mrq);

-void __sdhci_finish_data_common(struct sdhci_host *host)
+void __sdhci_finish_data_common(struct sdhci_host *host, bool defer_reset)
 {
         struct mmc_command *data_cmd = host->data_cmd;
         struct mmc_data *data = host->data;
@@ -1576,7 +1576,9 @@ void __sdhci_finish_data_common(struct sdhci_host *host)
          * conditions.
          */
         if (data->error) {
-                if (!host->cmd || host->cmd == data_cmd)
+                if (defer_reset)
+                        host->pending_reset = true;
+                else if (!host->cmd || host->cmd == data_cmd)
                         sdhci_reset_for(host, REQUEST_ERROR);
                 else
                         sdhci_reset_for(host, REQUEST_ERROR_DATA_ONLY);
@@ -1604,7 +1606,7 @@ static void __sdhci_finish_data(struct sdhci_host *host, bool sw_data_timeout)
 {
         struct mmc_data *data = host->data;

-        __sdhci_finish_data_common(host);
+        __sdhci_finish_data_common(host, false);

         /*
          * Need to send CMD12 if -
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 576b8de2c04e..5ac5234fecf0 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -840,7 +840,7 @@ void sdhci_prepare_dma(struct sdhci_host *host, struct mmc_data *data);
 bool sdhci_needs_reset(struct sdhci_host *host, struct mmc_request *mrq);
 void __sdhci_finish_mrq(struct sdhci_host *host, struct mmc_request *mrq);
 void sdhci_finish_mrq(struct sdhci_host *host, struct mmc_request *mrq);
-void __sdhci_finish_data_common(struct sdhci_host *host);
+void __sdhci_finish_data_common(struct sdhci_host *host, bool defer_reset);
 bool sdhci_present_error(struct sdhci_host *host, struct mmc_command *cmd, bool present);
 u16 sdhci_calc_clk(struct sdhci_host *host, unsigned int clock,
                    unsigned int *actual_clock);

>
> I have another question. the sdhci_uhs2_request_done() belongs to the patch#18.
> Can the above content be modified directly in the patch#18?
> Or does it need to be separated into another patch?

Please update the existing patches.

>
> Thanks, Victor Shih
>
>>>
>>> Thanks, Victor Shih
>>>
>>>>> +
>>>>> +        if (mrq->data) {
>>>>> +                if (mrq->data->error && mmc_card_uhs2(mmc)) {
>>>>> +                        if (mrq->cmd) {
>>>>> +                                switch (mrq->cmd->error) {
>>>>> +                                case ETIMEDOUT:
>>>>> +                                case EILSEQ:
>>>>> +                                case EIO:
>>>>> +                                        sd_uhs2_abort_trans(mmc);
>>>>> +                                        sd_uhs2_abort_status_read(mmc);
>>>>
>>>> What is the purpose of sd_uhs2_abort_status_read() here?
>>>> It is not obvious it does anything.
>>>>
>>>
>>> Hi, Adrian
>>>
>>>      sd_uhs2_abort_status_read() seems to only have read status,
>>>      I will drop this in the v17 version.
>>>
>>> Thanks, Victor Shih
>>>
>>>>> +                                        break;
>>>>> +                                default:
>>>>> +                                        break;
>>>>> +                                }
>>>>> +                        }
>>>>> +                }
>>>>> +        } else {
>>>>> +                if (mrq->cmd) {
>>>>> +                        switch (mrq->cmd->error) {
>>>>> +                        case ETIMEDOUT:
>>>>> +                                sd_uhs2_abort_trans(mmc);
>>>>> +                                break;
>>>>> +                        }
>>>>> +                }
>>>>> +        }
>>>>> +}
>>>>> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
>>>>> index fc9520b3bfa4..c914a58f7e1e 100644
>>>>> --- a/include/linux/mmc/host.h
>>>>> +++ b/include/linux/mmc/host.h
>>>>> @@ -271,6 +271,12 @@ struct mmc_host_ops {
>>>>>           * negative errno in case of a failure or zero for success.
>>>>>           */
>>>>>          int (*uhs2_control)(struct mmc_host *host, enum sd_uhs2_operation op);
>>>>> +
>>>>> +        /*
>>>>> +         * The uhs2_reset_cmd_data callback is used to excute reset
>>>>> +         * when a auto command error occurs.
>>>>> +         */
>>>>> +        void (*uhs2_reset_cmd_data)(struct mmc_host *host);
>>>>>  };
>>>>>
>>>>>  struct mmc_cqe_ops {
>>>>
>>
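
For readers following the thread, the deferred-reset idea in the sdhci diff can be modelled in plain userspace C. This is only a rough sketch under stated assumptions: every `model_*` type and function is a hypothetical stand-in rather than a kernel definition, and `model_needs_reset()` keeps only the command-error check, which is a simplification of what `sdhci_needs_reset()` really tests. The point it illustrates is that with `defer_reset` set, a data error no longer triggers an immediate reset but marks one as pending, and the UHS-II completion check then has to look at data errors as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption: only the
 * fields needed to model the reset decision are kept). */
struct model_cmd { int error; };
struct model_data { int error; };
struct model_request { struct model_cmd *cmd; struct model_data *data; };

enum model_reset { RESET_NONE, RESET_FULL, RESET_DATA_ONLY };

struct model_host {
	bool dead;                   /* models the SDHCI_DEVICE_DEAD flag */
	struct model_cmd *cmd;       /* command still in flight, if any */
	struct model_cmd *data_cmd;  /* command that owns the data transfer */
	struct model_data *data;
	bool pending_reset;          /* reset deferred to request completion */
	enum model_reset last_reset; /* which reset was done immediately */
};

/* Models the data-error branch of __sdhci_finish_data_common(). */
void model_finish_data_common(struct model_host *host, bool defer_reset)
{
	assert(host != NULL);
	host->last_reset = RESET_NONE;
	if (!host->data || !host->data->error)
		return;
	if (defer_reset)
		host->pending_reset = true;          /* UHS-II path */
	else if (!host->cmd || host->cmd == host->data_cmd)
		host->last_reset = RESET_FULL;       /* legacy: cmd + data */
	else
		host->last_reset = RESET_DATA_ONLY;  /* legacy: data only */
}

/* Simplified model of sdhci_needs_reset(): command errors only. */
bool model_needs_reset(const struct model_host *host,
		       const struct model_request *mrq)
{
	return !host->dead && mrq->cmd && mrq->cmd->error;
}

/* Models sdhci_uhs2_needs_reset(): also treats a data error as a reason. */
bool model_uhs2_needs_reset(const struct model_host *host,
			    const struct model_request *mrq)
{
	return model_needs_reset(host, mrq) ||
	       (!host->dead && mrq->data && mrq->data->error);
}
```

In this model, a request whose only failure is a data error is missed by `model_needs_reset()` but caught by `model_uhs2_needs_reset()`, which mirrors why the diff adds the extra check alongside the `defer_reset` parameter.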