On 18/06/24 13:47, Victor Shih wrote: > On Mon, Jun 17, 2024 at 1:04 PM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote: >> >> On 9/06/24 21:40, Victor Shih wrote: >>> On Fri, May 31, 2024 at 7:23 PM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote: >>>> >>>> On 31/05/24 13:31, Victor Shih wrote: >>>>> On Fri, May 24, 2024 at 2:54 PM Adrian Hunter <adrian.hunter@xxxxxxxxx> wrote: >>>>>> >>>>>> On 22/05/24 14:08, Victor Shih wrote: >>>>>>> From: Victor Shih <victor.shih@xxxxxxxxxxxxxxxxxxx> >>>>>>> >>>>>>> Add UHS-II Auto Command Error Recovery functionality >>>>>>> into the MMC request processing flow. >>>>>> >>>>>> Not sure what "auto" means here, but the commit message >>>>>> should outline what the spec. requires for error recovery. >>>>>> >>>>> >>>>> Hi, Adrian >>>>> >>>>> I will add instructions in the v17 version. >>>>> >>>>> Thanks, Victor Shih >>>>> >>>>>>> >>>>>>> Signed-off-by: Ben Chuang <ben.chuang@xxxxxxxxxxxxxxxxxxx> >>>>>>> Signed-off-by: Victor Shih <victor.shih@xxxxxxxxxxxxxxxxxxx> >>>>>>> --- >>>>>>> >>>>>>> Updates in V16: >>>>>>> - Separate the Error Recovery mechanism from patch#7 to patch#8. >>>>>>> >>>>>>> --- >>>>>>> >>>>>>> drivers/mmc/core/core.c | 4 ++ >>>>>>> drivers/mmc/core/core.h | 1 + >>>>>>> drivers/mmc/core/sd_uhs2.c | 80 ++++++++++++++++++++++++++++++++++++++ >>>>>>> include/linux/mmc/host.h | 6 +++ >>>>>>> 4 files changed, 91 insertions(+) >>>>>>> >>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c >>>>>>> index 68496c51a521..18642afc405f 100644 >>>>>>> --- a/drivers/mmc/core/core.c >>>>>>> +++ b/drivers/mmc/core/core.c >>>>>>> @@ -403,6 +403,10 @@ void mmc_wait_for_req_done(struct mmc_host *host, struct mmc_request *mrq) >>>>>>> while (1) { >>>>>>> wait_for_completion(&mrq->completion); >>>>>>> >>>>>>> + if (host->ops->get_cd(host)) >>>>>>> + if (mrq->cmd->error || (mrq->data && mrq->data->error)) >>>>>>> + mmc_sd_uhs2_error_recovery(host, mrq); >>>>>> >>>>>> There are several issues with this: >>>>>> >>>>>> 1. It is not OK to start a request from within the request path >>>>>> because it is recursive: >>>>>> >>>>>> mmc_wait_for_req_done() <-- >>>>>> mmc_sd_uhs2_error_recovery() >>>>>> sd_uhs2_abort_trans() >>>>>> mmc_wait_for_cmd() >>>>>> mmc_wait_for_req() >>>>>> mmc_wait_for_req_done() <-- >>>>>> >>>>>> 2. The mmc block driver does not use this path >>>>>> >>>>>> 3. No need to always call ->get_cd() if there is no error >>>>>> >>>>>> It is worth considering whether the host controller could >>>>>> send the abort command as part of the original request, as >>>>>> is done with the stop command. >>>>>> >>>>> >>>>> Hi, Adrian >>>>> >>>>> 1. It looks like just issuing a command in >>>>> mmc_wait_for_req_done() will cause a recursion. >>>>> I will drop sd_uhs2_abort_trans() and >>>>> sd_uhs2_abort_status_read() in the v17 version. >>>>> 2. I have no idea about this part, could you please give me some advice? >>>> >>>> The mmc block driver sets the ->done() callback and so >>>> mmc_wait_for_req_done() is never called for data transfers. >>>> >>>> That won't matter if the host controller handles doing >>>> the abort command, as was suggested elsewhere. >>>> >>>>> 3. I will try to modify this part in the v17 version. >>>>> >>>>> Thanks, Victor Shih >>>>> >>>>>>> + >>>>>>> cmd = mrq->cmd; >>>>>>> >>>>>>> if (!cmd->error || !cmd->retries || >>>>>>> diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h >>>>>>> index 920323faa834..259d47c8bb19 100644 >>>>>>> --- a/drivers/mmc/core/core.h >>>>>>> +++ b/drivers/mmc/core/core.h >>>>>>> @@ -82,6 +82,7 @@ int mmc_attach_mmc(struct mmc_host *host); >>>>>>> int mmc_attach_sd(struct mmc_host *host); >>>>>>> int mmc_attach_sdio(struct mmc_host *host); >>>>>>> int mmc_attach_sd_uhs2(struct mmc_host *host); >>>>>>> +void mmc_sd_uhs2_error_recovery(struct mmc_host *mmc, struct mmc_request *mrq); >>>>>>> >>>>>>> /* Module parameters */ >>>>>>> extern bool use_spi_crc; >>>>>>> diff --git a/drivers/mmc/core/sd_uhs2.c b/drivers/mmc/core/sd_uhs2.c >>>>>>> index 85939a2582dc..d5acb4e6ccac 100644 >>>>>>> --- a/drivers/mmc/core/sd_uhs2.c >>>>>>> +++ b/drivers/mmc/core/sd_uhs2.c >>>>>>> @@ -1324,3 +1324,83 @@ int mmc_attach_sd_uhs2(struct mmc_host *host) >>>>>>> >>>>>>> return err; >>>>>>> } >>>>>>> + >>>>>>> +static void sd_uhs2_abort_trans(struct mmc_host *mmc) >>>>>>> +{ >>>>>>> + struct mmc_request mrq = {}; >>>>>>> + struct mmc_command cmd = {0}; >>>>>>> + struct uhs2_command uhs2_cmd = {}; >>>>>>> + int err; >>>>>>> + >>>>>>> + mrq.cmd = &cmd; >>>>>>> + mmc->ongoing_mrq = &mrq; >>>>>>> + >>>>>>> + uhs2_cmd.header = UHS2_NATIVE_PACKET | UHS2_PACKET_TYPE_CCMD | >>>>>>> + mmc->card->uhs2_config.node_id; >>>>>>> + uhs2_cmd.arg = ((UHS2_DEV_CMD_TRANS_ABORT & 0xFF) << 8) | >>>>>>> + UHS2_NATIVE_CMD_WRITE | >>>>>>> + (UHS2_DEV_CMD_TRANS_ABORT >> 8); >>>>>>> + >>>>>>> + sd_uhs2_cmd_assemble(&cmd, &uhs2_cmd, 0, 0); >>>>>>> + err = mmc_wait_for_cmd(mmc, &cmd, 0); >>>>>>> + >>>>>>> + if (err) >>>>>>> + pr_err("%s: %s: UHS2 CMD send fail, err= 0x%x!\n", >>>>>>> + mmc_hostname(mmc), __func__, err); >>>>>>> +} >>>>>>> + >>>>>>> +static void sd_uhs2_abort_status_read(struct mmc_host *mmc) >>>>>>> +{ >>>>>>> + struct mmc_request mrq = {}; >>>>>>> + struct mmc_command cmd = {0}; >>>>>>> + struct uhs2_command uhs2_cmd = {}; >>>>>>> + int err; >>>>>>> + >>>>>>> + mrq.cmd = &cmd; >>>>>>> + mmc->ongoing_mrq = &mrq; >>>>>>> + >>>>>>> + uhs2_cmd.header = UHS2_NATIVE_PACKET | >>>>>>> + UHS2_PACKET_TYPE_CCMD | >>>>>>> + mmc->card->uhs2_config.node_id; >>>>>>> + uhs2_cmd.arg = ((UHS2_DEV_STATUS_REG & 0xFF) << 8) | >>>>>>> + UHS2_NATIVE_CMD_READ | >>>>>>> + UHS2_NATIVE_CMD_PLEN_4B | >>>>>>> + (UHS2_DEV_STATUS_REG >> 8); >>>>>>> + >>>>>>> + sd_uhs2_cmd_assemble(&cmd, &uhs2_cmd, 0, 0); >>>>>>> + err = mmc_wait_for_cmd(mmc, &cmd, 0); >>>>>>> + >>>>>>> + if (err) >>>>>>> + pr_err("%s: %s: UHS2 CMD send fail, err= 0x%x!\n", >>>>>>> + mmc_hostname(mmc), __func__, err); >>>>>>> +} >>>>>>> + >>>>>>> +void mmc_sd_uhs2_error_recovery(struct mmc_host *mmc, struct mmc_request *mrq) >>>>>>> +{ >>>>>>> + mmc->ops->uhs2_reset_cmd_data(mmc); >>>>>> >>>>>> The host controller should already have done any resets needed. >>>>>> sdhci already has support for doing that - see host->pending_reset >>>>>> >>>>> >>>>> Hi, Adrian >>>>> >>>>> I'm not sure what this means. Could you please give me more information? >>>> >>>> sdhci_uhs2_request_done() checks sdhci_needs_reset() and does >>>> sdhci_uhs2_reset(). >>>> >>>> sdhci_needs_reset() does not cater for data errors because >>>> the reset for data errors is done directly in what becomes >>>> __sdhci_finish_data_common(). >>>> >>>> You may need to: >>>> 1. add a parameter to __sdhci_finish_data_common() to >>>> skip doing the sdhci reset and instead set >>>> host->pending_reset >>>> 2. amend sdhci_uhs2_request_done() to check for data error >>>> also to decide if a reset is needed >>>> >>> >>> Hi, Adrian >>> >>> If there is any mistake in my understanding, please help me correct it. >>> My understanding is as follows: >>> >>> static bool sdhci_uhs2_request_done(struct sdhci_host *host) >>> { >>> ... >>> if (sdhci_needs_reset(host, mrq)) { >>> ... >>> if (mrq->cmd->error || (mrq->data && mrq->data->error)) >>> sdhci_uhs2_reset_cmd_data(host->mmc); >>> ... >>> } >>> ... >>> } >> >> Like this: >> >> diff --git a/drivers/mmc/host/sdhci-uhs2.c b/drivers/mmc/host/sdhci-uhs2.c >> index 47180429448b..3cb5fe1d488c 100644 >> --- a/drivers/mmc/host/sdhci-uhs2.c >> +++ b/drivers/mmc/host/sdhci-uhs2.c >> @@ -581,7 +581,7 @@ static void sdhci_uhs2_finish_data(struct sdhci_host *host) >> { >> struct mmc_data *data = host->data; >> >> - __sdhci_finish_data_common(host); >> + __sdhci_finish_data_common(host, true); >> >> __sdhci_finish_mrq(host, data->mrq); >> } >> @@ -932,6 +932,12 @@ static void sdhci_uhs2_request(struct mmc_host *mmc, struct mmc_request *mrq) >> * * >> \*****************************************************************************/ >> >> +static bool sdhci_uhs2_needs_reset(struct sdhci_host *host, struct mmc_request *mrq) >> +{ >> + return sdhci_needs_reset(host, mrq) || >> + (!(host->flags & SDHCI_DEVICE_DEAD) && mrq->data && mrq->data->error); >> +} >> + >> static bool sdhci_uhs2_request_done(struct sdhci_host *host) >> { >> unsigned long flags; >> @@ -963,7 +969,7 @@ static bool sdhci_uhs2_request_done(struct sdhci_host *host) >> * The controller needs a reset of internal state machines >> * upon error conditions. >> */ >> - if (sdhci_needs_reset(host, mrq)) { >> + if (sdhci_uhs2_needs_reset(host, mrq)) { >> /* >> * Do not finish until command and data lines are available for >> * reset. Note there can only be one other mrq, so it cannot >> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c >> index ed55aab24f92..55f0db0fc007 100644 >> --- a/drivers/mmc/host/sdhci.c >> +++ b/drivers/mmc/host/sdhci.c >> @@ -1563,7 +1563,7 @@ void sdhci_finish_mrq(struct sdhci_host *host, struct mmc_request *mrq) >> } >> EXPORT_SYMBOL_GPL(sdhci_finish_mrq); >> >> -void __sdhci_finish_data_common(struct sdhci_host *host) >> +void __sdhci_finish_data_common(struct sdhci_host *host, bool defer_reset) >> { >> struct mmc_command *data_cmd = host->data_cmd; >> struct mmc_data *data = host->data; >> @@ -1576,7 +1576,9 @@ void __sdhci_finish_data_common(struct sdhci_host *host) >> * conditions. >> */ >> if (data->error) { >> - if (!host->cmd || host->cmd == data_cmd) >> + if (defer_reset) >> + host->pending_reset = true; >> + else if (!host->cmd || host->cmd == data_cmd) >> sdhci_reset_for(host, REQUEST_ERROR); >> else >> sdhci_reset_for(host, REQUEST_ERROR_DATA_ONLY); >> @@ -1604,7 +1606,7 @@ static void __sdhci_finish_data(struct sdhci_host *host, bool sw_data_timeout) >> { >> struct mmc_data *data = host->data; >> >> - __sdhci_finish_data_common(host); >> + __sdhci_finish_data_common(host, false); >> >> /* >> * Need to send CMD12 if - >> diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h >> index 576b8de2c04e..5ac5234fecf0 100644 >> --- a/drivers/mmc/host/sdhci.h >> +++ b/drivers/mmc/host/sdhci.h >> @@ -840,7 +840,7 @@ void sdhci_prepare_dma(struct sdhci_host *host, struct mmc_data *data); >> bool sdhci_needs_reset(struct sdhci_host *host, struct mmc_request *mrq); >> void __sdhci_finish_mrq(struct sdhci_host *host, struct mmc_request *mrq); >> void sdhci_finish_mrq(struct sdhci_host *host, struct mmc_request *mrq); >> -void __sdhci_finish_data_common(struct sdhci_host *host); >> +void __sdhci_finish_data_common(struct sdhci_host *host, bool defer_reset); >> bool sdhci_present_error(struct sdhci_host *host, struct mmc_command *cmd, bool present); >> u16 sdhci_calc_clk(struct sdhci_host *host, unsigned int clock, >> unsigned int *actual_clock); >> >> > > Hi, Adrian > > Please let me confirm with you. Based on your above comments, will the > sdhci_uhs2_request_done() > be modified to option 1 or option 2? > After testing, when a command error occurs, only executing > sdhci_uhs2_reset() has no effect, > we need to execute the reset DAT Line and CMD Line. So option 3 has no effect. Obviously do whatever reset is necessary. Don't use pending_reset to differentiate which reset, because it doesn't mean anything other than what it says. Instead look at the mrq->data, mrq->data->error etc > > option 1: > static bool sdhci_uhs2_request_done(struct sdhci_host *host) > { > ... > if (sdhci_uhs2_needs_reset(host, mrq)) { > ... > if (host->pending_reset) > sdhci_uhs2_reset_cmd_data(host->mmc); > else > sdhci_uhs2_reset(host, SDHCI_UHS2_SW_RESET); > host->pending_reset = false; > } > ... > } > > option 2: > static bool sdhci_uhs2_request_done(struct sdhci_host *host) > { > ... > if (sdhci_uhs2_needs_reset(host, mrq)) { > ... > sdhci_uhs2_reset_cmd_data(host->mmc); > host->pending_reset = false; > } > ... > } > > option 3: > static bool sdhci_uhs2_request_done(struct sdhci_host *host) > { > ... > if (sdhci_uhs2_needs_reset(host, mrq)) { > ... > sdhci_uhs2_reset(host, SDHCI_UHS2_SW_RESET); > host->pending_reset = false; > } > ... > } > > Thanks, Victor Shih > >>> >>> I have another question. the sdhci_uhs2_request_done() belongs to the patch#18. >>> Can the above content be modified directly in the patch#18? >>> Or does it need to be separated into another patch? >> >> Please update the existing patches. >> >>> >>> Thanks, Victor Shih >>> >>>>> >>>>> Thanks, Victor Shih >>>>> >>>>>>> + >>>>>>> + if (mrq->data) { >>>>>>> + if (mrq->data->error && mmc_card_uhs2(mmc)) { >>>>>>> + if (mrq->cmd) { >>>>>>> + switch (mrq->cmd->error) { >>>>>>> + case ETIMEDOUT: >>>>>>> + case EILSEQ: >>>>>>> + case EIO: >>>>>>> + sd_uhs2_abort_trans(mmc); >>>>>>> + sd_uhs2_abort_status_read(mmc); >>>>>> >>>>>> What is the purpose of sd_uhs2_abort_status_read() here? >>>>>> It is not obvious it does anything. >>>>>> >>>>> >>>>> Hi, Adrian >>>>> >>>>> sd_uhs2_abort_status_read() seems to only have read status, >>>>> I will drop this in the v17 version. >>>>> >>>>> Thanks, Victor Shih >>>>> >>>>>>> + break; >>>>>>> + default: >>>>>>> + break; >>>>>>> + } >>>>>>> + } >>>>>>> + } >>>>>>> + } else { >>>>>>> + if (mrq->cmd) { >>>>>>> + switch (mrq->cmd->error) { >>>>>>> + case ETIMEDOUT: >>>>>>> + sd_uhs2_abort_trans(mmc); >>>>>>> + break; >>>>>>> + } >>>>>>> + } >>>>>>> + } >>>>>>> +} >>>>>>> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h >>>>>>> index fc9520b3bfa4..c914a58f7e1e 100644 >>>>>>> --- a/include/linux/mmc/host.h >>>>>>> +++ b/include/linux/mmc/host.h >>>>>>> @@ -271,6 +271,12 @@ struct mmc_host_ops { >>>>>>> * negative errno in case of a failure or zero for success. >>>>>>> */ >>>>>>> int (*uhs2_control)(struct mmc_host *host, enum sd_uhs2_operation op); >>>>>>> + >>>>>>> + /* >>>>>>> + * The uhs2_reset_cmd_data callback is used to excute reset >>>>>>> + * when a auto command error occurs. >>>>>>> + */ >>>>>>> + void (*uhs2_reset_cmd_data)(struct mmc_host *host); >>>>>>> }; >>>>>>> >>>>>>> struct mmc_cqe_ops { >>>>>> >>>> >>