RE: [PATCH V1] mmc: core: Enable force hw reset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Sent: Wednesday, June 1, 2022 11:35 AM
> To: Sarthak Garg (QUIC) <quic_sartgarg@xxxxxxxxxxx>; Kamasali Satyanarayan
> (Consultant) (QUIC) <quic_kamasali@xxxxxxxxxxx>; quic_spathi
> <quic_spathi@xxxxxxxxxxx>; ulf.hansson@xxxxxxxxxx; axboe@xxxxxxxxx;
> avri.altman@xxxxxxx; kch@xxxxxxxxxx; CLoehle@xxxxxxxxxxxxxx;
> swboyd@xxxxxxxxxxxx; digetx@xxxxxxxxx; bigeasy@xxxxxxxxxxxxx; linux-
> mmc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH V1] mmc: core: Enable force hw reset
> 
> On 27/05/22 15:44, Adrian Hunter wrote:
> > On 25/05/22 10:06, Sarthak Garg (QUIC) wrote:
> >> Hi Adrian,
> >>
> >> Thanks for the review.
> >> Please find comments inline.
> >>
> >> Thanks,
> >> Sarthak
> >>
> >>> -----Original Message-----
> >>> From: Kamasali Satyanarayan (Consultant) (QUIC)
> >>> <quic_kamasali@xxxxxxxxxxx>
> >>> Sent: Tuesday, May 24, 2022 5:33 PM
> >>> To: 'Adrian Hunter' <adrian.hunter@xxxxxxxxx>; quic_spathi
> >>> <quic_spathi@xxxxxxxxxxx>; ulf.hansson@xxxxxxxxxx;
> >>> riteshh@xxxxxxxxxxxxxx; asutoshd@xxxxxxxxxxxxxx; axboe@xxxxxxxxx;
> >>> avri.altman@xxxxxxx; kch@xxxxxxxxxx; CLoehle@xxxxxxxxxxxxxx;
> >>> swboyd@xxxxxxxxxxxx; digetx@xxxxxxxxx; bigeasy@xxxxxxxxxxxxx;
> >>> linux-mmc@xxxxxxxxxxxxxxx; linux- kernel@xxxxxxxxxxxxxxx; Sarthak
> >>> Garg (QUIC) <quic_sartgarg@xxxxxxxxxxx>
> >>> Cc: Shaik Sajida Bhanu <sbhanu@xxxxxxxxxxxxxx>
> >>> Subject: RE: [PATCH V1] mmc: core: Enable force hw reset
> >>>
> >>> Hi,
> >>> These patches will be further taken by Sarthak.
> >>>
> >>> Thanks,
> >>> Satya
> >>>
> >>> -----Original Message-----
> >>> From: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> >>> Sent: Wednesday, April 27, 2022 6:04 PM
> >>> To: quic_spathi <quic_spathi@xxxxxxxxxxx>; ulf.hansson@xxxxxxxxxx;
> >>> riteshh@xxxxxxxxxxxxxx; asutoshd@xxxxxxxxxxxxxx; axboe@xxxxxxxxx;
> >>> avri.altman@xxxxxxx; kch@xxxxxxxxxx; CLoehle@xxxxxxxxxxxxxx;
> >>> swboyd@xxxxxxxxxxxx; digetx@xxxxxxxxx; bigeasy@xxxxxxxxxxxxx; linux-
> >>> mmc@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> >>> Cc: Shaik Sajida Bhanu <sbhanu@xxxxxxxxxxxxxx>; Kamasali
> >>> Satyanarayan
> >>> (Consultant) (QUIC) <quic_kamasali@xxxxxxxxxxx>
> >>> Subject: Re: [PATCH V1] mmc: core: Enable force hw reset
> >>>
> >>> On 26/04/22 11:30, Srinivasarao Pathipati wrote:
> >>>> From: Shaik Sajida Bhanu <sbhanu@xxxxxxxxxxxxxx>
> >>>>
> >>>> During error recovery set need hw reset to handle ICE error where
> >>>> cqe reset is must.
> >>>
> >>> How do you get ICE errors?  Doesn't it mean either the hardware is
> >>> broken or the configuration is broken?
> >>
> >> This patch is not intended for ice errors and will update the commit text in V2.
> >> Long back intermittent recovery failures were observed but after forcing
> hardware reset during error recovery we have no single instance of recovery
> failure. This have made recovery more robust for us.
> >> Any suggestions on how we can take it forward will be highly appreciated.
> >
> > We can definitely go forward, but with hopefully a little more
> > explanation first.
> >
> > It is preferable to be able to explain why changes are being made.
> >
> > Do you have any logs or other information on the recovery failures?
> > Are you able to reproduce the problem?
> >
> > I notice you always do mmc_blk_reset_success().  Does that mean you
> > sometimes need several resets in a row?
> >
> > A potential issue that I notice, is that the recovery does not
> > explicitly deal with the case that the card's command queue has been
> > disabled e.g. due to RPMB access or IOCTL commands. Are you using
> > either of those?
> 
> Looking closer, the command queue is reenabled on the error path so that is not
> a concern, but I did send:
> 
> https://lore.kernel.org/linux-mmc/20220531171922.76080-1-
> adrian.hunter@xxxxxxxxx/
> 
> For your case, if it is not about ICE errors, why not add only:
> 
> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c index
> f4a1281658db..a2ee850a5c16 100644
> --- a/drivers/mmc/core/block.c
> +++ b/drivers/mmc/core/block.c
> @@ -1497,7 +1497,7 @@ void mmc_blk_cqe_recovery(struct mmc_queue *mq)
>  	pr_debug("%s: CQE recovery start\n", mmc_hostname(host));
> 
>  	err = mmc_cqe_recovery(host);
> -	if (err)
> +	if (err || host->cqe_recovery_reset_always)
>  		mmc_blk_reset(mq->blkdata, host,
> MMC_BLK_CQE_RECOVERY);
>  	mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
> 
> And then just set it in your host controller driver probe function.
> 
> 	host->cqe_recovery_reset_always = true;
> 
> >
Thanks Adrian for considering and posting https://lore.kernel.org/linux-mmc/20220531171922.76080-1-adrian.hunter@xxxxxxxxx/ and for the above suggestion as well.
Will soon post the updated patches.
> >>>
> >>>>
> >>>> Signed-off-by: Shaik Sajida Bhanu <sbhanu@xxxxxxxxxxxxxx>
> >>>> Signed-off-by: kamasali <quic_kamasali@xxxxxxxxxxx>
> >>>> Signed-off-by: Srinivasarao Pathipati <quic_spathi@xxxxxxxxxxx>
> >>>> ---
> >>>>  drivers/mmc/core/block.c      | 8 +++++---
> >>>>  drivers/mmc/host/cqhci-core.c | 7 +++++--
> >>>>  include/linux/mmc/host.h      | 1 +
> >>>>  3 files changed, 11 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> >>>> index
> >>>> b35e7a9..f63bf33 100644
> >>>> --- a/drivers/mmc/core/block.c
> >>>> +++ b/drivers/mmc/core/block.c
> >>>> @@ -1482,10 +1482,12 @@ void mmc_blk_cqe_recovery(struct
> mmc_queue
> >>> *mq)
> >>>>  	pr_debug("%s: CQE recovery start\n", mmc_hostname(host));
> >>>>
> >>>>  	err = mmc_cqe_recovery(host);
> >>>> -	if (err)
> >>>> +	if (err || host->need_hw_reset) {
> >>>>  		mmc_blk_reset(mq->blkdata, host,
> >>> MMC_BLK_CQE_RECOVERY);
> >>>> -	else
> >>>> -		mmc_blk_reset_success(mq->blkdata,
> >>> MMC_BLK_CQE_RECOVERY);
> >>>> +		if (host->need_hw_reset)
> >>>> +			host->need_hw_reset = false;
> >>>> +	}
> >>>> +	mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
> >>>>
> >>>>  	pr_debug("%s: CQE recovery done\n", mmc_hostname(host));  } diff
> >>>> --git a/drivers/mmc/host/cqhci-core.c
> >>>> b/drivers/mmc/host/cqhci-core.c index b0d30c3..311b510 100644
> >>>> --- a/drivers/mmc/host/cqhci-core.c
> >>>> +++ b/drivers/mmc/host/cqhci-core.c
> >>>> @@ -812,18 +812,21 @@ static void cqhci_finish_mrq(struct mmc_host
> >>>> *mmc, unsigned int tag)  irqreturn_t cqhci_irq(struct mmc_host
> >>>> *mmc, u32
> >>> intmask, int cmd_error,
> >>>>  		      int data_error)
> >>>>  {
> >>>> -	u32 status;
> >>>> +	u32 status, ice_err;
> >>>>  	unsigned long tag = 0, comp_status;
> >>>>  	struct cqhci_host *cq_host = mmc->cqe_private;
> >>>>
> >>>>  	status = cqhci_readl(cq_host, CQHCI_IS);
> >>>>  	cqhci_writel(cq_host, status, CQHCI_IS);
> >>>> +	ice_err = status & (CQHCI_IS_GCE | CQHCI_IS_ICCE);
> >>>>
> >>>>  	pr_debug("%s: cqhci: IRQ status: 0x%08x\n", mmc_hostname(mmc),
> >>>> status);
> >>>>
> >>>>  	if ((status & (CQHCI_IS_RED | CQHCI_IS_GCE | CQHCI_IS_ICCE)) ||
> >>>> -	    cmd_error || data_error)
> >>>> +	    cmd_error || data_error || ice_err){
> >>>> +		mmc->need_hw_reset = true;
> >>>>  		cqhci_error_irq(mmc, status, cmd_error, data_error);
> >>>> +	}
> >>>>
> >>>>  	if (status & CQHCI_IS_TCC) {
> >>>>  		/* read TCN and complete the request */ diff --git
> >>>> a/include/linux/mmc/host.h b/include/linux/mmc/host.h index
> >>>> c193c50..3d00bcf 100644
> >>>> --- a/include/linux/mmc/host.h
> >>>> +++ b/include/linux/mmc/host.h
> >>>> @@ -492,6 +492,7 @@ struct mmc_host {
> >>>>  	int			cqe_qdepth;
> >>>>  	bool			cqe_enabled;
> >>>>  	bool			cqe_on;
> >>>> +	bool                    need_hw_reset;
> >>>>
> >>>>  	/* Inline encryption support */
> >>>>  #ifdef CONFIG_MMC_CRYPTO
> >>>
> >>
> >





[Index of Archives]     [Linux Memonry Technology]     [Linux USB Devel]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux