RE: [PATCH V4 09/11] mmc: block: Add CQE support

> -----Original Message-----
> From: Adrian Hunter [mailto:adrian.hunter@xxxxxxxxx]
> Sent: Thursday, August 10, 2017 6:19 PM
> To: Bough Chen <haibo.chen@xxxxxxx>; Shawn Lin <shawn.lin@rock-
> chips.com>
> Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>; linux-mmc <linux-
> mmc@xxxxxxxxxxxxxxx>; Alex Lemberg <alex.lemberg@xxxxxxxxxxx>; Mateusz
> Nowak <mateusz.nowak@xxxxxxxxx>; Yuliy Izrailov
> <Yuliy.Izrailov@xxxxxxxxxxx>; Jaehoon Chung <jh80.chung@xxxxxxxxxxx>;
> Dong Aisheng <dongas86@xxxxxxxxx>; Das Asutosh
> <asutoshd@xxxxxxxxxxxxxx>; Zhangfei Gao <zhangfei.gao@xxxxxxxxx>;
> Dorfman Konstantin <kdorfman@xxxxxxxxxxxxxx>; Sahitya Tummala
> <stummala@xxxxxxxxxxxxxx>; Harjani Ritesh <riteshh@xxxxxxxxxxxxxx>; Venu
> Byravarasu <vbyravarasu@xxxxxxxxxx>; Linus Walleij <linus.walleij@xxxxxxxxxx>
> Subject: Re: [PATCH V4 09/11] mmc: block: Add CQE support
> 
> On 09/08/17 15:45, Adrian Hunter wrote:
> > On 08/09/2017 01:35 PM, Bough Chen wrote:
> >>> -----Original Message-----
> >>> From: linux-mmc-owner@xxxxxxxxxxxxxxx [mailto:linux-mmc-
> >>> owner@xxxxxxxxxxxxxxx] On Behalf Of Bough Chen
> >>> Sent: Wednesday, August 09, 2017 5:42 PM
> >>> To: Adrian Hunter <adrian.hunter@xxxxxxxxx>; Shawn Lin
> >>> <shawn.lin@rock- chips.com>
> >>> Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>; linux-mmc <linux-
> >>> mmc@xxxxxxxxxxxxxxx>; Alex Lemberg <alex.lemberg@xxxxxxxxxxx>;
> >>> Mateusz Nowak <mateusz.nowak@xxxxxxxxx>; Yuliy Izrailov
> >>> <Yuliy.Izrailov@xxxxxxxxxxx>; Jaehoon Chung
> >>> <jh80.chung@xxxxxxxxxxx>; Dong Aisheng <dongas86@xxxxxxxxx>; Das
> >>> Asutosh <asutoshd@xxxxxxxxxxxxxx>; Zhangfei Gao
> >>> <zhangfei.gao@xxxxxxxxx>; Dorfman Konstantin
> >>> <kdorfman@xxxxxxxxxxxxxx>; Sahitya Tummala
> >>> <stummala@xxxxxxxxxxxxxx>; Harjani Ritesh <riteshh@xxxxxxxxxxxxxx>;
> >>> Venu Byravarasu <vbyravarasu@xxxxxxxxxx>; Linus Walleij
> >>> <linus.walleij@xxxxxxxxxx>
> >>> Subject: RE: [PATCH V4 09/11] mmc: block: Add CQE support
> >>>
> >>>> -----Original Message-----
> >>>> From: linux-mmc-owner@xxxxxxxxxxxxxxx [mailto:linux-mmc-
> >>>> owner@xxxxxxxxxxxxxxx] On Behalf Of Adrian Hunter
> >>>> Sent: Wednesday, August 09, 2017 4:31 PM
> >>>> To: Bough Chen <haibo.chen@xxxxxxx>; Shawn Lin <shawn.lin@rock-
> >>>> chips.com>
> >>>> Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>; linux-mmc <linux-
> >>>> mmc@xxxxxxxxxxxxxxx>; Alex Lemberg <alex.lemberg@xxxxxxxxxxx>;
> >>> Mateusz
> >>>> Nowak <mateusz.nowak@xxxxxxxxx>; Yuliy Izrailov
> >>>> <Yuliy.Izrailov@xxxxxxxxxxx>; Jaehoon Chung
> >>>> <jh80.chung@xxxxxxxxxxx>; Dong Aisheng <dongas86@xxxxxxxxx>;
> Das
> >>>> Asutosh <asutoshd@xxxxxxxxxxxxxx>; Zhangfei Gao
> >>>> <zhangfei.gao@xxxxxxxxx>; Dorfman Konstantin
> >>>> <kdorfman@xxxxxxxxxxxxxx>; Sahitya Tummala
> >>>> <stummala@xxxxxxxxxxxxxx>; Harjani Ritesh <riteshh@xxxxxxxxxxxxxx>;
> >>>> Venu Byravarasu <vbyravarasu@xxxxxxxxxx>; Linus Walleij
> >>>> <linus.walleij@xxxxxxxxxx>
> >>>> Subject: Re: [PATCH V4 09/11] mmc: block: Add CQE support
> >>>>
> >>>> On 09/08/17 11:16, Adrian Hunter wrote:
> >>>>> On 09/08/17 10:57, Bough Chen wrote:
> >>>>>>> -----Original Message-----
> >>>>>>> From: linux-mmc-owner@xxxxxxxxxxxxxxx [mailto:linux-mmc-
> >>>>>>> owner@xxxxxxxxxxxxxxx] On Behalf Of Adrian Hunter
> >>>>>>> Sent: Wednesday, August 09, 2017 1:58 PM
> >>>>>>> To: Shawn Lin <shawn.lin@xxxxxxxxxxxxxx>; Bough Chen
> >>>>>>> <haibo.chen@xxxxxxx>
> >>>>>>> Cc: Ulf Hansson <ulf.hansson@xxxxxxxxxx>; linux-mmc <linux-
> >>>>>>> mmc@xxxxxxxxxxxxxxx>; Alex Lemberg <alex.lemberg@xxxxxxxxxxx>;
> >>>>>>> Mateusz Nowak <mateusz.nowak@xxxxxxxxx>; Yuliy Izrailov
> >>>>>>> <Yuliy.Izrailov@xxxxxxxxxxx>; Jaehoon Chung
> >>>>>>> <jh80.chung@xxxxxxxxxxx>; Dong Aisheng <dongas86@xxxxxxxxx>;
> >>> Das
> >>>>>>> Asutosh <asutoshd@xxxxxxxxxxxxxx>; Zhangfei Gao
> >>>>>>> <zhangfei.gao@xxxxxxxxx>; Dorfman Konstantin
> >>>>>>> <kdorfman@xxxxxxxxxxxxxx>; Sahitya Tummala
> >>>>>>> <stummala@xxxxxxxxxxxxxx>; Harjani Ritesh
> >>>>>>> <riteshh@xxxxxxxxxxxxxx>; Venu Byravarasu
> >>>>>>> <vbyravarasu@xxxxxxxxxx>; Linus Walleij
> >>>>>>> <linus.walleij@xxxxxxxxxx>
> >>>>>>> Subject: Re: [PATCH V4 09/11] mmc: block: Add CQE support
> >>>>>>>
> >>>>>>> On 09/08/17 03:55, Shawn Lin wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> On 2017/8/8 20:07, Bough Chen wrote:
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Adrian Hunter [mailto:adrian.hunter@xxxxxxxxx]
> >>>>>>>>>> Sent: Friday, July 21, 2017 5:50 PM
> >>>>>>>>>> To: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
> >>>>>>>>>> Cc: linux-mmc <linux-mmc@xxxxxxxxxxxxxxx>; Bough Chen
> >>>>>>>>>> <haibo.chen@xxxxxxx>; Alex Lemberg
> >>> <alex.lemberg@xxxxxxxxxxx>;
> >>>>>>>>>> Mateusz Nowak <mateusz.nowak@xxxxxxxxx>; Yuliy Izrailov
> >>>>>>>>>> <Yuliy.Izrailov@xxxxxxxxxxx>; Jaehoon Chung
> >>>>>>>>>> <jh80.chung@xxxxxxxxxxx>; Dong Aisheng
> >>> <dongas86@xxxxxxxxx>;
> >>>> Das
> >>>>>>>>>> Asutosh <asutoshd@xxxxxxxxxxxxxx>; Zhangfei Gao
> >>>>>>>>>> <zhangfei.gao@xxxxxxxxx>; Dorfman Konstantin
> >>>>>>>>>> <kdorfman@xxxxxxxxxxxxxx>; David Griego
> >>>>>>>>>> <david.griego@xxxxxxxxxx>; Sahitya Tummala
> >>>>>>>>>> <stummala@xxxxxxxxxxxxxx>; Harjani Ritesh
> >>>>>>>>>> <riteshh@xxxxxxxxxxxxxx>; Venu Byravarasu
> >>>>>>>>>> <vbyravarasu@xxxxxxxxxx>; Linus Walleij
> >>>>>>>>>> <linus.walleij@xxxxxxxxxx>; Shawn Lin
> >>>>>>>>>> <shawn.lin@xxxxxxxxxxxxxx>
> >>>>>>>>>> Subject: [PATCH V4 09/11] mmc: block: Add CQE support
> >>>>>>>>>>
> >>>>>>>>>> Add CQE support to the block driver, including:
> >>>>>>>>>>     - optionally using DCMD for flush requests
> >>>>>>>>>>     - manually issuing discard requests
> >>>>>>>>>>     - issuing read / write requests to the CQE
> >>>>>>>>>>     - supporting block-layer timeouts
> >>>>>>>>>>     - handling recovery
> >>>>>>>>>>     - supporting re-tuning
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> >>>>>>>>>> ---
> >>>>>>>>>>   drivers/mmc/core/block.c | 195 ++++++++++++++++++++++++++++++++-
> >>>>>>>>>>   drivers/mmc/core/block.h |   7 ++
> >>>>>>>>>>   drivers/mmc/core/queue.c | 273 ++++++++++++++++++++++++++++++++++++++++++++++-
> >>>>>>>>>>   drivers/mmc/core/queue.h |  42 +++++++-
> >>>>>>>>>>   4 files changed, 510 insertions(+), 7 deletions(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> >>>>>>>>>> index 915290c74363..2d25115637b7 100644
> >>>>>>>>>> --- a/drivers/mmc/core/block.c
> >>>>>>>>>> +++ b/drivers/mmc/core/block.c
> >>>>>>>>>> @@ -109,6 +109,7 @@ struct mmc_blk_data {
> >>>>>>>>>>   #define MMC_BLK_WRITE        BIT(1)
> >>>>>>>>>>   #define MMC_BLK_DISCARD        BIT(2)
> >>>>>>>>>>   #define MMC_BLK_SECDISCARD    BIT(3)
> >>>>>>>>>> +#define MMC_BLK_CQE_RECOVERY    BIT(4)
> >>>>>>>>>>
> >>>>>>>>>>       /*
> >>>>>>>>>>        * Only set in main mmc_blk_data associated
> >>>>>>>>>> @@ -1612,6 +1613,198 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
> >>>>>>>>>>           *do_data_tag_p = do_data_tag;
> >>>>>>>>>>   }
> >>>>>>>>>>
> >>>>>>>>>> +#define MMC_CQE_RETRIES 2
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>> +        blk_queue_rq_timed_out(mq->queue, mmc_cqe_timed_out);
> >>>>>>>>>> +        blk_queue_rq_timeout(mq->queue, 60 * HZ);
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> ------8<-------
> >>>>>>>>
> >>>>>>>>> Hi Adrian,
> >>>>>>>>>
> >>>>>>>>> I have been running CMDQ stress tests and found one issue.
> >>>>>>>>> Our i.MX8QXP-ARM2 board has 3GB of RAM and a 32GB eMMC.
> >>>>>>>>> 'free -m' reports 2800M of total memory and 2500M of
> >>>>>>>>> free memory.
> >>>>>>>>>
> >>>>>>>>> Formatting the eMMC with 'mkfs.ext4' under
> >>>>>>>>> HS400ES CMDQ mode works fine.
> >>>>>>>>>
> >>>>>>>>> The following command also stress tests CMDQ without any problem:
> >>>>>>>>> bonnie++ -d /run/media/mmcblk0p1/ -u 0:0 -s 2048 -r 1024
> >>>>>>>>>
> >>>>>>>>> But when I change to use a large file size to do the same
> >>>>>>>>> stress test, using
> >>>>>>>>> bonnie++ -d /run/media/mmcblk0p1/ -u 0:0 -s 4096 -r 2048
> >>>>>>>>> or
> >>>>>>>>> bonnie++ -d /run/media/mmcblk0p1/ -u 0:0 -s 5600
> >>>>>>>>>
> >>>>>>>>> I get the following dump message.  According to the log,
> >>>>>>>>> mmc_cqe_timed_out() was triggered.
> >>>>>>>>> It seems mmc was blocked somewhere.
> >>>>>>>>> I then tried to debug this issue by enabling MMC_DEBUG in the config
> >>>>>>>>> and repeating the same test with detailed command-sending information
> >>>>>>>>> printed on the console, but could not reproduce it that way.
> >>>>>>>
> >>>>>>> mmc_cqe_timed_out() is a 60 second timeout provided by the block layer.
> >>>>>>> Refer to "blk_queue_rq_timeout(mq->queue, 60 * HZ)" in mmc_init_queue().
> >>>>>>> 60s is quite a long time so I would first want to determine if
> >>>>>>> the task was really queued that long.  I would instrument some
> >>>>>>> code into cqhci_request() to record the start time on struct
> >>>>>>> mmc_request, and then print the time taken when there is a problem.
> >>>>>>>
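
As a rough user-space illustration of the timing idea suggested above (this is not the actual kernel change, which is not shown in this thread): take a timestamp when the request is issued and compute the elapsed microseconds when the timeout fires. The struct and function names are hypothetical, and timespec_get() is used only to keep the sketch standard C; in the kernel one would more likely store a ktime_t obtained from ktime_get() on the request.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical per-request bookkeeping; not a kernel structure. */
struct req_timing {
    struct timespec start;
};

/* Record the issue time, as one might at the top of cqhci_request(). */
void req_timing_start(struct req_timing *t)
{
    /* Wall-clock time for portability; a monotonic clock is preferable. */
    timespec_get(&t->start, TIME_UTC);
}

/* Microseconds between two timestamps (sub-microsecond part truncated). */
int64_t elapsed_us(const struct timespec *start, const struct timespec *end)
{
    int64_t sec = (int64_t)end->tv_sec - (int64_t)start->tv_sec;
    int64_t nsec = (int64_t)end->tv_nsec - (int64_t)start->tv_nsec;

    return sec * 1000000 + nsec / 1000;
}

/* Elapsed time since req_timing_start(), for printing in the timeout path. */
int64_t req_timing_elapsed_us(const struct req_timing *t)
{
    struct timespec now;

    timespec_get(&now, TIME_UTC);
    return elapsed_us(&t->start, &now);
}
```

Printing req_timing_elapsed_us() for every outstanding request, not only the one that timed out, would show whether anything at all is still being processed.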
> >>>>>>
> >>>>>> Hi Adrian,
> >>>>>>
> >>>>>> Following your suggestion, I added the following code to print the time.
> >>>>>> When the issue happens, it seems the request really has been pending for over 60s!
> >>>>>>
> >>>>>> done
> >>>>>> Writing intelligently...[  689.209548] mmc0: cqhci: timeout for tag 9
> >>>>>> [  689.213658] the mrq all use 62123742 us
> >>>>>> [  689.217487] mmc0: cqhci: ============ CQHCI REGISTER DUMP ===========
> >>>>>> [  689.223927] mmc0: cqhci: Caps:      0x0000310a | Version:  0x00000510
> >>>>>> [  689.230363] mmc0: cqhci: Config:    0x00001001 | Control:  0x00000000
> >>>>>> [  689.236800] mmc0: cqhci: Int stat:  0x00000000 | Int enab: 0x00000006
> >>>>>> [  689.243238] mmc0: cqhci: Int sig:   0x00000006 | Int Coal: 0x00000000
> >>>>>> [  689.249675] mmc0: cqhci: TDL base:  0x90079000 | TDL up32: 0x00000000
> >>>>>> [  689.256113] mmc0: cqhci: Doorbell:  0x1fffffff | TCN:      0x00000000
> >>>>>> [  689.262550] mmc0: cqhci: Dev queue: 0x1fffefff | Dev Pend: 0x1fff7fff
> >>>>>> [  689.268988] mmc0: cqhci: Task clr:  0x00000000 | SSC1:     0x00011000
> >>>>>> [  689.275425] mmc0: cqhci: SSC2:      0x00000001 | DCMD rsp: 0x00000800
> >>>>>> [  689.281862] mmc0: cqhci: RED mask:  0xfdf9a080 | TERRI:    0x00000000
> >>>>>> [  689.288300] mmc0: cqhci: Resp idx:  0x0000002f | Resp arg: 0x00000900
> >>>>>> [  689.294737] mmc0: sdhci: ============ SDHCI REGISTER DUMP ===========
> >>>>>> [  689.301176] mmc0: sdhci: Sys addr:  0xb602f000 | Version:  0x00000002
> >>>>>> [  689.307612] mmc0: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000400
> >>>>>> [  689.314050] mmc0: sdhci: Argument:  0x000f0400 | Trn mode: 0x00000023
> >>>>>> [  689.320487] mmc0: sdhci: Present:   0x01fd858f | Host ctl: 0x00000030
> >>>>>> [  689.326925] mmc0: sdhci: Power:     0x00000002 | Blk gap:  0x00000080
> >>>>>> [  689.333362] mmc0: sdhci: Wake-up:   0x00000008 | Clock:    0x0000000f
> >>>>>> [  689.339800] mmc0: sdhci: Timeout:   0x0000008f | Int stat: 0x00000000
> >>>>>> [  689.346237] mmc0: sdhci: Int enab:  0x107f4000 | Sig enab: 0x107f4000
> >>>>>> [  689.352674] mmc0: sdhci: AC12 err:  0x00000000 | Slot int: 0x00000502
> >>>>>> [  689.359113] mmc0: sdhci: Caps:      0x07eb0000 | Caps_1:   0x8000b407
> >>>>>> [  689.365549] mmc0: sdhci: Cmd:       0x00002c1a | Max curr: 0x00ffffff
> >>>>>> [  689.371987] mmc0: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0xffffffff
> >>>>>> [  689.378424] mmc0: sdhci: Resp[2]:   0x328f5903 | Resp[3]:  0x00d02700
> >>>>>> [  689.384861] mmc0: sdhci: Host ctl2: 0x00000008
> >>>>>> [  689.389302] mmc0: sdhci: ADMA Err:  0x00000009 | ADMA Ptr: 0x9009a400
> >>>>>> [  689.395737] mmc0: sdhci: ============================================
> >>>>>> [  689.402212] mmc0: running CQE recovery
> >>>>>
> >>>>> Tag 9 has been queued (bit set in Dev Pend) which means it is up
> >>>>> to the eMMC to select it for execution.  You should dump the times
> >>>>> for the other mrq's to see how long they have been waiting and try
> >>>>> to determine if anything is being processed.
> >>>>>
> >>>>> If the eMMC is just taking a really long time to process tasks we
> >>>>> could extend the timeout, but it is hard to see how that is
> >>>>> acceptable to a final product.  At this point it looks like the
> >>>>> eMMC may have a flaw in the way it selects tasks for execution.
> >>>>
> >>>> No, that is wrong sorry, the task is in the QSR (Dev queue) so it
> >>>> is the CQE that has not selected it.
> >>>
> >>> The timed-out tag is 9. For Dev queue: 0x1fffefff, bit 9 is 1, which
> >>> means task 9 is already queued in the eMMC.
> >>> For Dev Pend: 0x1fff7fff, bit 9 is also 1, which means the CQE has
> >>> already sent CMD44 and CMD45, but has still not sent CMD46/47.  It seems
> >>> our CQE has kept tag 9 pending for over 60s! I will check with our IC
> >>> guys to confirm the hardware mechanism.
> >>>
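
The bit checks above can be reproduced mechanically. A minimal sketch, using the two register values from the dump; the helper names are made up for illustration and do not come from the cqhci driver.

```c
#include <stdint.h>
#include <stdio.h>

/* Return 1 if the given tag's bit is set in a 32-bit task bitmap register. */
int tag_bit_set(uint32_t reg, unsigned int tag)
{
    return tag < 32 && ((reg >> tag) & 1u);
}

/* Print every tag whose bit is clear among the lowest 'ntags' bits. */
void print_clear_tags(const char *name, uint32_t reg, unsigned int ntags)
{
    printf("%s clear tags:", name);
    for (unsigned int tag = 0; tag < ntags && tag < 32; tag++)
        if (!tag_bit_set(reg, tag))
            printf(" %u", tag);
    printf("\n");
}
```

With the values from the dump, tag_bit_set(0x1fffefff, 9) and tag_bit_set(0x1fff7fff, 9) both return 1, which matches the reading above that bit 9 is set in both registers.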
> >>
> >> For the eMMC chip, the sequential write speed measured with 'dd' is
> >> around 100MB/s.  If each tag tries to write 1GB of data, each tag needs
> >> 10s to complete, so once the number of pending tags exceeds 6, the 60s
> >> timeout will be triggered.
> >
> > The request size is limited by the block layer due to host controller
> > parameters.  In the case of SDHCI to 512KiB.  So each tag is at most 512KiB.
> >
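
To put rough numbers on the point above, here is a back-of-the-envelope calculation. It assumes the roughly 100MB/s sequential write speed quoted earlier, the 512KiB maximum request size, and a queue depth of 32 tasks (the depth is an assumption, not stated in this thread). The function name is hypothetical, and all command and scheduling overhead is ignored.

```c
#include <stdint.h>

/*
 * Time in microseconds to transfer 'ntags' requests of 'bytes_per_tag' bytes
 * each at 'bytes_per_sec', ignoring command and scheduling overhead.
 */
uint64_t drain_time_us(uint64_t ntags, uint64_t bytes_per_tag,
                       uint64_t bytes_per_sec)
{
    if (bytes_per_sec == 0)
        return 0; /* avoid division by zero; no meaningful answer */
    return ntags * bytes_per_tag * 1000000u / bytes_per_sec;
}
```

Under these assumptions, drain_time_us(32, 512 * 1024, 100000000) comes to about 0.17 s, far below the 60 s block-layer timeout, whereas 6 tags of 1GB each would indeed take 60 s.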
> 
> I just found a bug in 32-bit DMA.  Are you using 32-bit DMA?  That could also be
> causing your problem.  I will send a new version of the patches with a fix,
> probably later today.

Yes, I'm using 32-bit ADMA.