Re: GPMI iMX6ull timeout on DMA

Hi Miquel

Sorry, it was a difficult day ;). My answer is below.

On Mon, Jul 29, 2019 at 3:22 PM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
>
> Hi Michael,
>
> Michael Nazzareno Trimarchi <michael@xxxxxxxxxxxxxxxxxxxx> wrote on
> Mon, 29 Jul 2019 15:00:04 +0200:
>
> > Hi Miquel
> >
> > On Mon, Jul 29, 2019 at 2:55 PM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> > >
> > > Hi Michael,
> > >
> > > Michael Nazzareno Trimarchi <michael@xxxxxxxxxxxxxxxxxxxx> wrote on
> > > Mon, 29 Jul 2019 14:49:19 +0200:
> > >
> > > > Hi Miquel
> > > >
> > > > On Mon, Jul 29, 2019 at 2:47 PM Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> > > > >
> > > > > Hi Greg,
> > > > >
> > > > > + Boris
> > > > >
> > > > > Greg Ungerer <gerg@xxxxxxxxxx> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
> > > > >
> > > > > > Hi Miquel,
> > > > > >
> > > > > > On 29/7/19 6:36 pm, Miquel Raynal wrote:
> > > > > > > Hi Greg,
> > > > > > >
> > > > > > > One question below.
> > > > > > >
> > > > > > > +Michael
> > > > > > > +Sascha
> > > > > > >
> > > > > > > Hello Michael, here is a similar issue to yours, I know you did not
> > > > > > > have enough time to share your solution but here we have someone else
> > > > > > > reproducing the issue, would you mind sharing a branch or a patch, even
> > > > > > > a WIP one, just to help debugging?
> > > > > > >
> > > > > > > Greg Ungerer <gerg@xxxxxxxxxx> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
> > > > > > >
> > > > > > >> Hi Miquel,
> > > > > > >>
> > > > > > >> I am experiencing a problem with NAND flash DMA timeouts on
> > > > > > >> iMX6ull based boards. The problem is very similar to that
> > > > > > >> described in:
> > > > > > >>
> > > > > > >>     https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma
> > > > > > >>
> > > > > > >> That didn't come to any specific resolution that I could see
> > > > > > >> in that thread.
> > > > > > >>
> > > > > > >> The boot trace on the console for me looks like this:
> > > > > > >>
> > > > > > >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> > > > > > >> nand: Micron MT29F2G08ABAEAWP
> > > > > > >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > > > > > >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
> > > > > > >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: Show BCH registers :
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
> > > > > > >> gpmi-nand 1806000.gpmi-nand: BCH Geometry :
> > > > > > >> GF length              : 13
> > > > > > >> ECC Strength           : 8
> > > > > > >> Page Size in Bytes     : 2110
> > > > > > >> Metadata Size in Bytes : 10
> > > > > > >> ECC Chunk0 Size in Bytes: 512
> > > > > > >> ECC Chunkn Size in Bytes: 512
> > > > > > >> ECC Chunk Count        : 4
> > > > > > >> Payload Size in Bytes  : 2048
> > > > > > >> Auxiliary Size in Bytes: 16
> > > > > > >> Auxiliary Status Offset: 12
> > > > > > >> Block Mark Byte Offset : 1999
> > > > > > >> Block Mark Bit Offset  : 0
> > > > > > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110
> > > > > > >> nand: timing mode 5 not acknowledged by the NAND chip
> > > > > > >
> > > > > > > What is the final timing mode used? Most of us tested in mode 5 I
> > > > > > > guess, maybe mode 4 is broken (don't know if this is the one used here,
> > > > > > > > nor why mode 5 is refused). Can you please try limiting the mode
> > > > > > > to 0, 1, 2... until, hopefully, we narrow down to the failing mode.
> > > > > >
> > > > > > Sure, how to do that?
> > > > >
> > > > > This loop [1] tries to configure each mode (5, 4, ...) until one
> > > > > succeeds (default is 0: must always work). Please try to limit mode to
> > > > > 0, 1, etc.
> > > > >
> > > > > Mode 0 should work.
> > > > >
> > > >
> > > > This is not correct. When all the modes fail it falls back to 0, which
> > > > does not work. Already checked.
> > > > So the fallback was created for this situation.
> > >
> > > Sorry but I don't understand what you are saying.
> > >
> >
> > I said that when a timing mode is not acknowledged, the MTD stack should
> > send a reset command and fall back to timing mode 0. The NAND does not
> > respond anymore.
>
> It depends on what you define by "not acknowledged". What you describe
> is the current situation: if either the NAND controller or the NAND
> chip does not support the mode requested by the core, the core will try
> another (slower) mode until it either finds one or reaches timing
> mode 0.
>
> Unfortunately, we cannot check that "all operations with these timings
> will work" at boot time, it would be very time consuming; especially
> for something that is very likely to be a controller driver issue, and
> that is what happens here: both the controller and the chip acknowledge
> the new timings.
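
For reference, a minimal sketch of the fallback as I understand it from the description above. This is not the kernel's code: pick_timing_mode(), mode_ok_fn and accepts_up_to_3() are made-up names for illustration, and the "check" here only models a controller/chip acknowledging a mode, not whether real operations succeed with it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for "controller and chip acknowledge this mode". */
typedef bool (*mode_ok_fn)(int mode);

/*
 * Try modes from max_mode down to 1 and return the first one that is
 * advertised in supported_mask and accepted by mode_ok().
 * Mode 0 is the unconditional fallback and is never checked.
 */
static int pick_timing_mode(unsigned int supported_mask, int max_mode,
                            mode_ok_fn mode_ok)
{
    for (int mode = max_mode; mode > 0; mode--) {
        if (!(supported_mask & (1u << mode)))
            continue;       /* chip does not advertise this mode */
        if (mode_ok(mode))
            return mode;    /* first (fastest) acknowledged mode wins */
    }
    return 0;               /* nothing acknowledged: stay in mode 0 */
}

/* Example check (assumption): pretend modes above 3 are rejected. */
static bool accepts_up_to_3(int mode)
{
    return mode <= 3;
}
```

Lowering max_mode (5, then 4, 3, ...) is one way to do the "limit the mode" test that was requested: with the example check, pick_timing_mode(0x3f, 5, accepts_up_to_3) settles on mode 3, and a cap of 2 gives mode 2.
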
>
> >
> > > Are you telling me that you already tried mode 0 and that it did not
> > > work better than other timings?
> > >
> >
> > I only forced the use of different modes but never tried mode 0 ;) just
> > because it should be the normal fallback
>
> Unless there is a timing calculation issue in the controller driver.
> Greg, can you please find the quickest working mode (starting from 0,
> of course, to ensure mode 0 is stable).
>
> [...]
>
> > > > > > >
> > > > > > > > That's strange... I don't get what would produce such an unstable issue.
> > > > > >
> > > > > > My initial guess is that the calculated timing is very marginal.
> > > > >
> > > > > What do you mean by "marginal"?
> > > > >
> > > >
> > > > I don't think it is the timing calculation. I have tried to use the same timing
> > > > as before but when those are applied. Is it possible?
> > >
> > >                                       ^
> > > I suppose the end of the sentence is missing?
>
> Michael, what did you mean here?
>
commit 02c786627b93b3c3286570f793294816286ff397
Author: Michael Trimarchi <michael@xxxxxxxxxxxxxxxxxxxx>
Date:   Fri Oct 5 09:46:29 2018 +0200

    Revert "mtd: rawnand: gpmi: use core timings instead of an empirical derivation"

    This reverts commit b1206122069aadabe1a8c50789277a978aaa4df7.

    Change-Id: Icd0ddcd5e3ac7d82932bbf412299cca424cbc571
    Jira-Id: WAN-50
    Signed-off-by: Michael Trimarchi <michael@xxxxxxxxxxxxxxxxxxxx>

Reverting this one by itself does not fix the problem. Right now I have to
revert both this one and

commit 6ab543c1924f77957004994bd6806a9daa45f903 (tag: MMI_004_011_R02)
Author: Michael Trimarchi <michael@xxxxxxxxxxxxxxxxxxxx>
Date:   Fri Oct 5 09:46:44 2018 +0200

    Revert "mtd: rawnand: gpmi: support ->setup_data_interface()"

    This reverts commit 76e1a0086a0c3276b384f77905345e0fcc886fdd.

    Change-Id: I60fb6f874364d1deeda3424d4508553a38ac9b1a
    Jira-Id: WAN-50
    Signed-off-by: Michael Trimarchi <michael@xxxxxxxxxxxxxxxxxxxx>

I did not have time to finish understanding why this fixed my problem.
Michael


>
> Thanks,
> Miquèl



-- 
| Michael Nazzareno Trimarchi                     Amarula Solutions BV |
| COO  -  Founder                                      Cruquiuskade 47 |
| +31(0)851119172                                 Amsterdam 1018 AM NL |
|                  [`as] http://www.amarulasolutions.com               |

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/



