Hi Miquel

On Mon, Jul 29, 2019 at 2:47 PM Miquel Raynal
<miquel.raynal@xxxxxxxxxxx> wrote:
>
> Hi Greg,
>
> + Boris
>
> Greg Ungerer <gerg@xxxxxxxxxx> wrote on Mon, 29 Jul 2019 22:33:56 +1000:
>
> > Hi Miquel,
> >
> > On 29/7/19 6:36 pm, Miquel Raynal wrote:
> > > Hi Greg,
> > >
> > > One question below.
> > >
> > > +Michael
> > > +Sascha
> > >
> > > Hello Michael, here is a similar issue to yours, I know you did not
> > > have enough time to share your solution but here we have someone else
> > > reproducing the issue, would you mind sharing a branch or a patch, even
> > > a WIP one, just to help debugging?
> > >
> > > Greg Ungerer <gerg@xxxxxxxxxx> wrote on Mon, 29 Jul 2019 16:41:51 +1000:
> > >
> > >> Hi Miquel,
> > >>
> > >> I am experiencing a problem with NAND flash DMA timeouts on
> > >> iMX6ull based boards. The problem is very similar to that
> > >> described in:
> > >>
> > >> https://linux-mtd.infradead.narkive.com/JIUulfFB/gpmi-imx6ull-timeout-on-dma
> > >>
> > >> That didn't come to any specific resolution that I could see
> > >> in that thread.
> > >>
> > >> The boot trace on the console for me looks like this:
> > >>
> > >> nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
> > >> nand: Micron MT29F2G08ABAEAWP
> > >> nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
> > >> gpmi-nand 1806000.gpmi-nand: DMA timeout, last DMA
> > >> gpmi-nand 1806000.gpmi-nand: Show GPMI registers :
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x20830002
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x01c6800c
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00010101
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0xe0000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x23023336
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x000001ee
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0xff000001
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x00000001
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x05020000
> > >> gpmi-nand 1806000.gpmi-nand: Show BCH registers :
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x000 : 0x00000100
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x010 : 0x00000010
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x020 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x030 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x040 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x050 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x060 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x070 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x080 : 0x030a2080
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x090 : 0x083e2080
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0a0 : 0x070a4080
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0b0 : 0x10da4080
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0c0 : 0x070a4080
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0d0 : 0x10da4080
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0e0 : 0x070a4080
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x0f0 : 0x10da4080
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x100 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x110 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x120 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x130 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x140 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x150 : 0x20484342
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x160 : 0x01000000
> > >> gpmi-nand 1806000.gpmi-nand: offset 0x170 : 0x00000000
> > >> gpmi-nand 1806000.gpmi-nand: BCH Geometry :
> > >> GF length : 13
> > >> ECC Strength : 8
> > >> Page Size in Bytes : 2110
> > >> Metadata Size in Bytes : 10
> > >> ECC Chunk0 Size in Bytes: 512
> > >> ECC Chunkn Size in Bytes: 512
> > >> ECC Chunk Count : 4
> > >> Payload Size in Bytes : 2048
> > >> Auxiliary Size in Bytes: 16
> > >> Auxiliary Status Offset: 12
> > >> Block Mark Byte Offset : 1999
> > >> Block Mark Bit Offset : 0
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -110
> > >> nand: timing mode 5 not acknowledged by the NAND chip
> > >
> > > What is the final timing mode used? Most of us tested in mode 5 I
> > > guess, maybe mode 4 is broken (don't know if this is the one used here,
> > > neither why mode 5 is refused). Can you please try by limiting the mode
> > > to 0, 1, 2... until, hopefully, we narrow down to the failing mode.
> >
> > Sure, how to do that?
>
> This loop [1] tries to configure each mode (5, 4, ...) until one
> succeeds (default is 0: must always work). Please try to limit mode to
> 0, 1, etc.
>
> Mode 0 should work.
>

This is not correct. When all the modes fail, it falls back to mode 0,
and that does not work either. I have already checked this. So the
fallback is there for this situation.

> [1] https://elixir.bootlin.com/linux/v5.3-rc1/source/drivers/mtd/nand/raw/nand_base.c#L933
> >
> > >
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> > >> Scanning device for bad blocks
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> > >> ....
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> > >> gpmi-nand 1806000.gpmi-nand: Chip: 0, Error -22
> > >> 5 fixed-partitions partitions found on MTD device gpmi-nand
> > >> Creating 5 MTD partitions on "gpmi-nand":
> > >> 0x000000000000-0x000000500000 : "u-boot"
> > >> 0x000000500000-0x000000600000 : "u-boot-env"
> > >> 0x000000600000-0x000000800000 : "log"
> > >> 0x000000800000-0x000010000000 : "flash"
> > >> 0x000000000000-0x000010000000 : "all"
> > >> gpmi-nand 1806000.gpmi-nand: driver registered.
> > >>
> > >>
> > >> This is using a linux kernel v5.1.14. I have seen this happen on
> > >> a number of boards I have here - but it is only occasional. It
> > >> only happens once in a while on boot, maybe 1 in 40 or more times.
> > >> So it can take quite a while to reproduce (using a boot loop setup).
> > >
> > > That's strange... I don't get what would produce such unstable issue.
> >
> > My initial guess is that the calculated timing is very marginal.
>
> What do you mean by "marginal"?
>

I don't think it is the timing calculation. I have tried to use the
same timings as before, but when those are applied. Is it possible?

Michael

> > The problem seems more likely to happen if flash write activity
> > had been occurring just before a soft reboot. Its not a guarantee,
> > just more likely.
>
> That's really disturbing. I doubt this is the real cause though.
>
> >
> > Interesting observation is that Michael was using Micron flash,
> > and boards that I have with the problem also have Micron flash.
> > Both a form of Micron MT29F2G08.
> >
> > I have similar boards, iMX6ull based, with different brands of
> > NAND flash and I have not seen any problem on them.
>
> That's great to narrow down the root cause. Maybe these chips have
> tighter timing constraints.
>
> >
> > Regards
> > Greg
> >
> >
> >
> > >> As per the email thread I pointed to above I looked at reverting
> > >> those patches, but that was not at all easy given how much the gpmi
> > >> driver code had moved. So instead I modified the code with this:
> > >>
> > >> --- a/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> > >> +++ b/linux/drivers/mtd/nand/raw/gpmi-nand/gpmi-lib.c
> > >> @@ -481,6 +481,7 @@ static void gpmi_nfc_compute_timings(struct gpmi_nand_data *this,
> > >>  void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
> > >>  {
> > >> +#if 0
> > >>         struct gpmi_nfc_hardware_timing *hw = &this->hw;
> > >>         struct resources *r = &this->resources;
> > >>         void __iomem *gpmi_regs = r->gpmi_regs;
> > >> @@ -505,6 +512,7 @@ void gpmi_nfc_apply_timings(struct gpmi_nand_data *this)
> > >>         /* Wait for the DLL to settle. */
> > >>         udelay(dll_wait_time_us);
> > >> +#endif
> > >>  }
> > >>  int gpmi_setup_data_interface(struct nand_chip *chip, int chipnr,
> > >>
> > >> So far after a couple of days of testing with this I no longer
> > >> see the DMA timeout.
> > >>
> > >> Any thoughts?
> > >>
> > >> Regards
> > >> Greg
> > >>
> > >
> > > Thanks,
> > > Miquèl
> > >
>
> Thanks,
> Miquèl

-- 
| Michael Nazzareno Trimarchi                     Amarula Solutions BV |
| COO  -  Founder                                      Cruquiuskade 47 |
| +31(0)851119172                                 Amsterdam 1018 AM NL |
|                  [`as] http://www.amarulasolutions.com               |

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/