Re: [PATCH v2] mtd: rawnand: marvell: check for RDY bits after enabling the IRQ

On Thu, 27 Sep 2018 21:55:57 +0000
Chris Packham <Chris.Packham@xxxxxxxxxxxxxxxxxxx> wrote:

> Hi All,
> 
> On 27/09/18 20:56, Boris Brezillon wrote:
> > On Thu, 27 Sep 2018 10:11:45 +0200
> > Miquel Raynal <miquel.raynal@xxxxxxxxxxx> wrote:
> >   
> >> Hi Daniel,
> >>
> >> Daniel Mack <daniel@xxxxxxxxxx> wrote on Thu, 27 Sep 2018 09:17:51
> >> +0200:
> >>  
> >>> At least on PXA3xx platforms, enabling RDY interrupts in the NDCR register
> >>> will only cause the IRQ to latch when the RDY lanes are changing, and not
> >>> in case they are already asserted.
> >>>
> >>> This means that if the controller finished the command in flight before
> >>> marvell_nfc_wait_op() is called, that function will wait for a change in
> >>> the bit that can't ever happen as it is already set.
> >>>
> >>> To address this race, check for the RDY bits after the IRQ was enabled,
> >>> and complete the completion immediately if the condition is already met.
> >>>
> >>> This fixes a bug that was observed with a NAND chip that holds a UBIFS
> >>> partition on which file system stress tests were executed. When
> >>> marvell_nfc_wait_op() reports an error, UBI/UBIFS will eventually mount
> >>> the filesystem read-only, reporting lots of warnings along the way.
> >>>
> >>> Fixes: 02f26ecf8c77 mtd: nand: add reworked Marvell NAND controller driver
> >>> Cc: stable@xxxxxxxxxxxxxxx
> >>> Signed-off-by: Daniel Mack <daniel@xxxxxxxxxx>
> >>> ---  
> >>
> >> Sorry I haven't had the time to check on my Armada, but you figured it
> >> out, and the fix looks good to me!
> >>
> >> Acked-by: Miquel Raynal <miquel.raynal@xxxxxxxxxxx>
> >>
> >> Boris, do you plan to send another fixes PR or can I take it into
> >> the nand/next branch?  
> > 
> > Queued to mtd/master.  
> 
> After fixing my R/B configuration I get a new error with this patch when 
> running stress_1 from mtd-utils-2.0.0. I don't see this without the patch.
> 
> My board is a custom design using an Armada-385 SoC with Macronix NAND.
> 
> # stress_1
> marvell-nfc f10d0000.nand-controller: Timeout on RDDREQ/WRDREQ while draining raw data (NDSR: 0x00000000)
> ubi0 error: ubi_io_write: error -5 while writing 4096 bytes to PEB 1858:110592, written 0 bytes
> CPU: 1 PID: 1170 Comm: stress_1 Not tainted 4.19.0-rc5-at1+ #8
> Hardware name: Marvell Armada 380/385 (Device Tree)
> [<8011143c>] (unwind_backtrace) from [<8010c17c>] (show_stack+0x10/0x14)
> [<8010c17c>] (show_stack) from [<805ec28c>] (dump_stack+0x88/0x9c)
> [<805ec28c>] (dump_stack) from [<80418a28>] (ubi_io_write+0x55c/0x6c0)
> [<80418a28>] (ubi_io_write) from [<80415b4c>] (ubi_eba_write_leb+0x80/0x780)
> [<80415b4c>] (ubi_eba_write_leb) from [<80414580>] (ubi_leb_write+0xbc/0xe0)
> [<80414580>] (ubi_leb_write) from [<802d46b4>] (ubifs_leb_write+0xa0/0x118)
> [<802d46b4>] (ubifs_leb_write) from [<802d5620>] (ubifs_wbuf_write_nolock+0x184/0x6ac)
> [<802d5620>] (ubifs_wbuf_write_nolock) from [<802c8a18>] (ubifs_jnl_write_data+0x1c0/0x2bc)
> [<802c8a18>] (ubifs_jnl_write_data) from [<802caed8>] (do_writepage+0xa4/0x1b0)
> [<802caed8>] (do_writepage) from [<801aa160>] (__writepage+0x14/0x48)
> [<801aa160>] (__writepage) from [<801aa900>] (write_cache_pages+0x1d0/0x3e4)
> [<801aa900>] (write_cache_pages) from [<801aab68>] (generic_writepages+0x54/0x80)
> [<801aab68>] (generic_writepages) from [<801ac9a0>] (do_writepages+0x68/0x8c)
> [<801ac9a0>] (do_writepages) from [<801a0ac8>] (__filemap_fdatawrite_range+0x88/0xc0)
> [<801a0ac8>] (__filemap_fdatawrite_range) from [<801a0cc4>] (file_write_and_wait_range+0x3c/0x98)
> [<801a0cc4>] (file_write_and_wait_range) from [<802cb600>] (ubifs_fsync+0x3c/0xb0)
> [<802cb600>] (ubifs_fsync) from [<801a2828>] (generic_file_write_iter+0x198/0x24c)
> [<801a2828>] (generic_file_write_iter) from [<802ccb84>] (ubifs_write_iter+0xf0/0x158)
> [<802ccb84>] (ubifs_write_iter) from [<801ef854>] (__vfs_write+0xfc/0x160)
> [<801ef854>] (__vfs_write) from [<801efa60>] (vfs_write+0xa4/0x1ac)
> [<801efa60>] (vfs_write) from [<801efcac>] (ksys_write+0x54/0xb8)
> [<801efcac>] (ksys_write) from [<80101000>] (ret_fast_syscall+0x0/0x54)
> Exception stack(0xbd789fa8 to 0xbd789ff0)
> 9fa0:                   0ca5d000 00000000 00000003 7e9f2900 00008000 ffffffff
> 9fc0: 0ca5d000 00000000 00008000 00000004 00000003 00000000 76f24fb4 00000000
> 9fe0: 00000000 7e9f27fc 00010fd8 76e775ec
> marvell-nfc f10d0000.nand-controller: Timeout on RDDREQ while draining FIFO (data) (NDSR: 0x00000810)
> ttyS ttyS1: tty_port_close_start: tty->count = 1 port count = 2
> marvell-nfc f10d0000.nand-controller: Timeout on RDDREQ while draining FIFO (data) (NDSR: 0x00000810)
> marvell-nfc f10d0000.nand-controller: Timeout on RDDREQ while draining FIFO (data) (NDSR: 0x00000810)
> marvell-nfc f10d0000.nand-controller: Timeout on RDDREQ while draining FIFO (data) (NDSR: 0x00000810)
> marvell-nfc f10d0000.nand-controller: Timeout on RDDREQ while draining FIFO (data) (NDSR: 0x00000810)
> marvell-nfc f10d0000.nand-controller: Timeout on RDDREQ while draining FIFO (data) (NDSR: 0x00000810)
> marvell-nfc f10d0000.nand-controller: Timeout on RDDREQ while draining FIFO (data) (NDSR: 0x00000810)
> 
> ... (RDDREQ messages repeat).

Hm, that's weird. Unless RDDREQ is a 'clear-on-read' bit, that
shouldn't happen.
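
To make the two behaviours under discussion concrete, here is a small
user-space model in C. It is only a sketch and not the driver code: the
sim_* names, the struct layout and the bit positions are all made up for
illustration and do not match the real NDCR/NDSR definitions. It models
an IRQ that latches only when an enabled status bit changes, which is
the behaviour the commit message describes for PXA3xx, plus a purely
hypothetical clear_on_read switch for the RDDREQ case mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout, chosen for this sketch only. */
#define SIM_RDY    (1u << 0)
#define SIM_RDDREQ (1u << 1)

struct sim_nfc {
	uint32_t status;        /* models a status register */
	uint32_t irq_enable;    /* models the interrupt enable bits */
	bool     irq_latched;   /* set only when an enabled bit goes 0 -> 1 */
	bool     clear_on_read; /* assumption: reading status clears RDDREQ */
};

/* Raise status bits; the IRQ latches only for bits that change while enabled. */
static void sim_set_status(struct sim_nfc *nfc, uint32_t bits)
{
	uint32_t rising = bits & ~nfc->status;

	nfc->status |= bits;
	if (rising & nfc->irq_enable)
		nfc->irq_latched = true;
}

/* Read the status; in the hypothetical clear-on-read mode RDDREQ is consumed. */
static uint32_t sim_read_status(struct sim_nfc *nfc)
{
	uint32_t val = nfc->status;

	if (nfc->clear_on_read)
		nfc->status &= ~SIM_RDDREQ;
	return val;
}

/*
 * Returns true if the wait would complete, false if it would time out.
 * With check_after_enable set, the status is polled once right after the
 * IRQ is enabled, which is the idea described in the commit message.
 */
static bool sim_wait_op(struct sim_nfc *nfc, bool check_after_enable)
{
	bool done = false;

	assert(nfc != NULL);
	nfc->irq_enable |= SIM_RDY;

	/* Catch the case where RDY was already asserted before enabling. */
	if (check_after_enable && (sim_read_status(nfc) & SIM_RDY))
		done = true;

	/* Otherwise only a latched edge can complete the wait. */
	if (nfc->irq_latched)
		done = true;

	nfc->irq_enable &= ~SIM_RDY;
	nfc->irq_latched = false;
	return done;
}
```

In this model, raising SIM_RDY before calling sim_wait_op(nfc, false)
times out, while sim_wait_op(nfc, true) completes at once. With
clear_on_read set, the extra status read also drops SIM_RDDREQ, so a
later wait on that bit would never see it. Whether the real controller
behaves like that is exactly the open question.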



