Re: s390x: kernel BUG at fs/ext4/inode.c:1591! (powerpc too!)

On Tue, 2 Apr 2013 12:07:37 -0700 (PDT), Christian Kujau <lists@xxxxxxxxxxxxxxx> wrote:
> On Tue, 2 Apr 2013 at 20:33, Zheng Liu wrote:
> > Could you please revert your tree to this commit (3a225670) and try
> > again? I want to find out whether the regression was already present at
> > that commit or was introduced after it.
> 
> I have git-revert'ed this commit and the same BUG_ON was triggered again. 
> I could not get "fsstress" to trigger this, but resuming the download of 
> this 4.3 GB Fedora DVD image via bittorrent made the machine crash after 
> a couple of minutes.
> 
> Sadly the only message netconsole is able to catch is this single line 
> from the subject above, but I'll try to apply the proposed patches[0] and 
> see if it helps anything.
OK, if netconsole can't log anything after a BUG_ON, then let's just skip the panic :)
Please use the following patch instead of enable_ES_AGGRESSIVE_TEST.diff:
>From e802d032225a74156f8256467aa64535369ae45c Mon Sep 17 00:00:00 2001
From: Dmitry Monakhov <dmonakhov@xxxxxxxxxx>
Date: Tue, 2 Apr 2013 23:33:16 +0400
Subject: [PATCH] enable ES_AGGRESSIVE_TEST V2


Signed-off-by: Dmitry Monakhov <dmonakhov@xxxxxxxxxx>
---
 fs/ext4/extents_status.h |    2 +-
 fs/ext4/inode.c          |   17 +++++++++++++++--
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index d8e2d4d..70233a6 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -24,7 +24,7 @@
  * With ES_AGGRESSIVE_TEST defined, the result of es caching will be
  * checked with old map_block's result.
  */
-#define ES_AGGRESSIVE_TEST__
+#define ES_AGGRESSIVE_TEST
 
 /*
  * These flags live in the high bits of extent_status.es_pblk
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 840a23e..7712aff 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1546,7 +1546,18 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
 					}
 					if (buffer_unwritten(bh) ||
 					    buffer_mapped(bh))
-						BUG_ON(bh->b_blocknr != pblock);
+						if (bh->b_blocknr != pblock) {
+							printk(KERN_ERR "mpage_da_submit_io failed"
+							       " block=%llu != b_blocknr=%llu\n",
+							       (unsigned long long)pblock,
+							       (unsigned long long)bh->b_blocknr);
+							printk(KERN_ERR "ino:%lu lblk:%lu, "
+							       "b_state=0x%08lx, b_size=%zu\n",
+							       inode->i_ino, cur_logical,
+							       bh->b_state, bh->b_size);
+							WARN_ON(1);
+							goto skip_page;
+						}
 					if (map->m_flags & EXT4_MAP_UNINIT)
 						set_buffer_uninit(bh);
 					clear_buffer_unwritten(bh);
@@ -1556,8 +1567,10 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
 				 * skip page if block allocation undone and
 				 * block is dirty
 				 */
-				if (ext4_bh_delay_or_unwritten(NULL, bh))
+				if (ext4_bh_delay_or_unwritten(NULL, bh)) {
+				skip_page:
 					skip_page = 1;
+				}
 				bh = bh->b_this_page;
 				block_start += bh->b_size;
 				cur_logical++;
-- 
1.7.1

So once you hit the bug, it will print a lot of warnings, skip the
affected page, and carry on as if nothing had happened.
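
To make the difference concrete, here is a minimal userspace sketch of the two
checking styles. It is not kernel code: struct buf, check_strict(),
check_lenient() and count_skipped() are hypothetical names invented for this
illustration, and assert() only stands in for BUG_ON.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a buffer_head: only the field we compare. */
struct buf {
	unsigned long long blocknr;	/* block number recorded in the buffer */
};

/* BUG_ON-style check: a mismatch aborts the whole program. */
void check_strict(const struct buf *b, unsigned long long pblock)
{
	assert(b->blocknr == pblock);
}

/* Warn-and-skip check: report the mismatch and tell the caller to skip. */
int check_lenient(const struct buf *b, unsigned long long pblock)
{
	if (b->blocknr != pblock) {
		fprintf(stderr, "mismatch: block=%llu != b_blocknr=%llu\n",
			pblock, b->blocknr);
		return 1;	/* caller should skip this buffer */
	}
	return 0;
}

/* Walk n buffers expected to map to consecutive blocks; count the skips. */
int count_skipped(const struct buf *bufs, int n,
		  unsigned long long first_pblock)
{
	int skipped = 0;

	for (int i = 0; i < n; i++)
		skipped += check_lenient(&bufs[i], first_pblock + i);
	return skipped;
}
```

With the lenient variant every bad buffer produces one message and the loop
keeps going, which is why the log should fill up with warnings instead of
stopping at the first hit.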

So my predictions are as follows:
1) With the enable_ES_AGGRESSIVE_TEST-V2.diff patch you will see a lot of
   warnings.

2) With enable_ES_AGGRESSIVE_TEST-V2.diff and
   http://nerdbynature.de/bits/3.9.0-rc4/ext4/disable-es_lookup_extent.patch
   the issue will probably go away (it will be hidden).
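
A toy model of why the second combination would only hide the problem. This
is a sketch under assumptions, not the ext4 code: the arrays, slow_lookup()
and map_block() are made up for illustration, and it assumes (from the patch
name only) that disable-es_lookup_extent.patch makes the mapping code bypass
the extent status cache.

```c
#include <stdio.h>

#define NBLOCKS 4

/* Hypothetical data: the real mapping, and a cache with one stale entry. */
static const unsigned long long on_disk[NBLOCKS] = { 100, 101, 102, 103 };
static const unsigned long long cached[NBLOCKS]  = { 100, 101, 555, 103 };

/* Slow path: always correct in this toy model. */
unsigned long long slow_lookup(int lblk)
{
	return on_disk[lblk];
}

/*
 * Map a logical block. With use_cache set, the cached value is returned;
 * with aggressive_test also set, it is cross-checked against the slow path
 * and every disagreement is reported and counted in *mismatches.
 */
unsigned long long map_block(int lblk, int use_cache, int aggressive_test,
			     int *mismatches)
{
	if (!use_cache)
		return slow_lookup(lblk);

	unsigned long long p = cached[lblk];

	if (aggressive_test && p != slow_lookup(lblk)) {
		fprintf(stderr, "cache mismatch at lblk %d: cached=%llu slow=%llu\n",
			lblk, p, slow_lookup(lblk));
		(*mismatches)++;
	}
	return p;
}
```

In this model the stale entry is still sitting in the cache when use_cache
is 0; it is simply never consulted, so nothing complains.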

> 
> Thanks,
> Christian.
> 
> [0] http://nerdbynature.de/bits/3.9.0-rc4/ext4/
> -- 
> BOFH excuse #344:
> 
> Network failure -  call NBC
