Re: new_block error

Chris Kottaridis <chriskot@xxxxxxxxxxxxx> · Thu, 28 Feb 2008 17:10:10 -0700

That's all I currently have of the log; I'll see if I can get more of
it.

I was pointed to this diff that has a "goto out" added if we hit this
scenario:

feda58d37ae0efe22e711a74e26fb541d4eb1baa
 fs/ext3/balloc.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ext3/balloc.c b/fs/ext3/balloc.c
index a26e683..d2dface 100644
--- a/fs/ext3/balloc.c
+++ b/fs/ext3/balloc.c
@@ -530,11 +530,13 @@ do_more:
 	    in_range (block, le32_to_cpu(desc->bg_inode_table),
 		      sbi->s_itb_per_group) ||
 	    in_range (block + count - 1, le32_to_cpu(desc->bg_inode_table),
-		      sbi->s_itb_per_group))
+		      sbi->s_itb_per_group)) {
 		ext3_error (sb, "ext3_free_blocks",
 			    "Freeing blocks in system zones - "
 			    "Block = "E3FSBLK", count = %lu",
 			    block, count);
+		goto error_return;
+	}
 
 	/*
 	 * We are about to start releasing blocks in the bitmap,
@@ -1637,11 +1639,13 @@ allocated:
 	    in_range(ret_block, le32_to_cpu(gdp->bg_inode_table),
 		      EXT3_SB(sb)->s_itb_per_group) ||
 	    in_range(ret_block + num - 1, le32_to_cpu(gdp->bg_inode_table),
-		      EXT3_SB(sb)->s_itb_per_group))
+		      EXT3_SB(sb)->s_itb_per_group)) {
 		ext3_error(sb, "ext3_new_block",
 			    "Allocating block in system zone - "
 			    "blocks from "E3FSBLK", length %lu",
 			     ret_block, num);
+		goto out;
+	}
 
 	performed_allocation = 1;
 

The *errp gets set to -ENOSPC at the beginning of the routine so it'd go
to out with *errp set to -ENOSPC.

That's got to be better than continuing on, thinking it's OK to scribble
on metadata.
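
For reference, the four in_range() tests amount to asking whether the
range [ret_block, ret_block + num - 1] touches the block bitmap, the
inode bitmap, or either end of the inode table. A rough userspace sketch
of that predicate follows. The in_range() macro is written the way I
recall it from balloc.c; struct group_meta and overlaps_system_zone()
are hypothetical names made up for illustration, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* True when b lies in [first, first + len - 1]; assumed to match the
 * macro of the same name in fs/ext3/balloc.c. */
#define in_range(b, first, len) ((b) >= (first) && (b) <= (first) + (len) - 1)

/* Hypothetical stand-in for the group descriptor fields the check reads. */
struct group_meta {
	unsigned long block_bitmap;   /* block number of the block bitmap */
	unsigned long inode_bitmap;   /* block number of the inode bitmap */
	unsigned long inode_table;    /* first block of the inode table */
	unsigned long itb_per_group;  /* inode table length, in blocks */
};

/* Hypothetical helper: true if [ret_block, ret_block + num - 1] hits
 * any of the group's metadata blocks, using the same four tests. */
static bool overlaps_system_zone(const struct group_meta *g,
				 unsigned long ret_block, unsigned long num)
{
	assert(num > 0);
	return in_range(g->block_bitmap, ret_block, num) ||
	       in_range(g->inode_bitmap, ret_block, num) ||
	       in_range(ret_block, g->inode_table, g->itb_per_group) ||
	       in_range(ret_block + num - 1, g->inode_table, g->itb_per_group);
}
```

With the bitmaps at blocks 100 and 101 and an 8-block inode table at 102,
a request for 5 blocks starting at 98 trips the check, while 5 blocks
starting at 110 does not.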

However, I don't think it's necessarily true that it's out of space.
Before this is a loop that looks for a group that will satisfy the
request. As soon as it thinks it has found one, it stops checking groups,
jumps out of the loop, and then makes this test for overlapping the
metadata. With the new patch it returns -ENOSPC, but all we really know
is that none of the groups checked so far had usable space; we reach the
test before the remaining groups have been checked.

It seems this check would be better placed inside the loop, at the point
where we think we have a possible range of blocks that will fit. If it
turns out the range overlaps the metadata, we should try the next group.
If we exhaust the groups, we drop out of the loop and return -ENOSPC. So
moving the "in_range" checks into the loop and using them as a condition
on jumping to allocated seems appropriate. I'm not sure of everything
ext3_try_to_allocate_with_rsv() does; maybe something has to be "undone"
if the in_range tests fail. But something like this in the loop:

       for (bgi = 0; bgi < ngroups; bgi++) {
                group_no++;
                if (group_no >= ngroups)
                        group_no = 0;
                gdp = ext3_get_group_desc(sb, group_no, &gdp_bh);
                if (!gdp)
                        goto io_error;
                free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
                /*
                 * skip this group if the number of
                 * free blocks is less than half of the reservation
                 * window size.
                 */
                if (free_blocks <= (windowsz/2))
                        continue;

                brelse(bitmap_bh);
                bitmap_bh = read_block_bitmap(sb, group_no);
                if (!bitmap_bh)
                        goto io_error;
                /*
                 * try to allocate block(s) from this group, without a
                 * goal(-1).
                 */
                grp_alloc_blk = ext3_try_to_allocate_with_rsv(sb, handle,
                                        group_no, bitmap_bh, -1, my_rsv,
                                        &num, &fatal);
                if (fatal)
                        goto out;
                if (grp_alloc_blk >= 0) {
                        ret_block = grp_alloc_blk +
                                    ext3_group_first_block_no(sb, group_no);
                        /*
                         * only accept the range if it stays clear of
                         * the bitmaps and the inode table; otherwise
                         * fall through and try the next group.
                         */
                        if (!(in_range(le32_to_cpu(gdp->bg_block_bitmap),
                                       ret_block, num) ||
                              in_range(le32_to_cpu(gdp->bg_inode_bitmap),
                                       ret_block, num) ||
                              in_range(ret_block,
                                       le32_to_cpu(gdp->bg_inode_table),
                                       EXT3_SB(sb)->s_itb_per_group) ||
                              in_range(ret_block + num - 1,
                                       le32_to_cpu(gdp->bg_inode_table),
                                       EXT3_SB(sb)->s_itb_per_group)))
                                goto allocated;
                }
        }

Then we don't need a check outside of the loop. If we don't find a group
that has enough space that doesn't overlap any metadata, we'll just drop
out of the loop and return -ENOSPC.
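
To make the control flow concrete, here is a small userspace model of
the idea: walk the groups in the same wrap-around order, skip any group
whose candidate range collides with its metadata, and only report
-ENOSPC once every group has been tried. Everything in it (struct group,
overlaps_metadata(), find_blocks()) is hypothetical and heavily
simplified. In particular the metadata is modelled as one contiguous
range per group, and the sketch ignores whatever state
ext3_try_to_allocate_with_rsv() may have changed and that would need
undoing before moving on.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one block group. */
struct group {
	long candidate;   /* first block the allocator offers, -1 if none */
	long num;         /* number of blocks offered */
	long meta_first;  /* first metadata block of the group */
	long meta_len;    /* number of contiguous metadata blocks */
};

/* True if the offered range shares at least one block with the metadata. */
static bool overlaps_metadata(const struct group *g)
{
	long last = g->candidate + g->num - 1;
	long meta_last = g->meta_first + g->meta_len - 1;

	return g->candidate <= meta_last && last >= g->meta_first;
}

/* Returns the first block of an acceptable range, or -ENOSPC if no
 * group can supply one. The search begins with the group after 'start'. */
static long find_blocks(const struct group *groups, size_t ngroups,
			size_t start)
{
	size_t group_no = start;

	assert(ngroups > 0 && start < ngroups);
	for (size_t bgi = 0; bgi < ngroups; bgi++) {
		group_no++;
		if (group_no >= ngroups)
			group_no = 0;
		if (groups[group_no].candidate < 0)
			continue;       /* nothing free in this group */
		if (overlaps_metadata(&groups[group_no]))
			continue;       /* bad range: try the next group */
		return groups[group_no].candidate;
	}
	return -ENOSPC;                 /* every group was checked */
}
```

In this model a group that offers a range on top of its own metadata is
simply passed over, so a later group with a clean range still satisfies
the request, and -ENOSPC only comes back when no group does.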

But, maybe I am missing something else here that makes performing the
check outside the loop appropriate.

Thanks
    Chris Kottaridis    (chriskot@xxxxxxxxxxxxx)
----------------------------------------------------------------------
On Thu, 2008-02-28 at 14:22 -0800, Mingming Cao wrote:
> On Thu, 2008-02-28 at 11:45 -0700, Chris Kottaridis wrote:
> > I have a 2.6.21 kernel and under heavy load I seem to be traipsing
> > through this code in
> > 
> > fs/ext3/balloc.c ext3_new_blocks():
> > 
> >  if (in_range(le32_to_cpu(gdp->bg_block_bitmap), ret_block, num) ||
> >      in_range(le32_to_cpu(gdp->bg_inode_bitmap), ret_block, num) ||
> >      in_range(ret_block, le32_to_cpu(gdp->bg_inode_table),
> >               EXT3_SB(sb)->s_itb_per_group) ||
> >      in_range(ret_block + num - 1, le32_to_cpu(gdp->bg_inode_table),
> >               EXT3_SB(sb)->s_itb_per_group))
> >      ext3_error(sb, "ext3_new_block",
> >                "Allocating block in system zone - "
> >                "blocks from "E3FSBLK", length %lu",
> >                 ret_block, num);
> > 
> > performed_allocation = 1;
> > 
> > I get this error:
> > 
> > Feb 12 13:29:24 slot0_10 kernel: EXT3-fs error (device dm-0):
> > ext3_new_block: Allocating block in system zone - blocks
> > from 86966272, length 5
> > Feb 12 13:29:24 slot0_10 kernel: Aborting journal on device dm-0.
> > Feb 12 13:29:24 slot0_10 kernel: EXT3-fs error (device dm-0) in
> > ext3_reserve_inode_write: Journal has aborted
> > 
> > The call to ext3_error() not only prints the error message, but calls
> > ext3_handle_error() which apparently puts the file system into read-only
> > mode. Usually the filesystem needs to be rebuilt once this happens.
> > 
> > I am not real sure what the "system zone" is about here or what the
> > message is trying to tell us. This doesn't seem to be an out of disk space
> > error as there is a -ENOSPC error check earlier that we seem to get
> > past. It seems like the blocks were able to be allocated, so I am a
> > little curious about why the file system gets marked as read-only.
> > 
> 
> It complains that the newly allocated block is located in the range that
> is used to store fs metadata (superblocks, bitmaps, inode table etc).
> Those fs metadata blocks should always be marked as used in the bitmaps,
> so the block allocator should never return a new block in that range if
> it does things correctly.
> 
> > Any comments appreciated.
> > 
> 
> Is there any other error message printed in dmesg before the fs starts to
> complain?
> 
> Mingming
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
