That's all I currently have of the log, I'll see if I can get more of
it.

I was pointed to this diff that has a "goto out" added if we hit this
scenario:

feda58d37ae0efe22e711a74e26fb541d4eb1baa
 fs/ext3/balloc.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ext3/balloc.c b/fs/ext3/balloc.c
index a26e683..d2dface 100644
--- a/fs/ext3/balloc.c
+++ b/fs/ext3/balloc.c
@@ -530,11 +530,13 @@ do_more:
             in_range (block, le32_to_cpu(desc->bg_inode_table),
                       sbi->s_itb_per_group) ||
             in_range (block + count - 1, le32_to_cpu(desc->bg_inode_table),
-                      sbi->s_itb_per_group))
+                      sbi->s_itb_per_group)) {
                 ext3_error (sb, "ext3_free_blocks",
                             "Freeing blocks in system zones - "
                             "Block = "E3FSBLK", count = %lu",
                             block, count);
+                goto error_return;
+        }
 
         /*
          * We are about to start releasing blocks in the bitmap,
@@ -1637,11 +1639,13 @@ allocated:
             in_range(ret_block, le32_to_cpu(gdp->bg_inode_table),
                       EXT3_SB(sb)->s_itb_per_group) ||
             in_range(ret_block + num - 1, le32_to_cpu(gdp->bg_inode_table),
-                      EXT3_SB(sb)->s_itb_per_group))
+                      EXT3_SB(sb)->s_itb_per_group)) {
                 ext3_error(sb, "ext3_new_block",
                             "Allocating block in system zone - "
                             "blocks from "E3FSBLK", length %lu",
                              ret_block, num);
+                goto out;
+        }
 
         performed_allocation = 1;

The *errp gets set to -ENOSPC at the beginning of the routine, so it'd
go to out with *errp set to -ENOSPC. That's got to be better than
continuing on, thinking it's OK to scribble on metadata.

However, I don't think it's necessarily true that it's out of space.
Before this is a loop that looks for a group that will satisfy the
request. If it thinks it's found one, it stops checking the groups,
jumps out of the loop, and then makes this test for overlapping the
metadata. With the new patch it returns -ENOSPC, but all we really know
at that point is that the groups checked so far had no usable space; we
get to the test before all the groups have been checked.
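For reference, the four-way overlap test that both hunks guard can be modelled in userspace. This is only a sketch: `struct group_meta` and `overlaps_system_zone()` are hypothetical names made up for illustration, and the `in_range()` definition is written from my reading of fs/ext3/balloc.c, so treat it as an assumption rather than a copy of the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the helper in fs/ext3/balloc.c:
 * true when b lies in [first, first + len - 1]. */
#define in_range(b, first, len) \
	((b) >= (first) && (b) <= (first) + (len) - 1)

/* Hypothetical stand-in for the group descriptor fields the check reads. */
struct group_meta {
	unsigned long block_bitmap;   /* block number of the block bitmap */
	unsigned long inode_bitmap;   /* block number of the inode bitmap */
	unsigned long inode_table;    /* first block of the inode table */
	unsigned long itb_per_group;  /* inode table length in blocks */
};

/*
 * Returns true if the range [start, start + num - 1] touches the group's
 * metadata, using the same four tests as the check in the diff.
 */
static bool overlaps_system_zone(const struct group_meta *g,
				 unsigned long start, unsigned long num)
{
	assert(num > 0);
	return in_range(g->block_bitmap, start, num) ||
	       in_range(g->inode_bitmap, start, num) ||
	       in_range(start, g->inode_table, g->itb_per_group) ||
	       in_range(start + num - 1, g->inode_table, g->itb_per_group);
}
```

The first two tests ask whether a bitmap block falls inside the candidate range; the last two ask whether either end of the candidate range falls inside the inode table.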
It would seem that this check would be better placed inside the loop,
once we think we have a possible range of blocks that will fit. If it
turns out the range overlaps the metadata, we should try the next group.
If we exhaust the groups, then we drop out of the loop and return
-ENOSPC.

So, moving the "in_range" checks into the loop and using them as a
condition on jumping to allocated seems appropriate. I'm not sure of
everything ext3_try_to_allocate_with_rsv() might do; maybe something has
to be "undone" if the in_range tests fail. But something like this in
the loop:

        for (bgi = 0; bgi < ngroups; bgi++) {
                group_no++;
                if (group_no >= ngroups)
                        group_no = 0;
                gdp = ext3_get_group_desc(sb, group_no, &gdp_bh);
                if (!gdp)
                        goto io_error;
                free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
                /*
                 * skip this group if the number of
                 * free blocks is less than half of the reservation
                 * window size.
                 */
                if (free_blocks <= (windowsz/2))
                        continue;

                brelse(bitmap_bh);
                bitmap_bh = read_block_bitmap(sb, group_no);
                if (!bitmap_bh)
                        goto io_error;
                /*
                 * try to allocate block(s) from this group, without a goal(-1).
                 */
                grp_alloc_blk = ext3_try_to_allocate_with_rsv(sb, handle,
                                        group_no, bitmap_bh, -1, my_rsv,
                                        &num, &fatal);
                if (fatal)
                        goto out;
                if (grp_alloc_blk >= 0) {
                        ret_block = grp_alloc_blk +
                                ext3_group_first_block_no(sb, group_no);
                        /* accept the range only if it avoids the metadata */
                        if (!(in_range(le32_to_cpu(gdp->bg_block_bitmap), ret_block, num) ||
                              in_range(le32_to_cpu(gdp->bg_inode_bitmap), ret_block, num) ||
                              in_range(ret_block, le32_to_cpu(gdp->bg_inode_table),
                                       EXT3_SB(sb)->s_itb_per_group) ||
                              in_range(ret_block + num - 1, le32_to_cpu(gdp->bg_inode_table),
                                       EXT3_SB(sb)->s_itb_per_group)))
                                goto allocated;
                }
        }

Then we don't need a check outside of the loop. If we don't find a group
that has enough space and doesn't overlap any metadata, we'll just drop
out of the loop and return -ENOSPC.

But maybe I am missing something else here that makes performing the
check outside the loop appropriate.
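The difference between the two placements of the check can be shown with a small userspace model. This is a sketch under stated assumptions, not kernel code: `struct group_model`, `alloc_check_after_loop()` and `alloc_check_in_loop()` are hypothetical names, each group simply carries the candidate range the allocator would return for it, and nothing here models the bitmap or reservation state that might need to be "undone" when a candidate is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match the helper in fs/ext3/balloc.c. */
#define in_range(b, first, len) \
	((b) >= (first) && (b) <= (first) + (len) - 1)

/* Hypothetical model of one block group. cand_num == 0 means the
 * allocator found no space in this group. */
struct group_model {
	unsigned long block_bitmap, inode_bitmap;
	unsigned long inode_table, itb_per_group;
	unsigned long cand_start, cand_num;
};

/* Same four tests as the system zone check, applied to the candidate. */
static bool overlaps(const struct group_model *g)
{
	return in_range(g->block_bitmap, g->cand_start, g->cand_num) ||
	       in_range(g->inode_bitmap, g->cand_start, g->cand_num) ||
	       in_range(g->cand_start, g->inode_table, g->itb_per_group) ||
	       in_range(g->cand_start + g->cand_num - 1,
			g->inode_table, g->itb_per_group);
}

/* Patched behaviour: stop at the first group with space, test afterwards. */
static int alloc_check_after_loop(const struct group_model *groups,
				  size_t ngroups, unsigned long *ret_block)
{
	int err = -ENOSPC;	/* preset, like *errp in ext3_new_blocks() */
	size_t i;

	for (i = 0; i < ngroups; i++)
		if (groups[i].cand_num > 0)
			break;
	if (i == ngroups)
		return err;
	if (overlaps(&groups[i]))
		return err;	/* "goto out" with -ENOSPC still set */
	*ret_block = groups[i].cand_start;
	return 0;
}

/* Proposed behaviour: an overlapping candidate just means "next group". */
static int alloc_check_in_loop(const struct group_model *groups,
			       size_t ngroups, unsigned long *ret_block)
{
	size_t i;

	for (i = 0; i < ngroups; i++) {
		if (groups[i].cand_num == 0 || overlaps(&groups[i]))
			continue;
		*ret_block = groups[i].cand_start;
		return 0;
	}
	return -ENOSPC;		/* every group was actually checked */
}
```

With two groups where the first candidate lands in the inode table and the second is clean, the first function reports -ENOSPC while the second returns the clean range from the second group.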
Thanks
Chris Kottaridis    (chriskot@xxxxxxxxxxxxx)

----------------------------------------------------------------------

On Thu, 2008-02-28 at 14:22 -0800, Mingming Cao wrote:
> On Thu, 2008-02-28 at 11:45 -0700, Chris Kottaridis wrote:
> > I have a 2.6.21 kernel and under heavy load I seem to be traipsing
> > through this code in
> >
> > fs/ext3/balloc.c ext3_new_blocks():
> >
> >         if (in_range(le32_to_cpu(gdp->bg_block_bitmap), ret_block, num) ||
> >             in_range(le32_to_cpu(gdp->bg_inode_bitmap), ret_block, num) ||
> >             in_range(ret_block, le32_to_cpu(gdp->bg_inode_table),
> >                       EXT3_SB(sb)->s_itb_per_group) ||
> >             in_range(ret_block + num - 1, le32_to_cpu(gdp->bg_inode_table),
> >                       EXT3_SB(sb)->s_itb_per_group))
> >                 ext3_error(sb, "ext3_new_block",
> >                             "Allocating block in system zone - "
> >                             "blocks from "E3FSBLK", length %lu",
> >                              ret_block, num);
> >
> >         performed_allocation = 1;
> >
> > I get this error:
> >
> > Feb 12 13:29:24 slot0_10 kernel: EXT3-fs error (device dm-0):
> > ext3_new_block: Allocating block in system zone - blocks
> > from 86966272, length 5
> > Feb 12 13:29:24 slot0_10 kernel: Aborting journal on device dm-0.
> > Feb 12 13:29:24 slot0_10 kernel: EXT3-fs error (device dm-0) in
> > ext3_reserve_inode_write: Journal has aborted
> >
> > The call to ext3_error() not only prints the error message, but calls
> > ext3_handle_error() which apparently puts the file system into read-only
> > mode. Usually the filesystem needs to be rebuilt once this happens.
> >
> > I am not real sure what the "system zone" is about here or what the
> > message is trying to tell us. This doesn't seem to be an out-of-disk-space
> > error as there is a -ENOSPC error check earlier that we seem to get
> > past. It seems like the blocks were able to be allocated, so I am a
> > little curious about why the file system gets marked as read-only.
> >
> >
>
> It complains that the newly allocated block is located in the range that
> is used to store fs metadata (superblocks, bitmaps, inode table etc).
> Those fs metadata blocks should always be marked as used in the bitmaps,
> so the block allocator should never return a new block in that range if
> it does things correctly.
>
> > Any comments appreciated.
> >
>
> Is there any other error message printed in dmesg before the fs starts to
> complain?
>
> Mingming
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>