On Thu, 2008-03-20 at 11:16 +0300, Dmitri Monakhov wrote:
> On 21:39 Wed 19 Mar , Eric Sandeen wrote:
> > Solofo.Ramangalahy@xxxxxxxx wrote:
> > > Hello,
> > >
> > > During stress testing (workload: racer from ltp + fio/iometer), here
> > > is an error I am encountering:
> > > 8<------------------------------------------------------------------------------
> > > kernel: WARNING: at fs/buffer.c:1680 __block_write_full_page+0xd4/0x2af()
> >
> > So this is WARN_ON(bh->b_size != blocksize);
> >
> > What is b_size in this case?
> FS block size, because this page pinned bh (it comes from page_buffers(page)), but
> not dummy bh which may comes from {write,read}pages or direct_IO.
> Page's bh i_size must always be equal to fs blocksize.
> This bh always constructed via following construction
>     if (!page_has_buffers(page))
>         create_empty_buffers(page, 1<<inode->i_blkbits, flags)
> So page's bh->b_size was inited with right value from very beginning, but
> apparently somewhere this size was changed
> I guess i've localized buggy place, at least it's looks strange.
> ext4_da_get_block_prep ()
> {
>     ...
>     BUG_ON(create == 0);
>     BUG_ON(bh_result->b_size != inode->i_sb->s_blocksize);
>     ret = ext4_get_blocks_wrap(NULL, inode, iblock, 1, bh_result, 0, 0);
> #Here ext4_get_block_write called with max_blocks == 1 ^^^^^
>     ...
>     if (ret > 0) {
>         bh_result->b_size = (ret << inode->i_blkbits);
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> ## I don't understand this place. I hoped what (ret <= max_blocks) must always
> ##be true true. But after I've add debug info printing I've got following result.
>         ret = 0;
>     }
>     ...
> }
> Some times I've seen following ,message
> bh= {state=0,size=114688, blknr=18446744073709551615 dev=0000000000000000,count=0}, ret=28
> And because it was page-cache's bh later this result in WARNING.

I think the root cause is here: ext4_get_blocks_wrap() can return a number
of blocks greater than what the caller asked for, and it stores the
mapped/allocated length in bytes in bh->b_size.

For buffered IO without delalloc, get_block() (via ext4_get_blocks_wrap())
at write_begin time makes sure the buffer is mapped. So later, at
writepage()->block_write_full_page() time, we never enter the branch in
__block_write_full_page() that contains WARN_ON(bh->b_size != blocksize),
even if b_size was already changed to something greater than the blocksize
by ext4_get_blocks_wrap() at write_begin time.

The warning is only seen with delayed allocation because there we do the
get_block() lookup (via ext4_da_get_block_prep()) one block at a time, but
bh->b_size ends up storing the length of the whole extent, since
ext4_get_blocks_wrap() can return more blocks than the caller asked for.

static int __block_write_full_page(struct inode *inode, struct page *page,
                        get_block_t *get_block, struct writeback_control *wbc)
{
        ....
                if (!buffer_mapped(bh) && buffer_dirty(bh)) {
                        WARN_ON(bh->b_size != blocksize);
                        err = get_block(inode, block, bh, 1);
                        if (err)
                                goto recover;
                        if (buffer_new(bh)) {
                                /* blockdev mappings never come here */
                                clear_buffer_new(bh);
                                unmap_underlying_metadata(bh->b_bdev,
                                                        bh->b_blocknr);
                        }
                }

I think the fix should probably make sure that
ext4_get_blocks_handle()/ext4_ext_get_blocks() never map/allocate more
blocks than the caller asked for.

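To make the arithmetic concrete, here is a rough userspace sketch, not kernel code and not a patch. struct fake_bh, fake_get_blocks(), map_unclamped() and map_clamped() are all made-up names for illustration, and the 4096-byte block size (blkbits == 12) is an assumption inferred from size=114688 with ret=28 in the debug output above.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct buffer_head: only the field discussed here. */
struct fake_bh {
        size_t b_size;  /* number of bytes the mapping claims to cover */
};

/*
 * Hypothetical stand-in for the block-mapping call. It models the reported
 * behaviour: the lookup finds a whole extent and returns its length in
 * blocks, ignoring max_blocks.
 */
static int fake_get_blocks(unsigned long extent_len, unsigned long max_blocks)
{
        (void)max_blocks;
        return (int)extent_len;
}

/* Reported behaviour: b_size takes the length of the whole extent. */
static int map_unclamped(struct fake_bh *bh, unsigned int blkbits,
                         unsigned long extent_len, unsigned long max_blocks)
{
        int ret = fake_get_blocks(extent_len, max_blocks);

        if (ret > 0)
                bh->b_size = (size_t)ret << blkbits;
        return ret;
}

/* Suggested behaviour: never report more blocks than the caller asked for. */
static int map_clamped(struct fake_bh *bh, unsigned int blkbits,
                       unsigned long extent_len, unsigned long max_blocks)
{
        int ret = fake_get_blocks(extent_len, max_blocks);

        if (ret > 0 && (unsigned long)ret > max_blocks)
                ret = (int)max_blocks;
        if (ret > 0)
                bh->b_size = (size_t)ret << blkbits;
        return ret;
}
```

With blkbits == 12, a 28-block extent and max_blocks == 1, map_unclamped() leaves b_size at 28 << 12 = 114688, matching the debug message, while map_clamped() returns 1 and leaves b_size at 4096. Whether the clamp belongs in the wrapper or in the lower-level mapping functions is a separate question that this sketch does not try to answer.
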
Mingming

> > -Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html