On Tue, Apr 28, 2009 at 06:30:26PM -0700, Mingming wrote:
> On Wed, 2009-04-29 at 00:20 +0530, Aneesh Kumar K.V wrote:
> > We need to mark the buffer_head mapping prealloc space
> > as new during write_begin. Otherwise we don't zero out the
> > page cache content properly for a partial write. This will
> > cause file corruption with preallocation.
> >
> > Also use block number -1 as the fake block number so that
> > unmap_underlying_metadata doesn't drop wrong buffer_head
> >
> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
> >
> > ---
> >  fs/ext4/inode.c |   11 ++++++++++-
> >  1 files changed, 10 insertions(+), 1 deletions(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index e91f978..0214389 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -2318,11 +2318,20 @@ static int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
> >  			/* not enough space to reserve */
> >  			return ret;
> >
> > -		map_bh(bh_result, inode->i_sb, 0);
> > +		map_bh(bh_result, inode->i_sb, -1);
> >  		set_buffer_new(bh_result);
> >  		set_buffer_delay(bh_result);
> >  	} else if (ret > 0) {
> >  		bh_result->b_size = (ret << inode->i_blkbits);
> > +		bh_result->b_bdev = inode->i_sb->s_bdev;
> > +		bh->b_blocknr = -1;
>
> A small typo, should be bh_result->b_blocknr
>
> But won't this incorrectly set up the b_blocknr for a normal
> successful (allocated, non-preallocated) get_block lookup? As
> ext4_get_blocks_wrap() will return 1 (>0) if it found it allocated.
>
> > +		/*
> > +		 * With sub-block writes into unwritten extents
> > +		 * we also need to mark the buffer as new so that
> > +		 * the unwritten parts of the buffer gets correctly zeroed.
> > +		 */
> > +		if (buffer_unwritten(bh_result))
> > +			set_buffer_new(bh_result);
> >  		ret = 0;
> >  	}
> >
>
> I think it is nicer to set up the fake block_nr together with
> set_buffer_new(), at ext4_ext_get_block() time when it handles
> the preallocation lookup on delalloc. This will avoid calling the
> buffer_unwritten(bh_result) check for every bh result returned by
> ext4_get_blocks_wrap(). And it makes the logic saner.
>
> How about the patch attached? Tested with my testcase, the partial write
> preallocation corruption is fixed.
>
> But looking at the comment change, it looks like the original intention is
> to set the buffer unwritten so that a read from that uninitialized block
> returns 0. Turns out the VFS needs to set the buffer new for this
> purpose.

Should work. My only concern is that this change will have an impact on
the read path and on the non-delalloc case. For 2.6.30 I guess we can
make the change only for the delayed alloc case, which is less intrusive
(i.e. change only ext4_da_get_block_prep). I have split the patches
into two and will send a follow-up patch.

For .31 we want to return the same buffer_head flags that xfs sets for
delayed and unwritten extents.

-aneesh