The patch titled Subject: ocfs2: clear zero in unaligned direct IO has been added to the -mm tree. Its filename is ocfs2-clear-zero-in-unaligned-direct-io.patch This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-clear-zero-in-unaligned-direct-io.patch and later at http://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-clear-zero-in-unaligned-direct-io.patch Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Jia Guo <guojia12@xxxxxxxxxx> Subject: ocfs2: clear zero in unaligned direct IO Unused portion of a part-written fs-block-sized block is not set to zero in unaligned append direct write.This can lead to serious data inconsistencies. Ocfs2 manage disk with cluster size(for example, 1M), part-written in one cluster will change the cluster state from UN-WRITTEN to WRITTEN, VFS(function dio_zero_block) doesn't do the cleaning because bh's state is not set to NEW in function ocfs2_dio_wr_get_block when we write a WRITTEN cluster. For example, the cluster size is 1M, file size is 8k and we direct write from 14k to 15k, then 12k~14k and 15k~16k will contain dirty data. We have to deal with two cases: 1.The starting position of direct write is outside the file. 2.The starting position of direct write is located in the file. We need set bh's state to NEW in the first case. In the second case, we need mapped twice because bh's state of area out file should be set to NEW while area in file not. Link: http://lkml.kernel.org/r/5292e287-8f1a-fd4a-1a14-661e555e0bed@xxxxxxxxxx Signed-off-by: Jia Guo <guojia12@xxxxxxxxxx> Reviewed-by: Yiwen Jiang <jiangyiwen@xxxxxxxxxx> Cc: Mark Fasheh <mfasheh@xxxxxxxxxxx> Cc: Joel Becker <jlbec@xxxxxxxxxxxx> Cc: Junxiao Bi <junxiao.bi@xxxxxxxxxx> Cc: Joseph Qi <joseph.qi@xxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- --- a/fs/ocfs2/aops.c~ocfs2-clear-zero-in-unaligned-direct-io +++ a/fs/ocfs2/aops.c @@ -2152,13 +2152,30 @@ static int ocfs2_dio_wr_get_block(struct struct ocfs2_dio_write_ctxt *dwc = NULL; struct buffer_head *di_bh = NULL; u64 p_blkno; - loff_t pos = iblock << inode->i_sb->s_blocksize_bits; + unsigned i_blkbits = inode->i_sb->s_blocksize_bits; + loff_t pos = iblock << i_blkbits; + sector_t endblk = (i_size_read(inode) - 1) >> i_blkbits; unsigned len, total_len = bh_result->b_size; int ret = 0, first_get_block = 0; len = osb->s_clustersize - (pos & (osb->s_clustersize - 1)); len = min(total_len, len); + /* + * bh_result->b_size is count in get_more_blocks according to write + * "pos" and "end", we need map twice to return different buffer state: + * 1. area in file size, not set NEW; + * 2. area out file size, set NEW. + * + * iblock endblk + * |--------|---------|---------|--------- + * |<-------area in file------->| + */ + + if ((iblock <= endblk) && + ((iblock + ((len - 1) >> i_blkbits)) > endblk)) + len = (endblk - iblock + 1) << i_blkbits; + mlog(0, "get block of %lu at %llu:%u req %u\n", inode->i_ino, pos, len, total_len); @@ -2242,6 +2259,9 @@ static int ocfs2_dio_wr_get_block(struct if (desc->c_needs_zero) set_buffer_new(bh_result); + if (iblock > endblk) + set_buffer_new(bh_result); + /* May sleep in end_io. It should not happen in a irq context. So defer * it to dio work queue. */ set_buffer_defer_completion(bh_result); _ Patches currently in -mm which might be from guojia12@xxxxxxxxxx are ocfs2-optimize-the-reading-of-heartbeat-data.patch ocfs2-clear-zero-in-unaligned-direct-io.patch