On Fri, May 17, 2019 at 08:56:35PM +0800, Hou Tao wrote: > Hi, > > On 2019/5/17 14:59, Dave Chinner wrote: > > On Fri, May 17, 2019 at 10:41:44AM +0800, Hou Tao wrote: > >> Hi, > >> > >> I don't understand why the commit b450672fb66b ("iomap: sub-block dio needs to zeroout beyond EOF") is needed here: > >> > >> diff --git a/fs/iomap.c b/fs/iomap.c > >> index 72f3864a2e6b..77c214194edf 100644 > >> --- a/fs/iomap.c > >> +++ b/fs/iomap.c > >> @@ -1677,7 +1677,14 @@ iomap_dio_bio_actor(struct inode *inode, loff_t pos, loff_t length, > >> dio->submit.cookie = submit_bio(bio); > >> } while (nr_pages); > >> > >> - if (need_zeroout) { > >> + /* > >> + * We need to zeroout the tail of a sub-block write if the extent type > >> + * requires zeroing or the write extends beyond EOF. If we don't zero > >> + * the block tail in the latter case, we can expose stale data via mmap > >> + * reads of the EOF block. > >> + */ > >> + if (need_zeroout || > >> + ((dio->flags & IOMAP_DIO_WRITE) && pos >= i_size_read(inode))) { > >> /* zero out from the end of the write to the end of the block */ > >> pad = pos & (fs_block_size - 1); > >> if (pad) > >> > >> If need_zeroout is false, it means the block neither is a unwritten block nor > >> a newly-mapped block, but that also means the block must had been a unwritten block > >> or a newly-mapped block before this write, so the block must have been zeroed, correct ? > > > > No. One the contrary, it's a direct IO write to beyond the end of > > the file which means the block has not been zeroed at all. If it is > > an unwritten extent, it most definitely does not contain zeroes > > (unwritten extents are a flag in the extent metadata, not zeroed > > disk space) and so it doesn't matter it is written or unwritten we > > must zero it before we update the file size. > > > Ah, I still can not understand the reason why "the block has not been zeroed at all". > Do you mean the scenario in which the past-EOF write tries to write an already mapped > block and the past-EOF part of this block has not been zeroed ? Yup. Speculative delayed allocation beyond EOF triggered by mmap() or buffered writes() that has then been allocated by writeback, then we do a dio that extends EOF into that space. > Because if the past-EOF write tries to write to a new allocated block, then IOMAP_F_NEW > must have been set in iomap->flags and need_zeroout will be true. If it tries to write > to an unwritten block, need_zeroout will also be true. But if it tries to write to an already allocated, written block, need_zeroout is not true and it will need to zero. > > If it tries to write a block in which the past-EOF part has not been zeroed, even without > the past-EOF direct write, the data exposure is still possible, right ? Not if we zero both sides of the dio correctly before we extend the file size on disk. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx