On Fri 06-10-17 10:04:41, HUANG Weller (CM/ESW12-CN) wrote:
> Hello Mr. Davis and Mr. Kara,
>
> I checked the latest 3.14 source code 3.14.79. I didn't found the below patch.
>
> It should be there , right ?

Well, I'm not sure who runs the 3.14 stable tree (it's not listed at
kernel.org). It's up to him to pick up patches...

								Honza

> > -----Original Message-----
> > From: Jan Kara [mailto:jack@xxxxxxx]
> > Sent: Wednesday, June 29, 2016 3:46 PM
> > To: George G. Davis <george_davis@xxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx; Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>; Behme Dirk
> > (CM/ESO2) <Dirk.Behme@xxxxxxxxxxxx>; Jan Kara <jack@xxxxxxx>; Theodore
> > Ts'o <tytso@xxxxxxx>; HUANG Weller (CM/ESW12-CN)
> > <Weller.Huang@xxxxxxxxxxxx>
> > Subject: Re: [PATCH 3.14] ext4: fix data exposure after a crash
> >
> > The backport looks good to me.
> >
> > 								Honza
> >
> > On Tue 28-06-16 18:31:48, George G. Davis wrote:
> > > From: Jan Kara <jack@xxxxxxx>
> > >
> > > From: Jan Kara <jack@xxxxxxx>
> > >
> > > commit 06bd3c36a733ac27962fea7d6f47168841376824 upstream
> > >
> > > Huang has reported that in his powerfail testing he is seeing stale
> > > block contents in some of recently allocated blocks although he mounts
> > > ext4 in data=ordered mode. After some investigation I have found out
> > > that indeed when delayed allocation is used, we don't add inode to
> > > transaction's list of inodes needing flushing before commit.
> > > Originally we were doing that but commit f3b59291a69d removed the
> > > logic with a flawed argument that it is not needed.
> > >
> > > The problem is that although for delayed allocated blocks we write
> > > their contents immediately after allocating them, there is no
> > > guarantee that the IO scheduler or device doesn't reorder things and
> > > thus transaction allocating blocks and attaching them to inode can
> > > reach stable storage before actual block contents.
> > > Actually whenever we attach freshly allocated blocks to inode using a
> > > written extent, we should add inode to transaction's ordered inode list
> > > to make sure we properly wait for block contents to be written before
> > > committing the transaction. So that is what we do in this patch. This
> > > also handles other cases where stale data exposure was possible - like
> > > filling hole via mmap in data=ordered,nodelalloc mode.
> > >
> > > The only exception to the above rule are extending direct IO writes
> > > where blkdev_direct_IO() waits for IO to complete before increasing
> > > i_size and thus stale data exposure is not possible. For now we don't
> > > complicate the code with optimizing this special case since the
> > > overhead is pretty low. In case this is observed to be a performance
> > > problem we can always handle it using a special flag to ext4_map_blocks().
> > >
> > > Fixes: f3b59291a69d0b734be1fc8be489fef2dd846d3d
> > > Reported-by: "HUANG Weller (CM/ESW12-CN)" <Weller.Huang@xxxxxxxxxxxx>
> > > Tested-by: "HUANG Weller (CM/ESW12-CN)" <Weller.Huang@xxxxxxxxxxxx>
> > > Signed-off-by: Jan Kara <jack@xxxxxxx>
> > > Signed-off-by: Theodore Ts'o <tytso@xxxxxxx>
> > > [weller: fix conflict with 3.14 kernel]
> > > Signed-off-by: weller huang <weller.huang@xxxxxxxxxxxx>
> > > Signed-off-by: George G. Davis <george_davis@xxxxxxxxxx>
> > > ---
> > > gdavis: Confirmed that backport conflicts are due to lack of upstream
> > >         commits c86d8db ("ext4: implement allocation of pre-zeroed
> > >         blocks") and 09cbfea ("mm, fs: get rid of PAGE_CACHE_* and
> > >         page_cache_{get,release} macros") in v3.14.37. The conflict
> > >         resolution therefore appears to be correct.
> > > ---
> > >  fs/ext4/inode.c | 23 ++++++++++++++---------
> > >  1 file changed, 14 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index 58001fc..d33a80e 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -695,6 +695,20 @@ has_zeroout:
> > >  		int ret = check_block_validity(inode, map);
> > >  		if (ret != 0)
> > >  			return ret;
> > > +
> > > +		/*
> > > +		 * Inodes with freshly allocated blocks where contents will be
> > > +		 * visible after transaction commit must be on transaction's
> > > +		 * ordered data list.
> > > +		 */
> > > +		if (map->m_flags & EXT4_MAP_NEW &&
> > > +		    !(map->m_flags & EXT4_MAP_UNWRITTEN) &&
> > > +		    !IS_NOQUOTA(inode) &&
> > > +		    ext4_should_order_data(inode)) {
> > > +			ret = ext4_jbd2_file_inode(handle, inode);
> > > +			if (ret)
> > > +				return ret;
> > > +		}
> > >  	}
> > >  	return retval;
> > >  }
> > > @@ -1059,15 +1073,6 @@ static int ext4_write_end(struct file *file,
> > >  	int i_size_changed = 0;
> > >
> > >  	trace_ext4_write_end(inode, pos, len, copied);
> > > -	if (ext4_test_inode_state(inode, EXT4_STATE_ORDERED_MODE)) {
> > > -		ret = ext4_jbd2_file_inode(handle, inode);
> > > -		if (ret) {
> > > -			unlock_page(page);
> > > -			page_cache_release(page);
> > > -			goto errout;
> > > -		}
> > > -	}
> > > -
> > >  	if (ext4_has_inline_data(inode)) {
> > >  		ret = ext4_write_inline_data_end(inode, pos, len,
> > >  						 copied, page);
> > > --
> > > 1.9.3
> > >
> >
> > --
> > Jan Kara <jack@xxxxxxxx>
> > SUSE Labs, CR
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
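
For readers following the thread, the condition the patch adds after
check_block_validity() can be looked at in isolation. The following is only
a userspace sketch, not kernel code: the SKETCH_MAP_* flag values and the
function name sketch_needs_ordered_filing() are hypothetical, and the two
boolean parameters stand in for what IS_NOQUOTA(inode) and
ext4_should_order_data(inode) would report in the kernel.

```c
#include <stdbool.h>

/*
 * Hypothetical flag values for this sketch only. They stand in for
 * EXT4_MAP_NEW and EXT4_MAP_UNWRITTEN; the kernel's actual definitions
 * are not reproduced here.
 */
#define SKETCH_MAP_NEW        (1u << 0)
#define SKETCH_MAP_UNWRITTEN  (1u << 1)

/*
 * Mirrors the four-part test from the patch: the inode has to go on the
 * transaction's ordered data list only when the mapping contains freshly
 * allocated blocks (NEW) attached as a written extent (not UNWRITTEN),
 * the inode is not a quota file, and the inode uses data=ordered
 * semantics.
 */
static bool sketch_needs_ordered_filing(unsigned int m_flags,
					bool is_noquota,
					bool should_order_data)
{
	return (m_flags & SKETCH_MAP_NEW) &&
	       !(m_flags & SKETCH_MAP_UNWRITTEN) &&
	       !is_noquota &&
	       should_order_data;
}
```

In this sketch a new written mapping on an ordered-mode, non-quota inode is
the only combination that returns true. A new but unwritten extent returns
false, since stale block contents cannot be read back through an unwritten
extent.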