On Wed, Feb 2, 2022 at 4:34 AM Ritesh Harjani <riteshh@xxxxxxxxxxxxx> wrote:
>
> Hello Xin,
>
> Sorry about revisiting this thread so late :(
> Recently when I was working on one of the fast_commit issue, I got interested
> in looking into some of those recent fast_commit fixes.
>
> Hence some of these queries.
>
> On 21/12/23 11:23AM, Xin Yin wrote:
> > For now ,we use ext4_punch_hole() during fast commit replay delete range
> > procedure. But it will be affected by inode->i_size, which may not
> > correct during fast commit replay procedure. The following test will
> > failed.
> >
> > -create & write foo (len 1000K)
> > -falloc FALLOC_FL_ZERO_RANGE foo (range 400K - 600K)
> > -create & fsync bar
> ^^^^ do you mean "fsync foo" or is this actually a new file create and fsync
> bar?

bar is a newly created file, a sibling of foo in the same directory. The
layout looks like this:
./foo
./bar

> >
> > -falloc FALLOC_FL_PUNCH_HOLE foo (range 300K-500K)
> > -fsync foo
> > -crash before a full commit
> >
> > After the fast_commit reply procedure, the range 400K-500K will not be
> > removed. Because in this case, when calling ext4_punch_hole() the
> > inode->i_size is 0, and it just retruns with doing nothing.
>
> I tried looking into this, but I am not able to put my head around that when
> will the inode->i_size will be 0?
>
> So, what I think should happen is when you are doing falocate/fsync foo in your
> above list of operations then, anyways the inode i_disksize will be updated
> using ext4_mark_inode_dirty() and during replay phase inode->i_size will hold
> the right value no?

Yes, inode->i_size holds the right value in the end, because
ext4_fc_replay_inode() updates the inode to its final state. But
ext4_fc_replay_inode() is usually the last step of the replay phase, so
before it runs inode->i_size may not be correct.

>
> Could you please help understand when, where and how will inode->i_size will be
> 0?

I didn't check why inode->i_size is 0 in this case.
I just think inode->i_size should not affect the behavior of the replay
phase. Another case: inode->i_size may not cover unwritten blocks, so if a
file has unwritten blocks at its tail, we cannot use ext4_punch_hole() to
remove the unwritten blocks beyond i_size during the replay phase.

>
> Also - it would be helpful if you have some easy reproducer of this issue you
> mentioned.

The attached test code reproduces this issue. Hope it helps.

>
> -ritesh
>
> >
> > Change to use ext4_ext_remove_space() instead of ext4_punch_hole()
> > to remove blocks of inode directly.
> >
> > Signed-off-by: Xin Yin <yinxin.x@xxxxxxxxxxxxx>
> > ---
> >  fs/ext4/fast_commit.c | 13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > index aa05b23f9c14..3deb97b22ca4 100644
> > --- a/fs/ext4/fast_commit.c
> > +++ b/fs/ext4/fast_commit.c
> > @@ -1708,11 +1708,14 @@ ext4_fc_replay_del_range(struct super_block *sb, struct ext4_fc_tl *tl,
> >  		}
> >  	}
> >
> > -	ret = ext4_punch_hole(inode,
> > -		le32_to_cpu(lrange.fc_lblk) << sb->s_blocksize_bits,
> > -		le32_to_cpu(lrange.fc_len) << sb->s_blocksize_bits);
> > -	if (ret)
> > -		jbd_debug(1, "ext4_punch_hole returned %d", ret);
> > +	down_write(&EXT4_I(inode)->i_data_sem);
> > +	ret = ext4_ext_remove_space(inode, lrange.fc_lblk,
> > +				lrange.fc_lblk + lrange.fc_len - 1);
> > +	up_write(&EXT4_I(inode)->i_data_sem);
> > +	if (ret) {
> > +		iput(inode);
> > +		return 0;
> > +	}
> >  	ext4_ext_replay_shrink_inode(inode,
> >  		i_size_read(inode) >> sb->s_blocksize_bits);
> >  	ext4_mark_inode_dirty(NULL, inode);
> > --
> > 2.20.1
> >
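
To illustrate why a stale i_size matters here, below is a minimal userspace sketch of how a punch-hole request gets limited by i_size. It is only a model, not the kernel code: the function name model_punch_len() and the 4K MODEL_PAGE_SIZE are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this model; the real value depends on the architecture. */
#define MODEL_PAGE_SIZE 4096

/*
 * Hypothetical userspace model (not kernel code) of how a punch-hole
 * request is limited by i_size. Returns the number of bytes, starting at
 * offset, that would still be considered for removal.
 */
int64_t model_punch_len(int64_t i_size, int64_t offset, int64_t length)
{
	assert(i_size >= 0 && offset >= 0 && length > 0);

	/* A request that starts at or beyond i_size is a no-op. */
	if (offset >= i_size)
		return 0;

	/* A hole reaching past i_size is cut at the end of the page holding i_size. */
	if (offset + length > i_size)
		length = i_size + MODEL_PAGE_SIZE
			 - (i_size & (MODEL_PAGE_SIZE - 1)) - offset;

	return length;
}
```

With i_size == 0, as seen during replay in the test, model_punch_len(0, 307200, 204800) returns 0, so nothing is removed. With the final i_size of 1024000 the whole 204800 bytes are kept. With an i_size that ends before the unwritten tail (say 409600), the request is cut short, which is the second case mentioned above.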
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <string.h>
#include <fcntl.h>
#include <sys/stat.h>

#define FILE_SIZE	1024000	/* 1000K */
#define HOLE_START	409600	/* 400K */
#define HOLE_LEN	204800	/* 200K */
#define HOLE_SHIFT	102400	/* 100K */
#define PATTERN_LEN	32

int main(int argc, char *argv[])
{
	int fd;
	int fd_bar;
	int ret;
	ssize_t written;
	char *data_foo;
	char bar_path[256];
	size_t offset_foo = 0;
	const char *text_foo = "ddddddddddklmnopqrstuvwxyz123456";

	if (argc != 2) {
		printf("usage: a.out [file Path]\n");
		exit(1);
	}

	/* bar lives next to foo: <path>_bar */
	snprintf(bar_path, sizeof(bar_path), "%s_bar", argv[1]);
	printf("file path: %s \n", argv[1]);
	printf("file_bar path: %s \n", bar_path);

	/* create & write foo (len 1000K) */
	fd = open(argv[1], O_CREAT | O_RDWR, 0755);
	if (fd < 0) {
		printf("open err! \n");
		exit(1);
	}

	data_foo = malloc(FILE_SIZE);
	if (!data_foo) {
		printf("malloc err! \n");
		exit(1);
	}

	/* Fill the buffer with the repeating 32-byte pattern. */
	while (offset_foo < FILE_SIZE) {
		size_t chunk = FILE_SIZE - offset_foo;

		if (chunk > PATTERN_LEN)
			chunk = PATTERN_LEN;
		memcpy(data_foo + offset_foo, text_foo, chunk);
		offset_foo += chunk;
	}

	written = pwrite(fd, data_foo, FILE_SIZE, 0);
	if (written != FILE_SIZE) {
		printf("write err! [%zd] \n", written);
		exit(1);
	}
	free(data_foo);

	/* falloc FALLOC_FL_ZERO_RANGE foo (range 400K - 600K) */
	ret = fallocate(fd, FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
			HOLE_START, HOLE_LEN);
	if (ret < 0) {
		printf("fallocate err! [%d] \n", ret);
		exit(1);
	}

	/* create & fsync bar */
	fd_bar = open(bar_path, O_CREAT | O_RDWR, 0755);
	if (fd_bar < 0) {
		printf("open fd_bar err! \n");
		exit(1);
	}
	fsync(fd_bar);
	close(fd_bar);

	/* falloc FALLOC_FL_PUNCH_HOLE foo (range 300K - 500K) */
	ret = fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			HOLE_START - HOLE_SHIFT, HOLE_LEN);
	if (ret < 0) {
		printf("fallocate err! [%d] \n", ret);
		exit(1);
	}

	/* fsync foo; crash before a full commit to trigger the replay */
	fsync(fd);
	close(fd);
	exit(0);
}
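
The patch passes an inclusive logical block range (fc_lblk to fc_lblk + fc_len - 1) to ext4_ext_remove_space(). As a rough sketch of which blocks that means for the punch-hole call in the test program, the helper below converts a block-aligned byte range into such an inclusive range. The struct lblk_range type and the bytes_to_lblk_range() function are hypothetical names made up for this example, and the 4K block size (blkbits == 12) used in the numbers afterwards is an assumption about the test filesystem.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive range of logical blocks (hypothetical type for this example). */
struct lblk_range {
	uint32_t start;
	uint32_t end;	/* inclusive */
};

/*
 * Hypothetical helper: convert a block-aligned byte range into an
 * inclusive logical block range. blkbits is log2 of the block size.
 */
struct lblk_range bytes_to_lblk_range(uint64_t offset, uint64_t len,
				      unsigned int blkbits)
{
	uint64_t blocksize = 1ULL << blkbits;
	struct lblk_range r;

	/* This sketch only handles non-empty, block-aligned ranges. */
	assert(len > 0 && offset % blocksize == 0 && len % blocksize == 0);

	r.start = (uint32_t)(offset >> blkbits);
	/* "- 1" because the end block is inclusive, as in the patch. */
	r.end = r.start + (uint32_t)(len >> blkbits) - 1;
	return r;
}
```

Under the 4K assumption, the punch-hole range 300K-500K (offset 307200, length 204800) maps to blocks 75 through 124, and the zero range 400K-600K maps to blocks 100 through 149. The overlap, blocks 100 through 124, is the 400K-500K range that is left behind when the replay goes through ext4_punch_hole() with i_size == 0.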