On Thu, Aug 8, 2024 at 8:07 AM Ryusuke Konishi wrote:
>
> After commit a694291a6211 ("nilfs2: separate wait function from
> nilfs_segctor_write") was applied, the log writing function
> nilfs_segctor_do_construct() was able to issue I/O requests
> continuously even if user data blocks were split into multiple logs
> across segments, but two potential flaws were introduced in its error
> handling.
>
> First, if nilfs_segctor_begin_construction() fails while creating the
> second or subsequent logs, the log writing function returns without
> calling nilfs_segctor_abort_construction(), so the writeback flag set
> on pages/folios will remain uncleared. This causes page cache
> operations to hang waiting for the writeback flag. For example,
> truncate_inode_pages_final(), which is called via nilfs_evict_inode()
> when an inode is evicted from memory, will hang.
>
> Second, the NILFS_I_COLLECTED flag set on normal inodes remains
> uncleared. As a result, if the next log write involves checkpoint
> creation, that's fine, but if a partial log write is performed that
> does not, inodes with NILFS_I_COLLECTED set are erroneously removed
> from the "sc_dirty_files" list, and their data and b-tree blocks may
> not be written to the device, corrupting the block mapping.
>
> Fix these issues by correcting the jump destination of the error
> branch in nilfs_segctor_do_construct() and the condition for calling
> nilfs_redirty_inodes(), which clears the NILFS_I_COLLECTED flag.
>
> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxx>
> Fixes: a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write")
> Tested-by: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
> Hi Andrew, please apply this as a bug fix.
>
> This fixes error path flaws of the log writing function that were
> discovered during error injection testing, which could lead to a hang
> due to the writeback flag not being cleared on folios, and potential
> filesystem corruption due to missing blocks in the log after an error.
>
> Thanks,
> Ryusuke Konishi

Andrew, please stop sending this patch upstream.

Another error injection test revealed a problem with the way this
patch changes the error path, so I would like to create a revised
version.

The other two bug fix patches I have sent will not be affected.

Thanks,
Ryusuke Konishi

>
>  fs/nilfs2/segment.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
> index 0ca3110d6386..8b3225bd08ed 100644
> --- a/fs/nilfs2/segment.c
> +++ b/fs/nilfs2/segment.c
> @@ -2056,7 +2056,7 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode)
>
>  		err = nilfs_segctor_begin_construction(sci, nilfs);
>  		if (unlikely(err))
> -			goto out;
> +			goto failed;
>
>  		/* Update time stamp */
>  		sci->sc_seg_ctime = ktime_get_real_seconds();
> @@ -2120,10 +2120,9 @@ static int nilfs_segctor_do_construct(struct nilfs_sc_info *sci, int mode)
>  	return err;
>
>   failed_to_write:
> -	if (sci->sc_stage.flags & NILFS_CF_IFILE_STARTED)
> -		nilfs_redirty_inodes(&sci->sc_dirty_files);
> -
>   failed:
> +	if (mode == SC_LSEG_SR && nilfs_sc_cstage_get(sci) >= NILFS_ST_IFILE)
> +		nilfs_redirty_inodes(&sci->sc_dirty_files);
>  	if (nilfs_doing_gc())
>  		nilfs_redirty_inodes(&sci->sc_gc_inodes);
>  	nilfs_segctor_abort_construction(sci, nilfs, err);
> --
> 2.34.1
>
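
For readers following the thread, the first flaw described in the quoted
commit message (an early return that skips the abort step once I/O has
already been issued for an earlier log) can be sketched with a toy
user-space model. This is only an illustration under stated assumptions,
not the nilfs2 code: every name below (toy_page, toy_begin_log,
toy_clear_writeback, toy_count_writeback, toy_construct) is hypothetical,
and a plain bool stands in for the writeback flag on a page/folio.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page/folio: only the writeback flag is modeled. */
struct toy_page {
	bool writeback;
};

/* Hypothetical "begin construction" step: fails when starting log `fail_at`. */
static int toy_begin_log(int log_index, int fail_at)
{
	return (log_index == fail_at) ? -1 : 0;
}

/* Clears the writeback flag on every page (models I/O completion or abort). */
static void toy_clear_writeback(struct toy_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		pages[i].writeback = false;
}

/* Counts pages still marked as under writeback. */
static size_t toy_count_writeback(const struct toy_page *pages, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (pages[i].writeback)
			count++;
	return count;
}

/*
 * Writes up to `nlogs` logs, one page per log.  When a begin step fails:
 *   abort_on_begin_failure == false: return directly (the "goto out" shape),
 *   abort_on_begin_failure == true:  run the abort step (the "goto failed" shape).
 */
static int toy_construct(struct toy_page *pages, size_t n, int nlogs,
			 int fail_at, bool abort_on_begin_failure)
{
	int err = 0;

	for (int i = 0; i < nlogs && (size_t)i < n; i++) {
		err = toy_begin_log(i, fail_at);
		if (err) {
			if (abort_on_begin_failure)
				goto failed;
			goto out;
		}
		pages[i].writeback = true;	/* I/O issued for this log */
	}

	toy_clear_writeback(pages, n);		/* success: I/O completed */
out:
	return err;

failed:
	toy_clear_writeback(pages, n);		/* abort: undo the flags */
	return err;
}
```

In this model, calling toy_construct() with three logs and a failure at
the second begin step leaves one page flagged when the abort step is
skipped, and none when it runs. Anything waiting on that flag would
never be woken, which corresponds to the hang the commit message
describes. The sketch says nothing about whether jumping to "failed" is
safe in the real function.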