Subject: + nilfs2-fix-segctor-bug-that-causes-file-system-corruption.patch added to -mm tree
To: andreas.rohner@xxxxxxx,konishi.ryusuke@xxxxxxxxxxxxx,stable@xxxxxxxxxxxxxxx
From: akpm@xxxxxxxxxxxxxxxxxxxx
Date: Wed, 08 Jan 2014 12:47:17 -0800


The patch titled
     Subject: nilfs2: fix segctor bug that causes file system corruption
has been added to the -mm tree.  Its filename is
     nilfs2-fix-segctor-bug-that-causes-file-system-corruption.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/nilfs2-fix-segctor-bug-that-causes-file-system-corruption.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/nilfs2-fix-segctor-bug-that-causes-file-system-corruption.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Andreas Rohner <andreas.rohner@xxxxxxx>
Subject: nilfs2: fix segctor bug that causes file system corruption

There is a bug in the function nilfs_segctor_collect, which results in
active data being written to a segment that is marked as clean.  It is
possible that this segment is selected for a later segment construction,
whereby the old data is overwritten.

The problem shows itself with the following kernel log message:

  nilfs_sufile_do_cancel_free: segment 6533 must be clean

Usually a few hours later the file system gets corrupted:

  NILFS: bad btree node (blocknr=8748107): level = 0, flags = 0x0, nchildren = 0
  NILFS error (device sdc1): nilfs_bmap_last_key: broken bmap (inode number=114660)

The issue can be reproduced with a file system that is nearly full and
with the cleaner running while some IO-intensive task is running,
although it is quite hard to reproduce.

This is what happens:

 1. The cleaner starts the segment construction
 2. nilfs_segctor_collect is called
 3. sc_stage is on NILFS_ST_SUFILE and segments are freed
 4. sc_stage is on NILFS_ST_DAT and the current segment is full
 5. nilfs_segctor_extend_segments is called, which
    allocates a new segment
 6. The new segment is one of the segments freed in step 3
 7. nilfs_sufile_cancel_freev is called and produces an error message
 8. Loop around and the collection starts again
 9. sc_stage is on NILFS_ST_SUFILE and segments are freed,
    including the newly allocated segment, which will contain active
    data and can be allocated at a later time
10. A few hours later another segment construction allocates the
    segment and causes file system corruption

This can be prevented by simply reordering the statements.  If
nilfs_sufile_cancel_freev is called before nilfs_segctor_extend_segments,
the freed segments are marked as dirty and cannot be allocated any more.

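The ordering problem in steps 3 to 9 can be illustrated with a small
userspace toy model.  This is only a sketch and not the kernel code: the
toy_* names, the four-segment table and the first-clean allocation policy
are assumptions made up for illustration, and the NILFS_CF_SUFREED flag
handling added by the patch is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSEGS 4
#define NO_SEGMENT (-1)

enum seg_state { SEG_CLEAN, SEG_DIRTY };

/* Hypothetical stand-in for the segment usage file. */
struct toy_sufile {
	enum seg_state state[NSEGS];
};

/* Mark the listed segments clean (the freeing done in the SUFILE stage). */
static void toy_free_segs(struct toy_sufile *su, const int *segs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		su->state[segs[i]] = SEG_CLEAN;
}

/*
 * Undo a free by marking the listed segments dirty again.  Returns the
 * number of segments that were not clean, i.e. the "must be clean" case.
 */
static int toy_cancel_free(struct toy_sufile *su, const int *segs, size_t n)
{
	int nerr = 0;

	for (size_t i = 0; i < n; i++) {
		if (su->state[segs[i]] != SEG_CLEAN) {
			nerr++;
			continue;
		}
		su->state[segs[i]] = SEG_DIRTY;
	}
	return nerr;
}

/* Allocate the first clean segment, or return NO_SEGMENT if none is left. */
static int toy_alloc(struct toy_sufile *su)
{
	for (int i = 0; i < NSEGS; i++) {
		if (su->state[i] == SEG_CLEAN) {
			su->state[i] = SEG_DIRTY;
			return i;
		}
	}
	return NO_SEGMENT;
}

/*
 * Run two collection passes.  Returns true if the segment that received
 * new data ends up marked clean, which is the corruption hazard.
 */
static bool toy_collect(bool cancel_before_extend)
{
	struct toy_sufile su = {
		{ SEG_DIRTY, SEG_DIRTY, SEG_DIRTY, SEG_CLEAN }
	};
	const int freesegs[] = { 1 };	/* segment reclaimed by the cleaner */
	int newseg;

	/* First pass: segments are freed, then the current segment fills up. */
	toy_free_segs(&su, freesegs, 1);
	if (cancel_before_extend) {
		toy_cancel_free(&su, freesegs, 1);	/* patched order */
		newseg = toy_alloc(&su);
	} else {
		newseg = toy_alloc(&su);		/* original order */
		toy_cancel_free(&su, freesegs, 1);
	}
	assert(newseg != NO_SEGMENT);

	/* Second pass: the same segments are freed again. */
	toy_free_segs(&su, freesegs, 1);

	return su.state[newseg] == SEG_CLEAN;
}
```

In this model, calling toy_collect(false) hands out the just-freed
segment and then frees it again while it holds new data, whereas
toy_collect(true) makes the allocator skip it because the cancel has
already marked it dirty.
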
Signed-off-by: Andreas Rohner <andreas.rohner@xxxxxxx>
Reviewed-by: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx>
Tested-by: Andreas Rohner <andreas.rohner@xxxxxxx>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@xxxxxxxxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 fs/nilfs2/segment.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff -puN fs/nilfs2/segment.c~nilfs2-fix-segctor-bug-that-causes-file-system-corruption fs/nilfs2/segment.c
--- a/fs/nilfs2/segment.c~nilfs2-fix-segctor-bug-that-causes-file-system-corruption
+++ a/fs/nilfs2/segment.c
@@ -1440,17 +1440,19 @@ static int nilfs_segctor_collect(struct
 
 		nilfs_clear_logs(&sci->sc_segbufs);
 
-		err = nilfs_segctor_extend_segments(sci, nilfs, nadd);
-		if (unlikely(err))
-			return err;
-
 		if (sci->sc_stage.flags & NILFS_CF_SUFREED) {
 			err = nilfs_sufile_cancel_freev(nilfs->ns_sufile,
 							sci->sc_freesegs,
 							sci->sc_nfreesegs,
 							NULL);
 			WARN_ON(err); /* do not happen */
+			sci->sc_stage.flags &= ~NILFS_CF_SUFREED;
 		}
+
+		err = nilfs_segctor_extend_segments(sci, nilfs, nadd);
+		if (unlikely(err))
+			return err;
+
 		nadd = min_t(int, nadd << 1, SC_MAX_SEGDELTA);
 		sci->sc_stage = prev_stage;
 	}
_

Patches currently in -mm which might be from andreas.rohner@xxxxxxx are

nilfs2-fix-segctor-bug-that-causes-file-system-corruption.patch

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html