We already allocate delalloc blocks before changing the inode mode into "per-file data journal" mode to prevent delalloc blocks from remaining not allocated, but another issue concerned with "BH_Unwritten" status still exists. For example, by fallocate(), several buffers' status change into "BH_Unwritten", but these buffers cannot be processed by ext4_alloc_da_blocks(). So, they still remain in unwritten status after per-file data journaling is enabled and they cannot be changed into written status any more and, if they are journaled and eventually checkpointed, these unwritten buffer will cause a kernel panic by the below BUG_ON() function of submit_bh_wbc() when they are submitted during checkpointing. static int submit_bh_wbc(int rw, struct buffer_head *bh,... { ... BUG_ON(buffer_unwritten(bh)); Moreover, when "dioread_nolock" option is enabled, the status of a buffer is changed into "BH_Unwritten" after write_begin() completes and the "BH_Unwritten" status will be cleared after I/O is done. Therefore, if a buffer's status is changed into unwrutten but the buffer's I/O is not submitted and completed, it can cause the same problem after enabling per-file data journaling. You can easily generate this bug by executing the following command. ./kvm-xfstests -C 10000 -m nodelalloc,dioread_nolock generic/269 To resolve these problems and define a boundary between the previous mode and per-file data journaling mode, we need to flush and wait all the I/O of buffers of a file before enabling per-file data journaling of the file. Signed-off-by: Daeho Jeong <daeho.jeong@xxxxxxxxxxx> --- fs/ext4/inode.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 612fbcf..1f9458e 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5168,9 +5168,14 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val) * be allocated any more. even more truncate on delalloc blocks * could trigger BUG by flushing delalloc blocks in journal. * There is no delalloc block in non-journal data mode. + * We also have to handle unwritten buffers generated by + * fallocate() and dioread_nolock option. Once per-file data + * journaling is enabled, unwritten buffers will remain in + * unwritten status forever and they will be the seeds of + * kernel panic when they are checkpointed. */ - if (val && test_opt(inode->i_sb, DELALLOC)) { - err = ext4_alloc_da_blocks(inode); + if (val) { + err = filemap_write_and_wait(inode->i_mapping); if (err < 0) return err; } -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html