Hi Jan, Thanks for reviewing. On Tue, Aug 18, 2020 at 11:52 AM Jan Kara <jack@xxxxxxx> wrote: > > On Sun 09-08-20 22:02:05, Mauricio Faria de Oliveira wrote: > > Add the callbacks as opt-in to override the default behavior for > > the transaction's inode list, instead of moving that code around. > > > > This is important as not only ext4 uses the inode list: ocfs2 too, > > via jbd2_journal_inode_ranged_write(), and maybe out-of-tree code. > > I'd prefer if the callback is called unconditionally, jbd2 exports the > callback that implements the current behavior and and both ext4 & ocfs2 > are adapted to use this callback. We don't care about out of tree code. > That way things are cleaner long term... Understood. > > > To opt-out of the default behavior (i.e., to do nothing), one has > > to opt-in with a no-op function. > > Your Signed-off-by is missing for this patch. Oh, I thought it wasn't needed in RFC patches. Thanks for the suggestions below; they're more precise and descriptive. I had a few questions in the cover letter, but in case you didn't have the time, I'll just try harder on them; no worries. Kind regards, Mauricio > > > --- > > fs/jbd2/commit.c | 21 ++++++++++++++++----- > > include/linux/jbd2.h | 21 ++++++++++++++++++++- > > 2 files changed, 36 insertions(+), 6 deletions(-) > > > > diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c > > index 51f713089e35..b98d227b50d8 100644 > > --- a/fs/jbd2/commit.c > > +++ b/fs/jbd2/commit.c > > @@ -237,10 +237,14 @@ static int journal_submit_data_buffers(journal_t *journal, > > * instead of writepages. Because writepages can do > > * block allocation with delalloc. We need to write > > * only allocated blocks here. > > + * This can be overriden with a custom callback. > > */ > > trace_jbd2_submit_inode_data(jinode->i_vfs_inode); > > - err = journal_submit_inode_data_buffers(mapping, dirty_start, > > - dirty_end); > > + if (journal->j_submit_inode_data_buffers) > > + err = journal->j_submit_inode_data_buffers(jinode); > > + else > > + err = journal_submit_inode_data_buffers(mapping, > > + dirty_start, dirty_end); > > if (!ret) > > ret = err; > > spin_lock(&journal->j_list_lock); > > @@ -274,9 +278,16 @@ static int journal_finish_inode_data_buffers(journal_t *journal, > > continue; > > jinode->i_flags |= JI_COMMIT_RUNNING; > > spin_unlock(&journal->j_list_lock); > > - err = filemap_fdatawait_range_keep_errors( > > - jinode->i_vfs_inode->i_mapping, dirty_start, > > - dirty_end); > > + /* > > + * Wait for the inode data buffers writeout. > > + * This can be overriden with a custom callback. > > + */ > > + if (journal->j_finish_inode_data_buffers) > > + err = journal->j_finish_inode_data_buffers(jinode); > > + else > > + err = filemap_fdatawait_range_keep_errors( > > + jinode->i_vfs_inode->i_mapping, > > + dirty_start, dirty_end); > > if (!ret) > > ret = err; > > spin_lock(&journal->j_list_lock); > > diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h > > index d56128df2aff..24efe88eda1b 100644 > > --- a/include/linux/jbd2.h > > +++ b/include/linux/jbd2.h > > @@ -628,7 +628,8 @@ struct transaction_s > > struct journal_head *t_shadow_list; > > > > /* > > - * List of inodes whose data we've modified in data=ordered mode. > > + * List of inodes whose data we've modified in data=ordered mode > > + * or whose pages we should write-protect in data=journaled mode. > > I'd rather change the comment to generic "List of inodes associated with > the transaction. E.g. ext4 uses this to track inodes in data=ordered and > data=journal mode that need special handling on transaction commit.". > > > * [j_list_lock] > > */ > > struct list_head t_inode_list; > > @@ -1110,6 +1111,24 @@ struct journal_s > > void (*j_commit_callback)(journal_t *, > > transaction_t *); > > > > + /** > > + * @j_submit_inode_data_buffers: > > + * > > + * This function is called before flushing metadata buffers. > > + * This overrides the default behavior (writeout data buffers.) > > + */ > > I'd change the comment to: > * This function is called for all inodes associated with the > * committing transaction marked with JI_WRITE_DATA flag before we > * start to write out the transaction to the journal. > > > + int (*j_submit_inode_data_buffers) > > + (struct jbd2_inode *); > > + > > + /** > > + * @j_finish_inode_data_buffers: > > + * > > + * This function is called after flushing metadata buffers. > > + * This overrides the default behavior (wait writeout.) > > + */ > > And here: > * This function is called for all inodes associated with the > * committing transaction marked with JI_WAIT_DATA flag after we > * we have written the transaction to the journal but before we > * write out the commit block. > > > > + int (*j_finish_inode_data_buffers) > > + (struct jbd2_inode *); > > + > > /* > > * Journal statistics > > */ > > Honza > -- > Jan Kara <jack@xxxxxxxx> > SUSE Labs, CR -- Mauricio Faria de Oliveira