On Wed 28-10-15 17:19:02, Eunryoung Lim wrote: > Hello, > > I am Eunryoung Lim and have a question about EXT4 journaling source. > I will be appreciated for your any comments. > > * ordered journaling mode, disabled async_jounral_commit > On my understating, below steps is the simple process of journal > transaction commit (jbd2_journal_commit_transaction() ). > Step 1,3,4 are optional. > > 1. Write journal revoke block (if it required). > 2. Write journal descriptor + journal buffer’s page. > 3. Submit all the data buffers of inode associated with the > transaction (journal_submit_data_buffers() ) (if it required). > 4. Wait for data submitted for writeback page > (journal_finish_inode_data_buffers() ) (if it required). > 5. Write journal commit mark. > > The problem is that JBD2 issue unnecessary journal commit mark I/O. > For example, JBD2 write journal commit mark (step 5), even though > transaction (running ->commit state) has no journal buffer yet. > The I/O for journal descriptor + journal buffer’s page (step 2) can be > skipped because JBD2 check whether the transaction has journal buffer > or not. > void jbd2_journal_commit_transaction(journal_t *journal) > { > …. > /* start to commit transaction */ > while (commit_transaction->t_buffers) { > /* Find the next buffer to be journaled... */ > jh = commit_transaction->t_buffers; > .… > } > > I think, if the journal buffers do not exist in the transaction, JBD2 > can skip step 5 to reduce amount of I/O. > Thus, the code can be modified as follows: (this is simple way, we can > optimize more) > (current) > err = journal_submit_commit_record(journal, commit_transaction, > (after) > if ( IS_ERR_OR_NULL(commit_transaction->t_buffers) ) { > err = journal_submit_commit_record(journal, commit_transaction, > } > or we can simply return the function. > (add)if ( IS_ERR_OR_NULL(commit_transaction->t_buffers) ) > return ; > > /* start to commit transaction */ > while (commit_transaction->t_buffers) { > .... > > Note. Android uses SQLite, and it makes sure that directory entry > contains newly created journal file by calling fsync() to the > directory. > (However, SQLite call fsync() to directory even though the file is not created). > When fsync() is called to directory, > EXT4 calls ext4_sync_file() -> ext4_force_commit() -> > ext4_journal_force_commit() -> jbd2_journal_force_commit() -> ... and > finally, jbd2_journal_commit_transaction(); > If journal transaction is empty, JBD2 only write journal commit mark > and it causes lots of I/Os. However note that __jbd2_journal_force_commit() doesn't do anything if there isn't running transaction. Are you sure the commit contains only the commit block? It is possible but someone would have to call jbd2_journal_start() and then don't dirty any buffers... Frankly, skipping empty commit seems like a rather special case and solving the problem at the wrong level. I think we rather shouldn't force the transaction commit if the directory is unchanged. We can use i_sync_tid in a similar way as we do for regular files, we just have to add that tracking into directory code. That would save us much more commits. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html