Re: [PATCH v4] ext4: fix fast commit inode enqueueing during a full journal commit

Luis Henriques <luis.henriques@xxxxxxxxx> · Thu, 11 Jul 2024 16:16:18 +0100

On Thu, Jul 11 2024, wangjianjian (C) wrote:

> On 2024/7/11 16:35, Luis Henriques (SUSE) wrote:
>> When a full journal commit is on-going, any fast commit has to be enqueued
>> into a different queue: FC_Q_STAGING instead of FC_Q_MAIN.  This enqueueing
>> is done only once, i.e. if an inode is already queued in a previous fast
>> commit entry it won't be enqueued again.  However, if a full commit starts
>> _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to
>> be done into FC_Q_STAGING.  And this is not being done in function
>> ext4_fc_track_template().
>> This patch fixes the issue by re-enqueuing an inode into the STAGING queue
>> during the fast commit clean-up callback if it has a tid (i_sync_tid)
>> greater than the one being handled.  The STAGING queue will then be spliced
>> back into MAIN.
>> This bug was found using fstest generic/047.  This test creates several 32k
>> bytes files, sync'ing each of them after it's creation, and then shutting
>> down the filesystem.  Some data may be loss in this operation; for example a
>> file may have it's size truncated to zero.
>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@xxxxxxxxx>
>> ---
>> Hi!
>> v4 of this patch enqueues the inode into STAGING *only* if the current tid
>> is non-zero.  It will be zero when doing an fc commit, and this would mean
>> to always re-enqueue the inode.  This fixes the regressions caught by Ted
>> in v3 with fstests generic/472 generic/496 generic/643.
>> Also, since 2nd patch of v3 has already been merged, I've rebased this patch
>> to be applied on top of it.
>>   fs/ext4/fast_commit.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>> diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
>> index 3926a05eceee..facbc8dbbaa2 100644
>> --- a/fs/ext4/fast_commit.c
>> +++ b/fs/ext4/fast_commit.c
>> @@ -1290,6 +1290,16 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid)
>>   				       EXT4_STATE_FC_COMMITTING);
>>   		if (tid_geq(tid, iter->i_sync_tid))
>>   			ext4_fc_reset_inode(&iter->vfs_inode);
>> +		} else if (tid) {
>> +			/*
>> +			 * If the tid is valid (i.e. non-zero) re-enqueue the
> one quick question about tid, if one disk is using long time and its tid   get
> wrapped to 0, is it a valid seq? I don't find code handling this situation.

Hmm... OK.  So, to answer to your question, the 'tid' is expected to wrap.
That's why we use:

	if (tid_geq(tid, iter->i_sync_tid))

instead of:

	if (tid >= iter->i_sync_tid)

(The second patch in v3 actually fixed a few places where the tid_*()
helpers weren't being used.)

But your question shows me that my patch is wrong as '0' may actually be a
valid 'tid' value.

Cheers,
-- 
Luís

>> +			 * inode into STAGING, which will then be splice back
>> +			 * into MAIN
>> +			 */
>> +			list_add_tail(&EXT4_I(&iter->vfs_inode)->i_fc_list,
>> +				      &sbi->s_fc_q[FC_Q_STAGING]);
>> +		}
>> +
>>   		/* Make sure EXT4_STATE_FC_COMMITTING bit is clear */
>>   		smp_mb();
>>   #if (BITS_PER_LONG < 64)
>> 
> -- 
> Regards
>