Re: [PATCH] jbd2: wake up journal waiters in FIFO order, not LIFO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Ritesh,

This was hit on the Lustre OSS node when we have ton’s of short write with sync/(journal commit) in parallel.
Each write was done from own thread (like 1k-2k threads in parallel).
It caused a situation when only few/some threads make a wakeup and enter to the transaction until it will be T_LOCKED.
In our’s observation all handles from head was waked and it’s handles added recently, while old handles still in list and
It caused a soft lockup messages on console.

Alex


> On 8 Sep 2022, at 08:46, Ritesh Harjani (IBM) <ritesh.list@xxxxxxxxx> wrote:
> 
> On 22/09/07 07:59PM, Alexey Lyashkov wrote:
>> From: Andrew Perepechko <anserper@xxxxx>
>> 
>> LIFO wakeup order is unfair and sometimes leads to a journal
>> user not being able to get a journal handle for hundreds of
>> transactions in a row.
>> 
>> FIFO wakeup can make things more fair.
> 
> prepare_to_wait() will always add the task to the head of the list.
> While prepare_to_wait_exclusive() will add the task to the tail since all of the
> exclusive tasks are added to the tail.
> wake_up() function will wake up all non-exclusive and single exclusive task 
> v/s
> wake_up_all() function will wake up all tasks irrespective.
> 
> So your change does makes the ordering to FIFO, in which the task which came in 
> first will be woken up first. 
> 
> Although I was wondering about 2 things - 
> 1. In what scenario this was observed to become a problem/bottleneck for you?
> Could you kindly give more details of your problem?
> 
> 2. What about start_this_handle() function where we call wait_event() 
> for j_barrier_count to be 0? I guess that doesn't happen often.
> 
> -ritesh
> 
> 
>> 
>> Signed-off-by: Alexey Lyashkov <alexey.lyashkov@xxxxxxxxx>
>> ---
>> fs/jbd2/commit.c      | 2 +-
>> fs/jbd2/transaction.c | 6 +++---
>> 2 files changed, 4 insertions(+), 4 deletions(-)
>> 
>> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
>> index b2b2bc9b88d9..ec2b55879e3a 100644
>> --- a/fs/jbd2/commit.c
>> +++ b/fs/jbd2/commit.c
>> @@ -570,7 +570,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>> 	journal->j_running_transaction = NULL;
>> 	start_time = ktime_get();
>> 	commit_transaction->t_log_start = journal->j_head;
>> -	wake_up(&journal->j_wait_transaction_locked);
>> +	wake_up_all(&journal->j_wait_transaction_locked);
>> 	write_unlock(&journal->j_state_lock);
>> 
>> 	jbd2_debug(3, "JBD2: commit phase 2a\n");
>> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
>> index e1be93ccd81c..6a404ac1c178 100644
>> --- a/fs/jbd2/transaction.c
>> +++ b/fs/jbd2/transaction.c
>> @@ -168,7 +168,7 @@ static void wait_transaction_locked(journal_t *journal)
>> 	int need_to_start;
>> 	tid_t tid = journal->j_running_transaction->t_tid;
>> 
>> -	prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
>> +	prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
>> 			TASK_UNINTERRUPTIBLE);
>> 	need_to_start = !tid_geq(journal->j_commit_request, tid);
>> 	read_unlock(&journal->j_state_lock);
>> @@ -194,7 +194,7 @@ static void wait_transaction_switching(journal_t *journal)
>> 		read_unlock(&journal->j_state_lock);
>> 		return;
>> 	}
>> -	prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
>> +	prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
>> 			TASK_UNINTERRUPTIBLE);
>> 	read_unlock(&journal->j_state_lock);
>> 	/*
>> @@ -920,7 +920,7 @@ void jbd2_journal_unlock_updates (journal_t *journal)
>> 	write_lock(&journal->j_state_lock);
>> 	--journal->j_barrier_count;
>> 	write_unlock(&journal->j_state_lock);
>> -	wake_up(&journal->j_wait_transaction_locked);
>> +	wake_up_all(&journal->j_wait_transaction_locked);
>> }
>> 
>> static void warn_dirty_buffer(struct buffer_head *bh)
>> -- 
>> 2.31.1
>> 





[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux