Hello! On Thu 10-09-20 11:08:06, Ye Bin wrote: > As we test disk offline/online with running fsstress, we find fsstress > process is keeping running state. > kworker/u32:3-262 [004] ...1 140.787471: ext4_mb_discard_preallocations: dev 8,32 needed 114 > .... > kworker/u32:3-262 [004] ...1 140.787471: ext4_mb_discard_preallocations: dev 8,32 needed 114 > > ext4_mb_new_blocks > repeat: > ext4_mb_discard_preallocations_should_retry(sb, ac, &seq) > freed = ext4_mb_discard_preallocations > ext4_mb_discard_group_preallocations > this_cpu_inc(discard_pa_seq); > ---> freed == 0 > seq_retry = ext4_get_discard_pa_seq_sum > for_each_possible_cpu(__cpu) > __seq += per_cpu(discard_pa_seq, __cpu); > if (seq_retry != *seq) { > *seq = seq_retry; > ret = true; > } > > As we see seq_retry is sum of discard_pa_seq every cpu, if > ext4_mb_discard_group_preallocations return zero discard_pa_seq in this > cpu maybe increase one, so condition "seq_retry != *seq" have always > been met. > To Fix this problem, ext4_get_discard_pa_seq_sum function couldn't add > own's cpu "discard_pa_seq" value. > > Fixes: 07b5b8e1ac40 ("ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling") > Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx> Thanks for the patch! I agree with your analysis but avoiding current CPU in the sum is wrong. After all there's nothing which prevents the scheduler from rescheduling your task among different CPUs while it is searching for preallocations to discard. The correct fix is IMO to change ext4_mb_discard_group_preallocations() so that it increments discard_pa_seq only when it found preallocation to discard (and is thus guaranteed to return value > 0). Honza > --- > fs/ext4/mballoc.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > index 132c118d12e1..f386fe62727d 100644 > --- a/fs/ext4/mballoc.c > +++ b/fs/ext4/mballoc.c > @@ -372,11 +372,13 @@ static void ext4_mb_new_preallocation(struct ext4_allocation_context *ac); > static DEFINE_PER_CPU(u64, discard_pa_seq); > static inline u64 ext4_get_discard_pa_seq_sum(void) > { > - int __cpu; > + int __cpu, this_cpu; > u64 __seq = 0; > > + this_cpu = smp_processor_id(); > for_each_possible_cpu(__cpu) > - __seq += per_cpu(discard_pa_seq, __cpu); > + if (this_cpu != __cpu) > + __seq += per_cpu(discard_pa_seq, __cpu); > return __seq; > } > > -- > 2.25.4 > -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR