Re: [PATCH 0/5] zbd: drop 'sectors with data' accounting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Jan 30, 2023 / 10:33, Niklas Cassel wrote:
> On Mon, Jan 30, 2023 at 11:28:45AM +0900, Shin'ichiro Kawasaki wrote:
> > When zonemode=zbd is specified, fio does 'sectors with data' accounting to
> > record the total number of sectors that have been written on a zoned block
> > device. This accounting has two issues:
> > 
> > 1) The current implementation counts sectors with data per job, over the IO
> >    range of the job. So for a workload with multiple jobs with overlapping IO
> >    ranges, the number of sectors with data is overestimated as written sectors
> >    common to multiple jobs are counted multiple times.
> > 
> > 2) Counting the total number of written sectors requires taking the zone lock of
> >    all zones in a job IO range. For a multi-job workload with overlapping IO
> >    ranges, this often leads to significant zone lock contention, resulting in
> >    some jobs starting doing IOs only after other jobs have completed their work
> >    (which looks like an apparent deadlock on startup).
> > 
> > This series addresses the issues by dropping the 'sectors with data' accounting.
> > The accounting is used only for two features: 1) randrw first IO direction
> > decision and 2) zone_reset_threshold ratio check. The first two patches modify
> > these two features to not rely on the 'sectors with data' accounting. The third
> > patch drops the 'sectors with data' accounting. The last two patches adjust test
> > cases and an fio example script for the zone_reset_threshold.
> 
> Hello Shin'ichiro,

Hi Niklas, thanks for the comments. I have to admit that your points are valid.
I'm not sure know how often the option is used.

> 
> I understand when using multiple jobs with overlapping IO ranges,
> the number of sectors with data was overestimated.
> However, patch 5/5 contains a single job.
> 
> My expectation for this fio test file is thus that the
> zone_reset_threshold should stay the same.
> 
> I understand that you might not like the definition of the current option.
> But can you just change the definition so that it is unambiguous?
> 
> (Regardless if you change the definition to be per job, or per device,
> a test case with only one job should be able to keep the same value
> as before in the test case.)
> 
> 
> 
> I understand that you change the way that the accounting works, but
> I don't think that we should just totally change the definition of an
> existing option just because we think it should have been defined in
> another way.
> 
> Can't you:
> -Change the accounting
> -Clarify the definition of the option, but keep it like it is,
>  regardless if it causes zone lock contention or not.

In case we would keep the current definition (accounting per job), still we can
avoid the wrong accounting and the zone lock contention by checking write range
overlap. If the option is specified together with multiple jobs with overlapping
write ranges, fio can error out so that users can know that the option does not
work as expected. However, I am reluctant to take this approach since its use
cases are limited (no write range overlap) and it adds 8 bytes to the struct
fio_file only for the option.

> -Implement a new option that might be more optimal, and does not
>  cause zone lock contention?

This sounds good to me. The new option name can be zone_reset_threshold_per_dev.

> -Potentially deprecate or remove the old zone_reset_threshold option.

Yes, if we introduce the new option, I prefer to remove the old option.

> 
> 
> I guess my biggest problem is that users might use this option,
> and when upgrading fio, that option will silently behave totally
> different.
> 
> If we remove the option, the user will get an error that the option
> does no longer exist, which is good, because then they will understand
> that they need to update their fio job/config files.
> 
> TL;DR:
> Basically, I would prefer to rename the option instead of silently
> changing its meaning. (Because users might not notice.)
> 
> 
> Kind regards,
> Niklas

-- 
Shin'ichiro Kawasaki



[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux