Re: [PATCH 3/7] zbd: introduce per-device "max_open_zones" limit

Alexey,

On 2020/05/02 3:52, Alexey Dobriyan wrote:
> On Fri, May 01, 2020 at 01:34:32AM +0000, Damien Le Moal wrote:
>> On 2020/04/30 21:41, Alexey Dobriyan wrote:
>>> It is not possible to maintain an equal per-thread iodepth. The way the code
>>> is written, "max_open_zones" acts as a global limit, and one thread
>>> opens all "max_open_zones" zones for itself while the others starve for
>>> available zones and _exit_ prematurely.
>>>
>>> This config is now guaranteed to generate an equal number of zone resets/IOs:
>>> each thread generates an identical pattern and doesn't intersect with other
>>> threads:
>>>
>>> 	zonemode=zbd
>>> 	zonesize=...
>>> 	rw=write
>>>
>>> 	numjobs=N
>>> 	offset_increment=M*zonesize
>>>
>>> 	[j]
>>> 	size=M*zonesize
>>>
>>> The patch introduces "global_max_open_zones", which is a per-device config
>>> option. "max_open_zones" becomes a per-thread limit. Both limits are
>>> checked on every zone open so one thread can't starve the others.
>>
>> It makes sense. Nice one.
>>
>> But the change as is will break existing test scripts (e.g. lots of SMR drives
>> are being tested with this).
> 
> It won't break single-threaded ones, that's for sure.

Yes, but things like:

fio --ioengine=psync --rw=randwrite --max_open_zones=128 --numjobs=32

will change behavior. With your change, instead of 32 threads writing randomly
to a total of 128 zones, you will get 32 threads each writing randomly to 128
zones, for a total of 32*128 = 4096 zones.

SMR drives and zonemode=zbd have been around for a while now, and there are a lot
of fio scripts deployed in production for system validation/tests, as well as in
drive development for testing. If we can avoid breaking those, we absolutely must.

My proposal to keep max_open_zones as the per-device maximum and introduce a
thread_max_open_zones limit keeps backward compatibility with existing scripts
while still allowing your change.
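
To make this concrete, a job file could look something like the sketch below
(thread_max_open_zones is only a placeholder for whatever name we settle on,
and /dev/sdX stands in for the drive under test):

[global]
zonemode=zbd
ioengine=psync
rw=randwrite
numjobs=32
; per-device limit, same meaning as today
max_open_zones=128
; proposed per-job cap (placeholder name)
thread_max_open_zones=4

[dev]
filename=/dev/sdX

Each of the 32 jobs would then open at most 4 zones on its own, while all jobs
together would never exceed the 128 zone per-device limit.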

> 
>> I think we can avoid this breakage simply: leave the max_open_zones option
>> definition as-is and add a "job_max_open_zones" or
>> "thread_max_open_zones" option (no strong feelings about the name here, as long
>> as it is explicit) to define the per-thread maximum number of open zones. This
>> new option could actually default to max_open_zones / numjobs if that is not 0.
> 
> I'd argue that such scripts are broken.

See the above example. It is a perfectly valid script, not broken at all.
Varying max_open_zones allows measuring how a drive's performance varies with
the number of implicitly open zones. It is a common test that I have seen a lot
in drive development and production. There are likely other valid uses too.
Assuming that all current uses of max_open_zones with multi-job workloads are
broken would be a mistake.
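
For reference, the kind of sweep I have in mind looks roughly like this (a
sketch only; the device name, run time and value list are all made up):

# Measure write performance as a function of the number of implicitly open zones.
for moz in 8 16 32 64 128; do
    fio --name=moz_sweep --filename=/dev/sdX --direct=1 \
        --zonemode=zbd --ioengine=psync --rw=randwrite --numjobs=32 \
        --max_open_zones=$moz --time_based --runtime=60 \
        --output-format=json --output=moz_${moz}.json
done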

> 
> If a sustained numjobs*max_open_zones QD is desired, then it is not
> guaranteed, as threads will simply exit at indeterminate times,
> which breaks LBA space coverage as well.
> 
> Right now, numjobs= + max_open_zones= means "max_open_zones" zones open by at
> most "numjobs" threads.

I understand that. And we should keep it that way for the reasons mentioned
above. Adding the thread_max_open_zones option on top of your change would
nicely enhance it. E.g.

fio --ioengine=libaio --iodepth=8 --rw=randwrite --thread_max_open_zones=1 --numjobs=8

will result in 8 threads, each writing to a single randomly chosen zone at QD=8.
And that is the same as your proposed:

fio --ioengine=libaio --iodepth=8 --rw=randwrite --max_open_zones=1 --numjobs=8

but without breaking the existing meaning of max_open_zones as a per drive/file
limit.

I totally agree with your change. It is a nice one. But let's preserve the
meaning of max_open_zones as the per-device limit. No need to change it.

Best regards.

-- 
Damien Le Moal
Western Digital Research