Re: fio offset with ba

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hmm,

I see a few choices for trying to solve alignment of the offset and
forced alignment of sequential blocks:
* Introduce an "offset_align" option solely for aligning the "initial"
offset and stop trying to use blockalign in the offset. Document it
well to stop confusion with blockalign.
* Document blockalign as ONLY working for random I/O or
* Do blockalign for sequential I/O on a per I/O basis (potentially
creating gappy I/O). This is also the request from
https://github.com/axboe/fio/issues/341 . Technically this is has a
big overlap with zoned I/O and the existing "generated offset" used in
rw option but perhaps it's just different enough to be justified?

Thoughts?

On 21 October 2017 at 01:13, Jeff Furlong <jeff.furlong@xxxxxxx> wrote:
> Yes, the bits around line 845 show the min() being used.  It seems get_start_offset() is used for the first IO and subsequent IO's, making things difficult.  I believe the correct fix here would be to set the ba specific to the io type.  The ba parameter allows for read/write/trim alignments.  So if io_u->ddir==DDIR_READ then we could just align to o->ba[DDIR_READ].  But I don't see how we would access io_u at this point, we don't know of the potential io_u at this time?  It could be a mixed read/write/trim.
>
> Lacking that, brute force would suggest:
>
>                 if (fio_option_is_set(o, ba)) {
>                         align_bs = (unsigned long long) o->ba[DDIR_READ];
>                         if(o->ba[DDIR_READ] != o->ba[DDIR_WRITE])
>                                 align_bs = (unsigned long long) o->ba[DDIR_READ] * (unsigned long long) o->ba[DDIR_WRITE];
>                         if(align_bs != (unsigned long long) o->ba[DDIR_TRIM])
>                                 align_bs = align_bs * (unsigned long long) o->ba[DDIR_TRIM];
>
> But I see another problem in that o->ba[DDIR_READ] is not set for sequential workloads.  In fixup_options() we have:
>
>         if (!o->ba[DDIR_READ] || !td_random(td))
>                 o->ba[DDIR_READ] = o->min_bs[DDIR_READ];
>         if (!o->ba[DDIR_WRITE] || !td_random(td))
>                 o->ba[DDIR_WRITE] = o->min_bs[DDIR_WRITE];
>         if (!o->ba[DDIR_TRIM] || !td_random(td))
>                 o->ba[DDIR_TRIM] = o->min_bs[DDIR_TRIM];
>
> I don't follow that code, other than if sequential, set ba to min_bs?  If we remove that code and use above change, we can get the starting LBA to be aligned to the ba in the case of
>
> fio --name=test_job --ioengine=libaio --direct=1 --rw=read --iodepth=1 --size=100% --bs=4k --filename=/dev/nvme1n1 --number_ios=8 --offset=50% --log_offset=1 --write_iops_log=test_job --ba=8k
>
> # cat test_job_iops.1.log
> 0, 1, 0, 4096, 1600315899904
>
> It seems to be a hack, so I didn't create a patch for it.  Would like to better understand what I'm missing before breaking something.
>
> Thanks.
>
> Regards,
> Jeff
>
>
> -----Original Message-----
> From: Sitsofe Wheeler [mailto:sitsofe@xxxxxxxxx]
> Sent: Friday, October 20, 2017 2:01 PM
> To: Jeff Furlong <jeff.furlong@xxxxxxx>
> Cc: fio@xxxxxxxxxxxxxxx
> Subject: Re: fio offset with ba
>
> Hi,
>
> On 20 October 2017 at 20:08, Jeff Furlong <jeff.furlong@xxxxxxx> wrote:
>>
>> I don't quite follow the logic in the calculate offset function.  The offset parameter recently allows a percentage.  Suppose we set it to 50% and want to block align the IO's starting at 50% of device capacity, then block aligned to 8KB.
>>
>> # fio -version
>> fio-3.1-60-g71aa
>>
>> # blockdev --getsize64 /dev/nvme1n1
>> 3200631791616
>>
>> # fio --name=test_job --ioengine=libaio --direct=1 --rw=read
>> --iodepth=1 --size=100% --bs=4k --filename=/dev/nvme1n1 --runtime=1s
>> --offset=50% --log_offset=1 --write_iops_log=test_job --ba=8k
>>
>> # cat test_job_iops.1.log
>> 0, 1, 0, 4096, 1600315895808
>> 0, 1, 0, 4096, 1600315899904
>> 0, 1, 0, 4096, 1600315904000
>> 0, 1, 0, 4096, 1600315908096
>>
>> So we can see the device has 3200631791616 bytes, 50% of which is
>> 1600315895808 bytes, which happens to be 4KB aligned, but not 8KB
>> aligned.  Even though we set the --ba=8k parameter, the offset LBA as
>> logged in the iops.1.log shows
>
> Hmm I see the same problem with this job:
> fio --name=test_job --ioengine=null --rw=read --iodepth=1
> --size=3200631791616 --bs=4k --number_ios=1 --offset=50% --ba=8k --debug=io
>
> [...]
> io       15013 fill_io_u: io_u 0x236ad80:
> off=1600315895808/len=4096/ddir=0io       15013 /test_job.0.0io
> 15013
>
> I think your guess about only impacting random I/O is probably right because fio --name=test_job --randrepeat=0 --ioengine=null --rw=randread
> --iodepth=1 --size=3200631791616 --bs=4k --number_ios=1 --offset=50% --ba=8k --debug=io
>
> picks offsets that are 8k aligned.
>
>> 4KB alignment.  Does --ba work for all IO's or only random IO's?  If all, does get_start_offset() control the raw offset value?  I don't see why the min(ba, bs) is used in the calculation, but perhaps I am missing something.  Thanks.
>
> Where is min(ba, bs) done - do you mean the bits around https://github.com/axboe/fio/commit/89978a6b26f81bdbd63228e2e2a86f604ee46c56#diff-4abbf037246dd2e450dc3f6a2ac77180R845?
> I agree you probably want to take the maximum of all the block alignments but what if one of the smaller ones is not a multiple of the largest one?
>
> Would you like to propose a patch?
>
> --
> Sitsofe | http://sucs.org/~sits/

-- 
Sitsofe | http://sucs.org/~sits/
--
To unsubscribe from this list: send the line "unsubscribe fio" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel]     [Linux SCSI]     [Linux IDE]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux