Yes, the bits around line 845 show the min() being used. It seems get_start_offset() is used for the first IO and for subsequent IOs, which makes things difficult.

I believe the correct fix here would be to set the ba specific to the IO type. The ba parameter allows separate read/write/trim alignments, so if io_u->ddir == DDIR_READ we could just align to o->ba[DDIR_READ]. But I don't see how we would access io_u at this point; we don't yet know the potential io_u, and it could be a mixed read/write/trim workload. Lacking that, brute force would suggest:

    if (fio_option_is_set(o, ba)) {
        align_bs = (unsigned long long) o->ba[DDIR_READ];
        if (o->ba[DDIR_READ] != o->ba[DDIR_WRITE])
            align_bs = (unsigned long long) o->ba[DDIR_READ] * (unsigned long long) o->ba[DDIR_WRITE];
        if (align_bs != (unsigned long long) o->ba[DDIR_TRIM])
            align_bs = align_bs * (unsigned long long) o->ba[DDIR_TRIM];
    }

But I see another problem in that o->ba[DDIR_READ] is not set for sequential workloads. In fixup_options() we have:

    if (!o->ba[DDIR_READ] || !td_random(td))
        o->ba[DDIR_READ] = o->min_bs[DDIR_READ];
    if (!o->ba[DDIR_WRITE] || !td_random(td))
        o->ba[DDIR_WRITE] = o->min_bs[DDIR_WRITE];
    if (!o->ba[DDIR_TRIM] || !td_random(td))
        o->ba[DDIR_TRIM] = o->min_bs[DDIR_TRIM];

I don't follow that code, other than that if the workload is sequential, ba is set to min_bs?

If we remove that code and use the change above, we can get the starting LBA to be aligned to the ba in the case of:

    # fio --name=test_job --ioengine=libaio --direct=1 --rw=read --iodepth=1 --size=100% --bs=4k --filename=/dev/nvme1n1 --number_ios=8 --offset=50% --log_offset=1 --write_iops_log=test_job --ba=8k

    # cat test_job_iops.1.log
    0, 1, 0, 4096, 1600315899904

It seems to be a hack, so I didn't create a patch for it. I would like to better understand what I'm missing before breaking something. Thanks.

Regards,
Jeff

-----Original Message-----
From: Sitsofe Wheeler [mailto:sitsofe@xxxxxxxxx]
Sent: Friday, October 20, 2017 2:01 PM
To: Jeff Furlong <jeff.furlong@xxxxxxx>
Cc: fio@xxxxxxxxxxxxxxx
Subject: Re: fio offset with ba

Hi,

On 20 October 2017 at 20:08, Jeff Furlong <jeff.furlong@xxxxxxx> wrote:
>
> I don't quite follow the logic in the calculate offset function. The offset parameter recently allows a percentage. Suppose we set it to 50% and want to block align the IO's starting at 50% of device capacity, then block aligned to 8KB.
>
> # fio -version
> fio-3.1-60-g71aa
>
> # blockdev --getsize64 /dev/nvme1n1
> 3200631791616
>
> # fio --name=test_job --ioengine=libaio --direct=1 --rw=read
> --iodepth=1 --size=100% --bs=4k --filename=/dev/nvme1n1 --runtime=1s
> --offset=50% --log_offset=1 --write_iops_log=test_job --ba=8k
>
> # cat test_job_iops.1.log
> 0, 1, 0, 4096, 1600315895808
> 0, 1, 0, 4096, 1600315899904
> 0, 1, 0, 4096, 1600315904000
> 0, 1, 0, 4096, 1600315908096
>
> So we can see the device has 3200631791616 bytes, 50% of which is
> 1600315895808 bytes, which happens to be 4KB aligned, but not 8KB
> aligned. Even though we set the --ba=8k parameter, the offset LBA as
> logged in the iops.1.log shows

Hmm I see the same problem with this job:

fio --name=test_job --ioengine=null --rw=read --iodepth=1 --size=3200631791616 --bs=4k --number_ios=1 --offset=50% --ba=8k --debug=io
[...]
io 15013 fill_io_u: io_u 0x236ad80: off=1600315895808/len=4096/ddir=0
io 15013 /test_job.0.0
io 15013

I think your guess about only impacting random I/O is probably right because

fio --name=test_job --randrepeat=0 --ioengine=null --rw=randread --iodepth=1 --size=3200631791616 --bs=4k --number_ios=1 --offset=50% --ba=8k --debug=io

picks offsets that are 8k aligned.

> 4KB alignment. Does --ba work for all IO's or only random IO's? If all, does get_start_offset() control the raw offset value?
> I don't see why the min(ba, bs) is used in the calculation, but perhaps I am missing something. Thanks.

Where is min(ba, bs) done - do you mean the bits around
https://github.com/axboe/fio/commit/89978a6b26f81bdbd63228e2e2a86f604ee46c56#diff-4abbf037246dd2e450dc3f6a2ac77180R845?

I agree you probably want to take the maximum of all the block alignments but what if one of the smaller ones is not a multiple of the largest one? Would you like to propose a patch?

-- 
Sitsofe | http://sucs.org/~sits/
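
On the question of alignments that are not multiples of each other: taking the maximum is not enough in that case, and multiplying the values together gives a common multiple that can be far larger than needed. One option is the least common multiple of the read, write and trim alignments. The following is only a rough sketch in plain C, not a patch against fio; the helper names gcd_ull, lcm_ull and align_up are hypothetical and are not fio functions, and it assumes every alignment is non-zero.

```c
/* Hypothetical helpers for illustration only; these are not fio functions. */

/* Greatest common divisor via Euclid's algorithm. */
static unsigned long long gcd_ull(unsigned long long a, unsigned long long b)
{
	while (b) {
		unsigned long long t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Least common multiple: the smallest value that is a multiple of both
 * a and b. Assumes a and b are non-zero. Dividing before multiplying
 * keeps the intermediate result small.
 */
static unsigned long long lcm_ull(unsigned long long a, unsigned long long b)
{
	return a / gcd_ull(a, b) * b;
}

/*
 * Round offset up to the next multiple of align. Uses the modulo form so
 * that align does not have to be a power of two.
 */
static unsigned long long align_up(unsigned long long offset,
				   unsigned long long align)
{
	unsigned long long rem = offset % align;

	return rem ? offset + (align - rem) : offset;
}
```

With made-up alignments of 8k for reads, 12k for writes and 4k for trims, lcm_ull(lcm_ull(8192, 12288), 4096) is 24576, which satisfies all three. For the single --ba=8k case in this thread, align_up(1600315895808ULL, 8192) is 1600315899904, the first offset logged with the hack applied.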