Re: ZBC/FLEX FIO addition ideas

Phillip Chen <phillip.a.chen@xxxxxxxxxxx> · Fri, 23 Mar 2018 11:30:48 -0600

Hello Bart,
I've been trying out the ZBC changes with a few drives and workloads,
and the only limitation I've found so far is that random write
workloads with a queue depth > 1 seem to write beyond the write
pointer. This isn't particularly surprising, but I wanted to double
check whether this is an expected limitation and, if so, whether there
are plans to support queued random writes in the future. Other than that, the
changes seem to be working just as advertised. I think your zone
management system will make writing a zone random workload fairly
straightforward. Thank you for the update!
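
To make the queue depth question concrete, here is a minimal Python sketch
of one way queued writes could be kept at the write pointer: the host
reserves each offset at submission time instead of waiting for the previous
write to complete. This is a toy model, not code from the patch. The Zone
class and its reserve()/complete() methods are hypothetical names, and the
sketch assumes that queued writes reach the drive in submission order.

```python
class Zone:
    """Toy model of one sequential-write-required zone (sizes in bytes)."""

    def __init__(self, start, size):
        self.start = start
        self.size = size
        self.wp = start          # device write pointer, advances on completion
        self.reserved = start    # host-side pointer, advances on submission

    def reserve(self, length):
        """Return the offset for the next queued write, or None if it does not fit."""
        if self.reserved + length > self.start + self.size:
            return None
        offset = self.reserved
        self.reserved += length
        return offset

    def complete(self, offset, length):
        """Model the device: a write is accepted only exactly at the write pointer."""
        if offset != self.wp:
            raise ValueError(
                "unaligned write at %d, write pointer is %d" % (offset, self.wp))
        self.wp += length


zone = Zone(start=0, size=4 * 4096)
# Queue depth 3: three writes are submitted before any of them completes.
inflight = [zone.reserve(4096) for _ in range(3)]
for off in inflight:
    zone.complete(off, 4096)
```

If the offsets were taken from the device write pointer instead, all three
queued writes in this model would target the same offset, so only the first
would be accepted.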

Sitsofe, would you be interested in being sent a ZBC drive? If you
would like to give me an address, we can get one sent to you to help
with validation.
Let me know what you'd like,
Phillip

On Mon, Mar 19, 2018 at 8:21 PM, Bart Van Assche <Bart.VanAssche@xxxxxxx> wrote:
> On Sat, 2018-03-17 at 07:55 +0000, Sitsofe Wheeler wrote:
>> I don't have access to any ZBC devices but I looked over the patch.
>> This seems to be a blend of both of Phillip's suggestions and I
>> quite like it because it centralizes most of the logic within the
>> engine itself bar the offset fixing.
>
> Thanks for having taken a look :-)
>
>> Something that could help is
>> if ioengines were able to specify callbacks to say whether an offset
>> + size is valid or whether the ioengine wants to "fix" them. If it
>> says the offset is invalid we just quietly pretend we did the I/O,
>> reduce the amount of I/O we have to do on this loop and generate the
>> next offset + size. If it chooses to just fix it then we just use the
>> different values it fixed the I/O to (assuming it fits alignment,
>> minimum block sizes etc). How do people see fix mode interacting with
>> things like bssplit? Do we just ban bssplit when in this mode for now?
>>
>> Most engines could leave these unset but it would be perfect for this
>> one and it would stop the typical path having to depend on this
>> particular ioengine being configured in. If fix things up mode is set
>> then I suppose fio could then go on to ban things like lfsr, randommap
>> and random_distribution. I guess one snag with re-rolling in "valid"
>> mode is that you could end up re-rolling forever. I'm no expert but
>> I'm hoping doing another if (engine defines block_checking) on every
>> I/O won't have too much of a hit.
>>
>> Perhaps we just start with "fix" mode see how that goes and move on
>> from there? Something else that could get hairy are verify jobs that
>> are separate from their write job because the offsets generated during
>> a subsequent verify will not necessarily be invalid at the same points
>> or changed the same way. I suppose inline verify should work assuming
>> write regions haven't been reset due to becoming full.
>
> It's not clear to me why you would like to add such logic in the I/O
> engines? Shouldn't such logic rather be added in a new I/O profile such that
> users who want to run ZBC tests have the freedom of choosing an I/O engine?
>
> I'm not sure that a "fix" mode would work best. Phillip Chen has
> mentioned that we need a mode in which the user can control the number of
> open zones. If a random block were generated first and the open zone
> limit were applied after that offset had been generated then most of
> the generated random blocks would have to be discarded since the number
> of open zones is typically much smaller than the number of zones
> supported by a disk. I'm currently looking at how to let I/O profiles
> influence the behavior of get_next_block() in such a way that the number
> of open zones can be limited by an I/O profile without having to reimplement
> fio core logic in an I/O profile.
>
> Regarding implementing write verify: the code that I posted does not yet
> support verifying written data. However, it's not that hard to modify that
> code such that verifying written data becomes possible. My proposal is to
> use the following approach:
> - Before writing starts, reset the zones that fio will write to, so that
>   no zone has to be reset in the middle of the write phase, which would
>   result in data loss.
> - In a single zone, write at the write pointer. This will result in
>   sequential writes per zone from the start to the end of the zone.
> - When verifying data, read the zone from start to end such that read and
>   write offsets match.
>
> Thanks,
>
> Bart.
>
>
>
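
For reference, the three verify steps quoted above can be sketched in
Python against an in-memory stand-in for a zoned drive. This is only an
illustration under stated assumptions, not fio code: ZonedDisk, pattern()
and write_then_verify() are hypothetical names, the block size is assumed
to divide the zone size evenly, and zones are filled one after another
although nothing in the proposal requires that.

```python
class ZonedDisk:
    """In-memory stand-in for a zoned device; all names here are hypothetical."""

    def __init__(self, nr_zones, zone_size):
        self.zone_size = zone_size
        self.data = bytearray(nr_zones * zone_size)
        self.wp = [z * zone_size for z in range(nr_zones)]

    def reset_zone(self, z):
        start = z * self.zone_size
        self.data[start:start + self.zone_size] = bytes(self.zone_size)
        self.wp[z] = start

    def write_at_wp(self, z, buf):
        off = self.wp[z]
        if off + len(buf) > (z + 1) * self.zone_size:
            raise ValueError("zone %d is full" % z)
        self.data[off:off + len(buf)] = buf
        self.wp[z] += len(buf)
        return off

    def read(self, off, length):
        return bytes(self.data[off:off + length])


def pattern(off, length):
    """Deterministic payload derived from the offset, so verify can regenerate it."""
    return bytes((off + i) & 0xFF for i in range(length))


def write_then_verify(disk, zones, bs):
    # Step 1: reset every target zone up front so no reset happens mid-write.
    for z in zones:
        disk.reset_zone(z)
    # Step 2: fill each zone sequentially at its write pointer.
    for z in zones:
        for _ in range(disk.zone_size // bs):
            off = disk.wp[z]
            disk.write_at_wp(z, pattern(off, bs))
    # Step 3: read each zone from start to end and compare against the pattern.
    for z in zones:
        start = z * disk.zone_size
        for off in range(start, start + disk.zone_size, bs):
            if disk.read(off, bs) != pattern(off, bs):
                return False
    return True
```

Because the payload depends only on the offset, the read phase needs no
record of the write order, which is the property the start-to-end read in
the third step relies on.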