On 2021/03/25 11:15, Shin'ichiro Kawasaki wrote: > When fio repeats same workload on zoned block devices, zbd_file_reset() > is called for each repetition. This function resets target zones when > one of two conditions are met: 1) the write pointer of the zone has > offset from the device start unaligned to block size, or 2) the workload > is verify and verification is not in process. When the workload runs > with block size not a divisor of the zone size, the offsets of write > pointers from device start (not from zone start) become unaligned to > block size, then zbd_file_reset() resets target zones. This zone reset > happens even when the asynchronous IOs are in-flight and causes > unexpected IO results. Especially if write requests are in-flight, they > fail with unaligned write command error. A single thread may do both the > zone reset and the write request submit, recursive zone locks can not > prevent the zone reset during the writes. > > The write pointer check for block size alignment is not working as > intended. It should have checked offset not from device start but from > zone start. Having said that, regardless of this write pointer check > correctness, the zone reset is not required since the zones are reset in > zbd_adjust_block() anyway when remainder of the zone between write > pointer and zone end is smaller than block size. > > To avoid the zone reset during asynchronous IOs, do not reset zones in > zbd_file_reset() when the write pointer offset from the device start is > unaligned to block size. Modify zbd_file_reset() to call the helper > function zbd_reset_zones() only when the workload is verify and > verification is not in process. The function zbd_reset_zones() had an > argument 'all_zones' to inform that the zones should be reset when its > write pointer is unaligned to block size. This argument is not required. > Remove it and simplify the function accordingly. > > The zone reset for verify workloads is still required. It does not > conflict with asynchronous IOs, since fio waits for IO completion at > verification end, then IOs are not in-flight when zbd_file_reset() is > called for repetition after verification. > > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> > --- > zbd.c | 23 +++++++---------------- > 1 file changed, 7 insertions(+), 16 deletions(-) > > diff --git a/zbd.c b/zbd.c > index d16b890f..eed796b3 100644 > --- a/zbd.c > +++ b/zbd.c > @@ -842,16 +842,13 @@ static void zbd_close_zone(struct thread_data *td, const struct fio_file *f, > * @f: fio file for which to reset zones > * @zb: first zone to reset. > * @ze: first zone not to reset. > - * @all_zones: whether to reset all zones or only those zones for which the > - * write pointer is not a multiple of td->o.min_bs[DDIR_WRITE]. > */ > static int zbd_reset_zones(struct thread_data *td, struct fio_file *f, > struct fio_zone_info *const zb, > - struct fio_zone_info *const ze, bool all_zones) > + struct fio_zone_info *const ze) > { > struct fio_zone_info *z; > const uint32_t min_bs = td->o.min_bs[DDIR_WRITE]; > - bool reset_wp; > int res = 0; > > assert(min_bs); > @@ -864,16 +861,10 @@ static int zbd_reset_zones(struct thread_data *td, struct fio_file *f, > if (!z->has_wp) > continue; > zone_lock(td, f, z); > - if (all_zones) { > - pthread_mutex_lock(&f->zbd_info->mutex); > - zbd_close_zone(td, f, nz); > - pthread_mutex_unlock(&f->zbd_info->mutex); > - > - reset_wp = z->wp != z->start; > - } else { > - reset_wp = z->wp % min_bs != 0; > - } > - if (reset_wp) { > + pthread_mutex_lock(&f->zbd_info->mutex); > + zbd_close_zone(td, f, nz); > + pthread_mutex_unlock(&f->zbd_info->mutex);> + if (z->wp != z->start) { > dprint(FD_ZBD, "%s: resetting zone %u\n", > f->file_name, zbd_zone_nr(f, z)); > if (zbd_reset_zone(td, f, z) < 0) > @@ -996,8 +987,8 @@ void zbd_file_reset(struct thread_data *td, struct fio_file *f) > * writing any data to avoid that a zone reset has to be issued while > * writing data, which causes data loss. > */ > - zbd_reset_zones(td, f, zb, ze, td->o.verify != VERIFY_NONE && > - td->runstate != TD_VERIFYING); > + if (td->o.verify != VERIFY_NONE && td->runstate != TD_VERIFYING) > + zbd_reset_zones(td, f, zb, ze); > zbd_reset_write_cnt(td, f); > } > > Looks good. Reviewed-by: Damien Le Moal <damien.lemoal@xxxxxxx> -- Damien Le Moal Western Digital Research