On Wed, Apr 27, 2022 at 05:08:25AM +0000, Shinichiro Kawasaki wrote:
> The conditions to recreate the I/O timeout error are as follows:
>
> - Larger size of QEMU ZNS drive (10GB)
>   - I use QEMU ZNS drives with 1GB size for my test runs. With this
>     smaller size, the I/O timeout is not observed.
> - Issue the zone reset command for all zones (with the 'blkzone reset'
>   command) to the drive just after zbd/005 completes.
>   - The test case zbd/006 calls the zone reset command. It is enough to
>     repeat zbd/005 and the zone reset command to recreate the I/O timeout.
>   - When a 10 second sleep is added between the zbd/005 run and the zone
>     reset command, the I/O timeout is not observed.
> - The data write pattern of zbd/005 looks important. A simple dd command
>   to fill the device before 'blkzone reset' did not recreate the I/O
>   timeout.
>
> I dug into the QEMU code and found that it takes a long time to complete
> the zone reset command with the all-zones flag. It takes more than 30
> seconds, which appears to trigger the I/O timeout in the block layer.
> QEMU calls fallocate() with the punch-hole flag on the backend file for
> each zone, so that the data of each zone is zero cleared. Each
> fallocate() call is quick, but a 0.7 second delay between the calls was
> observed often. I guess some fsync or fdatasync operation is running and
> causing the delay.
>
> In other words, QEMU ZNS zone reset for all zones is slow, depending on
> the ZNS drive's size and status. A performance improvement of zone reset
> in QEMU is desired. I will look for a chance to work on it.

Awesome find Shinichiro!

  Luis