On Mon, Jan 04, 2021 at 09:57:48PM +0200, Avi Kivity wrote:
> > I don't have a strong opinion on it. A complex userland application can
> > do a bit better job managing queue depth etc, but otherwise I suspect
> > doing the IO from kernel will win by a small bit. And the queue-depth
> > issue presumably would be relevant for write-zeroes as well, making me
> > lean towards just using the fallback.
> >
>
> The new flag will avoid requiring DMA to transfer the entire file size, and
> perhaps can be implemented in the device by just adjusting metadata. So
> there is potential for the new flag to be much more efficient.

We already support a WRITE_ZEROES operation, which many (but not all)
NVMe devices and some SCSI devices implement.  The blkdev_issue_zeroout
helper can use it where available, or fall back to writing actual zeroes.

XFS already has an XFS_IOC_ALLOCSP64 ioctl that is defined to actually
allocate written extents.  It does not currently use
blkdev_issue_zeroout, but could be changed pretty trivially to do so.

> But note it will need to be plumbed down to md and dm to be generally
> useful.

DM and MD already support WRITE_ZEROES, at least for the usual targets.
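
To illustrate the "offload if you can, otherwise write actual zeroes" shape described above, here is a minimal userspace sketch.  It is only an analogy and not the kernel code: zero_range() and zero_range_by_write() are hypothetical names, and it uses fallocate(2) with FALLOC_FL_ZERO_RANGE as the stand-in for the offloaded path.  Note that on a regular file this flag normally leaves unwritten extents behind, so it is not equivalent to what XFS_IOC_ALLOCSP64 is defined to do.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: zero [off, off + len) by writing actual zeroes. */
static int zero_range_by_write(int fd, off_t off, off_t len)
{
	static const char zeroes[64 * 1024];	/* static storage is zero-filled */

	while (len > 0) {
		size_t chunk = len < (off_t)sizeof(zeroes) ?
			       (size_t)len : sizeof(zeroes);
		ssize_t n = pwrite(fd, zeroes, chunk, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry the same chunk */
			return -1;
		}
		if (n == 0) {
			errno = EIO;		/* no progress, give up */
			return -1;
		}
		off += n;			/* short writes just advance */
		len -= n;
	}
	return 0;
}

/*
 * Hypothetical helper: try the offloaded path first and fall back to
 * plain writes only when the filesystem reports it is unsupported.
 */
static int zero_range(int fd, off_t off, off_t len)
{
	assert(off >= 0 && len > 0);

	if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
		return 0;
	if (errno != EOPNOTSUPP && errno != ENOSYS)
		return -1;			/* real error, do not mask it */
	return zero_range_by_write(fd, off, len);
}
```

A caller would simply do zero_range(fd, offset, length) and check for a negative return.  Only "not supported" errors trigger the fallback, so failures such as ENOSPC or EBADF are still reported to the caller.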