Mike Snitzer <snitzer@xxxxxxxxxx> writes:

Mike,

> But while discussing this effort with Jeff Moyer he asked: shouldn't the
> zeroed blocks be provisioned? This is a fairly embarrassing question not
> to be able to answer in the moment. So I clearly need to learn what the
> overall intent of WRITE_ZEROES actually is.

The intent is to guarantee all future reads to these blocks will return
zeroes. Whether to allocate or deallocate or do a combination to achieve
that goal is up to the device.

> If it is meant as a replacement for WRITE_SAME (as hch switched dm-io
> over from WRITE_SAME with a payload of 0 to WRITE_ZEROES) and for the
> backing mechanism for blkdev_issue_zeroout() then I guess I have my
> answer.

Yes. I was hoping MD would use WRITE SAME to initialize data and parity
drives. That's why I went with SAME nomenclature rather than ZEROES
(which had just appeared in NVMe at that time).

Christoph's renaming is mostly to emphasize the purpose and the
semantics. People keep getting confused because both REQ_DISCARD and
REQ_WRITE_SAME use the WRITE SAME SCSI command. But the two uses are
completely orthogonal and governed by different heuristics in sd.c.

> Unless DM thinp can guarantee that the discarded blocks will
> always return zeroes (by zeroing before all partial block writes) my
> discard based dm-thinp implementation of WRITE_ZEROES is a complete
> throw-away (unless block zeroing is enabled.. which it never is because
> performance sucks with it). So if an upper-level of the IO stack
> (e.g. ext4) were to assume that a block will _definitely_ have zeroes
> then DM thinp would fall short.

You don't have a way to mark those blocks as being full of zeroes
without actually writing them?

Note that the fallback for a zeroout command is to do a regular write.
So if DM doesn't zero the blocks, the block layer is going to do it.

-- 
Martin K. Petersen	Oracle Linux Engineering
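
For illustration only, the fallback described above can be modelled in
user space. This is a rough sketch, not kernel code: struct toy_dev,
toy_write_zeroes(), toy_write() and toy_zeroout() are hypothetical names
made up for this example, and the real blkdev_issue_zeroout() path is
considerably more involved. The sketch assumes a device that either
offers a zeroing offload or does not; in both cases the caller ends up
with a range that reads back as zeroes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLOCK_SIZE 512
#define TOY_NR_BLOCKS  8

/* Hypothetical toy device; not a kernel structure. */
struct toy_dev {
	bool has_write_zeroes;	/* device offers a zeroing offload */
	unsigned char data[TOY_NR_BLOCKS][TOY_BLOCK_SIZE];
	unsigned int offloaded;	/* blocks zeroed by the offload */
	unsigned int written;	/* blocks zeroed by ordinary writes */
};

/* Fill the device with non-zero data and reset the counters. */
static void toy_init(struct toy_dev *d, bool has_write_zeroes)
{
	memset(d->data, 0xAA, sizeof(d->data));
	d->has_write_zeroes = has_write_zeroes;
	d->offloaded = 0;
	d->written = 0;
}

/* Offload path: how the range becomes zero is the device's business. */
static int toy_write_zeroes(struct toy_dev *d, size_t start, size_t n)
{
	if (!d->has_write_zeroes)
		return -1;	/* stands in for "operation not supported" */
	for (size_t i = start; i < start + n; i++) {
		memset(d->data[i], 0, TOY_BLOCK_SIZE);
		d->offloaded++;
	}
	return 0;
}

/* Ordinary write of one block. */
static int toy_write(struct toy_dev *d, size_t blk, const unsigned char *buf)
{
	memcpy(d->data[blk], buf, TOY_BLOCK_SIZE);
	d->written++;
	return 0;
}

/* Try the offload first; if the device refuses, write zero-filled blocks. */
static int toy_zeroout(struct toy_dev *d, size_t start, size_t n)
{
	static const unsigned char zero_block[TOY_BLOCK_SIZE];

	if (start > TOY_NR_BLOCKS || n > TOY_NR_BLOCKS - start)
		return -1;	/* range falls outside the device */
	if (toy_write_zeroes(d, start, n) == 0)
		return 0;
	for (size_t i = start; i < start + n; i++)
		toy_write(d, i, zero_block);
	return 0;
}

/* The guarantee the caller cares about: every byte in the range is zero. */
static bool toy_range_is_zero(const struct toy_dev *d, size_t start, size_t n)
{
	assert(start <= TOY_NR_BLOCKS && n <= TOY_NR_BLOCKS - start);
	for (size_t i = start; i < start + n; i++)
		for (size_t j = 0; j < TOY_BLOCK_SIZE; j++)
			if (d->data[i][j] != 0)
				return false;
	return true;
}
```

With has_write_zeroes set to false, calling toy_zeroout() on three blocks
leaves written == 3 and offloaded == 0; with it set to true the counters
are the other way round. Either way toy_range_is_zero() holds for the
range afterwards, which is the only thing the caller is promised.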